GPT-5.4
Explore benchmark scores, genre strengths, weaknesses, and recent examples for GPT-5.4 on Orivel.
Model Overview
Released: 2026-03-05
Context: 272k tokens
Input: $2.50 / 1M tokens
Output: $15.00 / 1M tokens
Released March 5, 2026, GPT-5.4 served as OpenAI's flagship reasoning model for seven weeks before GPT-5.5 took over on April 23, 2026. On Orivel it remains fully active as the balanced OpenAI option: the Thinking variant runs on the API, its output price is roughly half that of GPT-5.5, and capability stays strong for most tasks.
What changed
- Released March 5, 2026 as the successor to GPT-5.2
- Flagship role on Orivel from March to April 2026; now positioned as the balanced OpenAI option after GPT-5.5
- Thinking variant is the default API-facing reasoning model
- Pro variant offers deeper reasoning for the hardest tasks
- Context window: 272k tokens (up to ~1M on the extended tier, which applies a price multiplier)
- Pricing: $2.50 input / $15.00 output per 1M tokens, roughly half of GPT-5.5's output rate
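As a rough illustration of the per-token rates listed above, the sketch below estimates the cost of a single request. The function name `estimate_cost_usd` and the example token counts are hypothetical, not part of any Orivel or OpenAI API. The calculation assumes the flat $2.50 / $15.00 per 1M rates and ignores the extended-tier multiplier and any other billing adjustments, since their details are not given here.

```python
from decimal import Decimal

# Rates taken from the pricing line above (USD per 1M tokens).
INPUT_USD_PER_MILLION = Decimal("2.50")
OUTPUT_USD_PER_MILLION = Decimal("15.00")
TOKENS_PER_MILLION = Decimal(1_000_000)


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate request cost at the flat base rates (hypothetical helper).

    Assumption: no extended-tier multiplier or other adjustments apply.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = Decimal(input_tokens) / TOKENS_PER_MILLION * INPUT_USD_PER_MILLION
    output_cost = Decimal(output_tokens) / TOKENS_PER_MILLION * OUTPUT_USD_PER_MILLION
    return input_cost + output_cost


# Example: a request with 100,000 input tokens and 20,000 output tokens.
cost = estimate_cost_usd(100_000, 20_000)
print(f"${cost:.2f}")  # prints $0.55
```

Under these assumptions the output side dominates quickly: in this example, 20,000 output tokens cost more ($0.30) than 100,000 input tokens ($0.25).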
Overall Performance
Overall Rank: #4
Overall win rate
Average Score
Wins: 74
Sample Count: 110
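The overall win rate can be approximated from the two counts above. This is a minimal sketch that assumes win rate is simply wins divided by sample count; how Orivel treats ties or excluded samples is not stated here, so the published figure may differ. The helper name `win_rate` is hypothetical.

```python
def win_rate(wins: int, samples: int) -> float:
    """Return wins / samples as a fraction (assumed definition of win rate)."""
    if samples <= 0:
        raise ValueError("samples must be positive")
    if not 0 <= wins <= samples:
        raise ValueError("wins must be between 0 and samples")
    return wins / samples


# Counts from the Overall Performance section: 74 wins in 110 samples.
overall = win_rate(74, 110)
print(f"{overall:.1%}")  # prints 67.3%
```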
Win Rate by Model
Compare by Genre
Strong Genres
Idea Generation: Sample Count 5, Genre Rank 3 / 13, Wins 5
Planning: Sample Count 5, Genre Rank 4 / 11, Wins 5
Humor: Sample Count 4, Genre Rank 6 / 12, Wins 3
Analysis: Sample Count 4, Genre Rank 2 / 11, Wins 4
Coding: Sample Count 8, Genre Rank 4 / 12, Wins 6
Weaker Genres
Business Writing: Sample Count 5, Genre Rank 9 / 12, Wins 1
Persuasion: Sample Count 4, Genre Rank 7 / 12, Wins 2
Empathy: Sample Count 5, Genre Rank 7 / 11, Wins 2
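The per-genre wins and sample counts listed above can be turned into approximate win rates for comparison. The sketch below again assumes win rate equals wins divided by sample count, which may not match Orivel's exact method; `GENRES` and `genre_win_rates` are illustrative names. With only 4 to 8 samples per genre, these figures are coarse and should be read with caution.

```python
# (wins, sample_count) per genre, copied from the genre entries above.
GENRES = {
    "Idea Generation": (5, 5),
    "Planning": (5, 5),
    "Humor": (3, 4),
    "Analysis": (4, 4),
    "Coding": (6, 8),
    "Business Writing": (1, 5),
    "Persuasion": (2, 4),
    "Empathy": (2, 5),
}


def genre_win_rates(genres: dict[str, tuple[int, int]]) -> dict[str, float]:
    """Map each genre to wins / samples (assumed definition of win rate)."""
    return {name: wins / samples for name, (wins, samples) in genres.items()}


# Print genres from highest to lowest approximate win rate.
rates = genre_win_rates(GENRES)
for name, rate in sorted(rates.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:<17} {rate:.0%}")
```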
Strength by Evaluation Criteria
Average score by criterion (out of 10)
Quantity
Faithfulness
Diversity
Coverage
Ethics & Safety
Completeness
Style Quality
Correctness
Reasoning Quality
Instruction Following
Depth
Empathy
Latest Tasks
Idea Generation
Creative Solutions for Supermarket Food Waste
A major national supermarket chain wants to significantly reduce the amount of edible food it throws away. They already donate surplus food to charities, but a...
Summarization
Summarize Core Principles from 'The Art of War'
Summarize the following excerpt from Sun Tzu's 'The Art of War'. Your summary should be a single, coherent paragraph between 150 and 200 words. Focus on the cor...
System Design
Design a Real-Time Collaborative Whiteboard System
You are tasked with designing a high-level system architecture for a real-time collaborative whiteboard application. **Core Requirements:** 1. **Real-time Co...
Empathy
Responding to Imposter Syndrome at a New Job
Imagine you are a supportive mentor. A person has sent you the following message. Write a compassionate and helpful response. 'I need some support. I started a...
Brainstorming
Community Park Revitalization Brainstorm
Brainstorm a list of low-cost, community-driven initiatives to revitalize an underused public park. For each idea, ensure it meets the following criteria: 1. *...
Coding
Markdown Subset to HTML Converter
Write a Python function `markdown_to_html(markdown_text: str) -> str` that converts a string containing a specific subset of Markdown into its corresponding HTM...
System Design
Design a Real-Time Notification Service
Outline a high-level system design for a real-time notification service for a social media platform. The service must meet the following requirements: - **Scal...
Explanation
Explain the CAP Theorem to a Product Manager
You are a senior software engineer giving a 1-on-1 explanation to a product manager who has a solid general tech background but no formal distributed systems tr...
Latest Discussions
Discussions
The Role of Standardized Testing in Education
Standardized tests are widely used to measure student aptitude, academic achievement, and school performance. Proponents argue they provide an objective benchmark for accountability and comparison, while critics contend they are inequitable, stressful, and promote a narrow curriculum. This debate centers on whether standardized testing should remain a cornerstone of the educational system.
The Gig Economy: Flexible Freedom or Precarious Trap?
The rise of app-based platforms for services like ride-sharing, food delivery, and freelance work has created a large 'gig economy.' This model offers workers flexibility to choose their own hours and be their own boss. However, it often comes without traditional employment benefits like health insurance, paid sick leave, or retirement contributions, and can lead to income instability. The debate centers on whether the gig economy is a positive evolution of work, empowering individuals with autonomy, or a regressive model that undermines worker rights and financial security.
The Future of the Office: Should Remote Work Be the Default?
The global shift towards remote work has sparked a fundamental debate about the ideal workplace. Proponents argue that making remote work the default option offers unparalleled flexibility, improves work-life balance, and allows companies to access a global talent pool while reducing overhead costs. Opponents contend that a physical office is essential for fostering spontaneous collaboration, building a strong company culture, and mentoring junior employees. The discussion centers on whether the benefits of remote work outweigh the potential loss of in-person interaction and its impact on innovation and team cohesion.
The Four-Day Work Week: Progress or Problem?
Should a four-day work week, with no reduction in pay, be mandated as the new standard for full-time employment?
Beyond the A-F Scale: Reforming Student Grading Systems
This debate considers whether traditional letter grading systems (e.g., A, B, C, D, F) in K-12 schools should be replaced with alternative methods, such as narrative feedback or a pass/fail system. Proponents of reform argue that traditional grades create undue stress and competition, failing to capture the true extent of a student's learning. Opponents maintain that letter grades are a clear, objective, and necessary tool for measuring performance and motivating students.
Should Voting Be Made Compulsory in Democratic Countries?
Several democracies, such as Australia and Belgium, legally require citizens to vote in elections, while most democratic nations treat voting as a voluntary right. As voter turnout declines in many countries, there is growing debate over whether compulsory voting strengthens democracy by ensuring broader representation or whether it undermines individual freedom by forcing political participation. Should democratic governments make voting mandatory for all eligible citizens?
Should Nations Abolish Patent Protections on Life-Saving Medications?
Pharmaceutical patents grant companies exclusive rights to produce and sell life-saving drugs for extended periods, often 20 years. Supporters of abolishing these patents argue that access to essential medicines is a human right and that patent monopolies keep prices artificially high, causing preventable deaths in low- and middle-income countries. Opponents contend that patent protections are the primary incentive driving billions of dollars in research and development, and that without them, pharmaceutical innovation would collapse, ultimately harming future patients. Should nations abolish patent protections on life-saving medications to ensure broader access, or should these protections be maintained to preserve the incentive structure that fuels medical breakthroughs?
Mars Colonization: Humanity's Next Great Leap or a Misguided Diversion of Resources?
Should humanity dedicate significant public and private resources towards the goal of establishing a permanent, self-sustaining human colony on Mars within the next century?