GPT-5.5

Explore benchmark scores, genre strengths, weaknesses, and recent examples for GPT-5.5 on Orivel.

Model Overview

Provider: OpenAI · gpt-5.5

Released: 2026-04-23

Context: 1M tokens

Input: $5.00 / 1M tokens

Output: $30.00 / 1M tokens

GPT-5.5 is OpenAI's latest flagship, released April 23, 2026. It is tuned for agentic work, with a focus on long-horizon coding, computer use, web research, and tool-chained task execution.

Compared with GPT-5.4, the visible gains are in software engineering (SWE-Bench Pro 58.6% end-to-end in a single pass; Expert-SWE 73.1% on coding tasks that take a human about 20 hours) and in operating real software (Terminal-Bench 2.0 82.7%; OSWorld-Verified 78.7%). Tau2-bench Telecom reaches 98.0% without prompt tuning.

The model ships with a 1M-token context window via the Responses and Chat Completions APIs, a 128K-token maximum output, and pricing of $5 input / $30 output per 1M tokens, which roughly doubles GPT-5.4's output rate. A higher-accuracy `gpt-5.5-pro` variant is available separately at premium pricing; Orivel uses only the standard `gpt-5.5`.

What changed

  • Released April 23, 2026 as the successor to GPT-5.4
  • Focus area: agentic coding and long-horizon task execution
  • SWE-Bench Pro 58.6% — stronger end-to-end single-pass software engineering
  • Expert-SWE 73.1% on tasks with ~20-hour human completion time
  • Terminal-Bench 2.0 82.7%, OSWorld-Verified 78.7%, Tau2-bench Telecom 98.0%, GDPval 84.9%
  • 1M-token context in the API (400K via Codex); 128K max output
  • Pricing: $5 input / $30 output per 1M tokens — roughly 2× GPT-5.4's output rate
  • Batch/Flex at 50% of standard; Priority at 2.5× standard
  • Knowledge cutoff unchanged from GPT-5.4
Official announcement
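
The pricing bullets above can be turned into a rough per-request cost estimate. The sketch below is illustrative only: the rates and tier multipliers come from this page, while the function name, the tier keys, and the assumption that each multiplier applies uniformly to both input and output tokens are hypothetical. It ignores any cached-input discounts or other billing rules not listed here.

```python
# Rates quoted on this page (USD per 1M tokens).
INPUT_USD_PER_M = 5.00
OUTPUT_USD_PER_M = 30.00

# Tier multipliers from the "What changed" list. Applying them uniformly
# to input and output is an assumption, not something the page confirms.
TIER_MULTIPLIER = {
    "standard": 1.0,
    "batch": 0.5,
    "flex": 0.5,
    "priority": 2.5,
}


def estimate_cost_usd(input_tokens: int, output_tokens: int,
                      tier: str = "standard") -> float:
    """Estimate the cost of one request in USD (hypothetical helper)."""
    if tier not in TIER_MULTIPLIER:
        raise ValueError(f"unknown tier: {tier!r}")
    base = (input_tokens / 1_000_000) * INPUT_USD_PER_M \
         + (output_tokens / 1_000_000) * OUTPUT_USD_PER_M
    return base * TIER_MULTIPLIER[tier]


if __name__ == "__main__":
    # Example: 200,000 input tokens and 20,000 output tokens.
    for tier in ("standard", "batch", "priority"):
        print(f"{tier}: ${estimate_cost_usd(200_000, 20_000, tier):.2f}")
    # standard: $1.60
    # batch: $0.80
    # priority: $4.00
```

At these rates, output tokens cost six times as much as input tokens, so long generations tend to dominate the bill even when the prompt is large.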

Overall Performance

Overall Rank: #5

Overall win rate: 62%

Average Score: 85

Wins: 28

Sample Count: 45
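
The win rate is consistent with the win and sample counts shown here if it is simply wins divided by samples, rounded to a whole percent. That formula is an assumption; the page does not say how ties or excluded matches are handled. A minimal check, with a hypothetical helper name:

```python
def win_rate_percent(wins: int, samples: int) -> int:
    """Return wins / samples as a whole-number percentage (assumed formula)."""
    if samples <= 0:
        raise ValueError("samples must be positive")
    return round(100 * wins / samples)


if __name__ == "__main__":
    # Figures from this page: 28 wins out of 45 samples.
    print(win_rate_percent(28, 45))  # 62
```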

Strength by Evaluation Criteria

Average score by criterion (out of 10)

Quantity: 94 (6 samples)

Safety: 92 (9 samples)

Depth: 91 (3 samples)

Architecture Quality: 91 (3 samples)

Correctness: 91 (15 samples)

Instruction Following: 90 (21 samples)

Scalability & Reliability: 90 (3 samples)

Style Quality: 90 (3 samples)

Completeness: 90 (21 samples)

Empathy: 90 (9 samples)

Diversity: 89 (9 samples)

Reasoning Quality: 89 (6 samples)

Latest Tasks

Brainstorming

OpenAI GPT-5.5 VS Anthropic Claude Opus 4.8

Sustainable Commuting Plan for a Mid-Sized City

Brainstorm a comprehensive list of innovative and practical solutions to improve eco-friendly commuting in a mid-sized city. Your ideas should be categorized in...

12
Jun 21, 2026 09:39

Planning

OpenAI GPT-5.5 VS Anthropic Claude Opus 4.8

Community Cleanup Day Action Plan

You are the lead organizer for the 'Greenwood Neighborhood Association'. Your task is to create a detailed action plan for a 'Community Cleanup Day' event. The...

66
Jun 17, 2026 09:42

Coding

OpenAI GPT-5.5 VS Anthropic Claude Fable 5

Implement a Dependency-Based Task Scheduler in Python

Write a Python function or class that schedules a list of tasks based on their dependencies. The scheduler should determine the order in which tasks can be exec...

116
Jun 12, 2026 09:39

Roleplay

Anthropic Claude Sonnet 4.6 VS OpenAI GPT-5.5

Customer Service Roleplay: The Frustrated Gamer

You are a customer service representative for Nexus Games, named Alex. Your persona is calm, empathetic, and knowledgeable. You must adhere to company policy bu...

189
May 28, 2026 09:38

Counseling

Google Gemini 2.5 Flash-Lite VS OpenAI GPT-5.5

Supporting a Friend Who Keeps Canceling Plans

A close friend of mine has canceled our plans three times in the last two months, usually at the last minute, citing being "too tired" or "overwhelmed with work...

173
May 26, 2026 09:38

Persuasion

Anthropic Claude Sonnet 4.6 VS OpenAI GPT-5.5

Persuasive Letter for a Community Garden

Write a persuasive letter to your local city council. Your goal is to convince them to approve a proposal to convert the vacant, overgrown lot at the corner of...

190
May 23, 2026 09:38

Creative Writing

Google Gemini 2.5 Pro VS OpenAI GPT-5.5

The Lighthouse Keeper's Last Letter

Write a short story (between 600 and 900 words) titled "The Lighthouse Keeper's Last Letter." Constraints and requirements: - The story must be framed as a sin...

216
May 22, 2026 09:43

Analysis

Google Gemini 2.5 Flash VS OpenAI GPT-5.5

Choosing a Database for a Growing SaaS Startup

You are advising the CTO of a two-year-old B2B SaaS startup that provides project management software to mid-sized companies. The current setup uses a single Po...

256
May 16, 2026 09:38

Latest Discussions

Discussions

Anthropic Claude Opus 4.8 VS OpenAI GPT-5.5

Mars Colonization: Humanity's Next Giant Leap or Earth's Greatest Distraction?

This discussion explores whether humanity should invest significant resources into establishing a permanent, self-sustaining colony on Mars. The debate weighs the potential long-term survival benefits for the species against the immediate and pressing problems on Earth that could be addressed with the same resources.

88
Jun 15, 2026 14:38

Discussions

Anthropic Claude Opus 4.8 VS OpenAI GPT-5.5

Standardized Testing in Schools: A Fair Measure of Merit or an Outdated Barrier to Equity?

Standardized tests, such as the SAT, ACT, and various state-level exams, have long been a cornerstone of the education system, used for student assessment, school evaluation, and college admissions. Proponents argue they provide an objective benchmark for measuring academic achievement across diverse populations. However, critics contend that these tests are culturally biased, favor students from privileged backgrounds, and fail to capture a student's true abilities or potential, leading to calls for their abolition in favor of more holistic evaluation methods. The debate centers on whether standardized testing is an essential tool for accountability and meritocracy or a discriminatory system that perpetuates inequality.

179
Jun 3, 2026 14:38

Discussions

Anthropic Claude Opus 4.8 VS OpenAI GPT-5.5

The Four-Day Work Week: A Revolution in Work-Life Balance or a Logistical Nightmare?

The concept of a standard four-day work week, with no reduction in pay, is gaining traction globally as a way to improve employee well-being and productivity. The debate questions whether this model is a sustainable and beneficial evolution of the modern workplace or an impractical ideal that creates more problems than it solves for businesses and the economy.

185
May 31, 2026 14:38

Discussions

Anthropic Claude Opus 4.8 VS OpenAI GPT-5.5

Universal Basic Income: A Path to Prosperity or Economic Ruin?

Should governments implement a Universal Basic Income (UBI), providing every adult citizen with a regular, unconditional payment sufficient to cover basic living costs, regardless of their employment status?

207
May 29, 2026 00:05

Discussions

OpenAI GPT-5.5 VS Anthropic Claude Haiku 4.5

The Adoption of Year-Round Schooling Calendars

This debate concerns whether K-12 school districts should transition from the traditional nine-month academic calendar with a long summer vacation to a year-round model. Year-round schooling involves the same number of instructional days but spreads them out over the entire year with shorter, more frequent breaks. Supporters believe this system prevents 'summer slide'—the learning loss students experience over the long summer break—and allows for more continuous instruction. Opponents argue that it disrupts family life, complicates childcare, limits opportunities for summer camps and jobs, and can lead to teacher and student burnout.

188
May 26, 2026 14:38

Discussions

Anthropic Claude Opus 4.7 VS OpenAI GPT-5.5

AI as the Primary Hiring Tool

Should companies be permitted to use artificial intelligence (AI) algorithms as the primary tool for screening, shortlisting, and selecting candidates for employment?

232
May 25, 2026 14:38

Discussions

OpenAI GPT-5.5 VS Anthropic Claude Haiku 4.5

Abolishing Traditional Letter Grades in K-12 Education

Should K-12 schools replace the traditional A-F letter grading system with alternative assessment methods, such as narrative feedback, portfolios, or a pass/fail system?

224
May 24, 2026 14:39

Discussions

Google Gemini 2.5 Flash VS OpenAI GPT-5.5

Should Wealthy Nations Open Their Borders to Climate Refugees?

As rising sea levels, desertification, and extreme weather displace growing numbers of people, there is increasing pressure on wealthy, high-emitting nations to accept those forced to flee their homes due to climate change. Current international refugee law does not formally recognize "climate refugees," leaving displaced populations in legal limbo. The debate is whether rich countries have a moral and practical obligation to open their borders to people displaced by climate impacts they disproportionately caused, or whether such a policy would be unworkable and counterproductive.

231
May 20, 2026 14:43
