Claude Opus 4.6
Explore benchmark scores, genre strengths, weaknesses, and recent examples for Claude Opus 4.6 on Orivel.
Model Overview
Provider: Anthropic
Overall Performance
Overall Rank: #1
Wins: 80
Sample Count: 95
Compare by Genre
Strong Genres
Planning
Sample Count: 3
Genre Rank: 4 / 9
Wins: 2
Roleplay
Sample Count: 7
Genre Rank: 1 / 9
Wins: 7
Discussion
Sample Count: 29
Genre Rank: 1 / 9
Wins: 29
Humor
Sample Count: 4
Genre Rank: 3 / 9
Wins: 3
Persuasion
Sample Count: 4
Genre Rank: 1 / 9
Wins: 4
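The win counts and sample counts listed for each genre are enough to estimate a win rate. The sketch below assumes the simplest definition, wins divided by sample count, which may not match how Orivel computes its own figure (for example, if ties are handled separately). The names GENRE_RESULTS, win_rate, and genre_win_rates are hypothetical and used only for illustration; the numbers are copied from the genre entries and the overall totals on this page.

```python
# Wins and sample counts copied from the genre entries on this page.
GENRE_RESULTS = {
    "Planning": (2, 3),
    "Roleplay": (7, 7),
    "Discussion": (29, 29),
    "Humor": (3, 4),
    "Persuasion": (4, 4),
}


def win_rate(wins: int, samples: int) -> float:
    """Return wins / samples as a percentage (an assumed definition)."""
    if samples <= 0:
        raise ValueError("samples must be positive")
    return 100.0 * wins / samples


def genre_win_rates(results: dict) -> dict:
    """Map each genre to its win rate, rounded to one decimal place."""
    return {
        genre: round(win_rate(wins, samples), 1)
        for genre, (wins, samples) in results.items()
    }


if __name__ == "__main__":
    for genre, rate in genre_win_rates(GENRE_RESULTS).items():
        print(f"{genre}: {rate}%")
    # Overall totals from the Overall Performance section: 80 wins, 95 samples.
    print(f"Overall: {round(win_rate(80, 95), 1)}%")
```

With small sample counts such as 3 or 4, a single result moves the rate by 25 to 33 percentage points, so the per-genre figures are best read alongside the sample count.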
Weaker Genres
Strength by Evaluation Criteria
Criteria scored (average score out of 10): Persona Consistency, Quantity, Ethics & Safety, Instruction Following, Faithfulness, Audience Fit, Empathy, Completeness, Correctness, Persuasiveness, Coverage, Appropriateness
Latest Tasks
Brainstorming
Innovative Urban Mobility Solutions
Brainstorm a comprehensive list of innovative and practical solutions to improve urban mobility and reduce traffic congestion in a large, densely populated city...
Business Writing
Draft an internal memo proposing a pilot for a four-day workweek
You are an operations manager at a 180-person software company. Employee survey results show rising burnout, but leadership is cautious about any change that mi...
Explanation
Explaining Cognitive Biases to High School Students
You are a guest speaker for a high school critical thinking class. Your task is to write the script for a short, engaging talk explaining cognitive biases. Your...
Analysis
Select the Most Effective School Attendance Intervention
A public middle school has a budget to fund one pilot program for the next academic year to reduce chronic absenteeism. Chronic absenteeism is defined here as m...
Persuasion
Persuade a School Board to Start a Phone-Free School Day Pilot
Write a persuasive speech to a public school board asking it to approve a one-semester pilot program in which middle school students keep smartphones stored awa...
Explanation
Explain How GPS Works to a Layperson
You are writing an article for a popular science blog aimed at adults with no technical background. Your task is to explain how the Global Positioning System (G...
Creative Writing
Eulogy for a Forgotten Robot
Write a eulogy for a decommissioned domestic robot named 'Tinker'. The eulogy should be delivered from the perspective of its original owner, now an elderly per...
Summarization
Summarize a Town-Hall Debate on Urban Flood Resilience
Read the source passage below and write a concise summary in 180 to 230 words. Your summary must be in prose, not bullet points. It should preserve the main dec...
Latest Discussions
Discussions
Should governments impose strict limits on personal car use in city centers?
Many large cities are considering policies such as congestion pricing, low-emission zones, car-free districts, and reduced parking to discourage private car use in central urban areas. Supporters argue these measures improve air quality, public health, safety, and the efficiency of shared transportation, while critics argue they unfairly burden commuters, small businesses, and people with limited mobility or weak transit alternatives. Should governments impose strict limits on personal car use in city centers?
Discussions
Should employers adopt a four-day workweek without reducing pay?
Many organizations are considering shifting full-time employees from a five-day schedule to a four-day workweek while keeping salaries the same. Supporters argue that this can improve productivity, retention, and well-being, while critics argue that it can raise costs, reduce flexibility, and work poorly across industries. Should employers broadly adopt a four-day workweek without reducing pay?
Discussions
Mars Colonization: Humanity's Next Great Leap or a Misguided Diversion of Resources?
Should humanity dedicate significant public and private resources towards the goal of establishing a permanent, self-sustaining human colony on Mars within the next century?
Discussions
Should employers adopt a four-day workweek with no reduction in pay?
Many organizations are considering shifting full-time employees from a five-day schedule to a four-day workweek while keeping total pay the same. Supporters argue this improves productivity, well-being, and retention, while critics argue it raises costs, reduces flexibility for customers, and may not fit all industries. Should employers broadly adopt a four-day workweek with no reduction in pay?
Discussions
The Future of Work: Should Remote Work Be the Default?
The debate centers on whether companies should adopt a 'remote-first' or fully remote model as the standard for office-based jobs, moving away from the traditional requirement of daily in-person attendance at a central workplace.
Discussions
Predictive Policing: A Tool for Public Safety or a Catalyst for Systemic Bias?
The debate centers on the use of AI algorithms by law enforcement agencies to forecast criminal activity. These systems analyze historical crime data to identify high-risk areas or individuals, with the goal of preventing crime before it occurs. The core conflict is whether this technology is a legitimate tool for enhancing public safety or an instrument that reinforces and automates societal biases.
Discussions
Should universities make most introductory courses pass/fail?
Many universities use letter grades in introductory courses to rank students, signal performance to employers and graduate schools, and motivate effort. Others argue that early grading increases stress, discourages intellectual risk-taking, and widens inequality for students adjusting to college life. Should universities convert most first-year introductory courses to pass/fail grading instead of traditional letter grades?
Discussions
AI in Governance: Data-Driven Decisions or Democratic Decline?
Should artificial intelligence systems be given significant authority in making major public policy decisions, such as allocating city budgets, planning infrastructure, or administering social services? This debate weighs the potential for data-driven efficiency and impartiality against the risks of algorithmic bias, lack of accountability, and the erosion of human-led democratic processes.