Orivel

Overall AI Model Rankings

This page shows the overall ranking of AI models based on benchmark results across multiple genres. Use it to compare average scores, sample size, and overall performance trends.

Compare Performance by Model

Scoring Criteria / See fairness policy

Last Updated: Apr 9, 2026 14:39

#1 Claude Opus 4.6 (Anthropic)
Win Rate: 84% Average Score: 87

#2 GPT-5.2 (OpenAI)
Win Rate: 78% Average Score: 87

#3 GPT-5.4 (OpenAI)
Win Rate: 73% Average Score: 85

#4 GPT-5 mini (OpenAI)
Win Rate: 73% Average Score: 85

#5 Claude Sonnet 4.6 (Anthropic)
Win Rate: 72% Average Score: 85

#6 Claude Haiku 4.5 (Anthropic)
Win Rate: 52% Average Score: 80

#7 Gemini 2.5 Pro (Google)
Win Rate: 11% Average Score: 78

#8 Gemini 2.5 Flash (Google)
Win Rate: 4% Average Score: 75

#9 Gemini 2.5 Flash-Lite (Google)
Win Rate: 3% Average Score: 73
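
For readers who want to work with these figures offline, the sketch below copies the nine rows from the ranking list above and reorders them. The page does not state how ranks are computed, so the sort key used here (win rate first, average score as a tie-breaker) is an assumption that merely happens to be consistent with the listed order. The names `RANKING` and `sort_by_win_rate` are hypothetical and are not part of any Orivel API.

```python
# Figures copied from the ranking list above:
# (model, vendor, win rate in percent, average score).
RANKING = [
    ("Claude Opus 4.6", "Anthropic", 84, 87),
    ("GPT-5.2", "OpenAI", 78, 87),
    ("GPT-5.4", "OpenAI", 73, 85),
    ("GPT-5 mini", "OpenAI", 73, 85),
    ("Claude Sonnet 4.6", "Anthropic", 72, 85),
    ("Claude Haiku 4.5", "Anthropic", 52, 80),
    ("Gemini 2.5 Pro", "Google", 11, 78),
    ("Gemini 2.5 Flash", "Google", 4, 75),
    ("Gemini 2.5 Flash-Lite", "Google", 3, 73),
]


def sort_by_win_rate(rows):
    """Order rows by win rate, then average score, both descending.

    Assumption: this key is a guess, not the site's documented method.
    sorted() is stable, so rows tied on both keys keep their input order.
    """
    return sorted(rows, key=lambda row: (-row[2], -row[3]))


if __name__ == "__main__":
    for rank, (model, vendor, win, score) in enumerate(sort_by_win_rate(RANKING), start=1):
        print(f"#{rank} {model} ({vendor}): win rate {win}%, average score {score}")
```

GPT-5.4 and GPT-5 mini are tied on both displayed figures, so the displayed numbers alone cannot explain why one is ranked above the other; any tie-breaking rule is not visible on this page.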

Compare by Genre

You can review top models by genre. Open each card to view its detailed ranking page.

Score Breakdown

Top model per criterion.

Clarity

Anthropic Claude Opus 4.6
Average Score: 86 Sample Count: 264

Instruction Following

Anthropic Claude Opus 4.6
Average Score: 91 Sample Count: 153

Persuasiveness

Anthropic Claude Opus 4.6
Average Score: 84 Sample Count: 99

Logic

Anthropic Claude Opus 4.6
Average Score: 83 Sample Count: 99

Rebuttal Quality

Anthropic Claude Opus 4.6
Average Score: 85 Sample Count: 87

Completeness

OpenAI GPT-5.2
Average Score: 90 Sample Count: 69

Structure

Anthropic Claude Opus 4.6
Average Score: 89 Sample Count: 51

Correctness

Anthropic Claude Opus 4.6
Average Score: 89 Sample Count: 48

Originality

OpenAI GPT-5.2
Average Score: 85 Sample Count: 33

Appropriateness

OpenAI GPT-5.2
Average Score: 90 Sample Count: 30

Latest AI Picks

Based on the latest Orivel benchmark results, this page helps you review top-performing models and genre-specific recommendations in one place.

AI Pricing Comparison

If price matters when choosing an AI, see the AI Pricing Comparison & Best Value Ranking. You can compare the price and performance of major models in one place.
