Empathy
Experimental
Compare how well AI models respond with empathy, care, and appropriate tone.
In this genre, the main abilities tested are Empathy, Appropriateness, and Helpfulness.
Unlike counseling, this genre focuses more on emotional attunement and tone than on structured next steps or bounded practical guidance.
A high score here does not guarantee safe handling of delicate situations or the best practical advice under risk.
Strong models here are useful for supportive replies, comforting messages, and responses where emotional tone matters first.
This genre alone cannot tell you whether the model can provide safer structured guidance, clinical judgment, or professional advice.
Empathy: a tight, high-floor genre led by GPT-5.5 and Claude Sonnet
[Charts: Average score by model; What we weighted]
Across 33 scored answers this is one of the most compressed genres, with every model between 7.8 and 9.0. GPT-5.5 ranks 1 (8.95) on a single sample, so the best-evidenced leader is Claude Sonnet 4.6 at rank 2: 8.73 over 4 samples with a 75% win rate. Claude Haiku 4.5 (8.36, 75% over 4) ranks 3, giving Anthropic a strong showing where warmth matters.
Average and rank diverge sharply because the floor is high. GPT-5 mini (8.59) and GPT-5.4 (8.53) post strong averages but rank 5 and 4 on win rates of 25% and 40%. Gemini 2.5 Pro averages 8.51, above third-ranked Claude Haiku 4.5 (8.36), yet wins only 20% and ranks 6. Head-to-head record, not raw score, drives most of the order.
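The gap between average and rank can be checked with a short sketch. It uses the wins, samples, and averages reported on this page, and it assumes the ranking sorts by win rate first with average score as the tie-break. That rule is inferred from the published order, not documented by Orivel.

```python
# (model, wins, samples, average score) as reported on this page.
RESULTS = [
    ("GPT-5.5", 1, 1, 8.95),
    ("Claude Sonnet 4.6", 3, 4, 8.73),
    ("Claude Haiku 4.5", 3, 4, 8.36),
    ("GPT-5.4", 2, 5, 8.53),
    ("GPT-5 mini", 1, 4, 8.59),
    ("Gemini 2.5 Pro", 1, 5, 8.51),
    ("Gemini 2.5 Flash", 1, 5, 7.84),
    ("Gemini 2.5 Flash-Lite", 0, 5, 7.92),
]


def order_by_win_rate(results):
    """Sort by win rate, then by average score.

    ASSUMPTION: this tie-break rule is inferred from the published table.
    """
    ranked = sorted(results, key=lambda r: (-(r[1] / r[2]), -r[3]))
    return [name for name, _wins, _samples, _avg in ranked]


def order_by_average(results):
    """Sort by average score only, highest first."""
    ranked = sorted(results, key=lambda r: -r[3])
    return [name for name, _wins, _samples, _avg in ranked]


by_win = order_by_win_rate(RESULTS)
by_avg = order_by_average(RESULTS)

for position, name in enumerate(by_win, start=1):
    print(f"rank {position}: {name} (would be {by_avg.index(name) + 1} by average)")
```

Under this assumed rule the win-rate ordering matches the published ranks, while an average-only ordering would move GPT-5 mini up to third and Claude Haiku 4.5 down to sixth.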
This genre weights Empathy highest at 35%, with Appropriateness at 25%, so it rewards reading the person's emotional state and responding suitably. The field is unusually even here: even the lowest entries (Gemini 2.5 Flash 7.84, Gemini 2.5 Flash-Lite 7.92) are usable, and the 1.11-point spread between the top and bottom averages is among the narrowest on the site.
Every model rests on only 1 to 5 samples, so the fine ordering is provisional and small-sample swings are likely. The practical read is that empathetic responses are a high-floor genre where the choice of model matters less. These are condition-dependent measurements, not a fixed hierarchy.
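To see how wide a small-sample swing can be, here is a minimal sketch that puts a Wilson score interval around a win rate. The Wilson interval is a standard statistical tool brought in here for illustration. It is not something Orivel reports, and the 95% level (z = 1.96) is an assumption.

```python
import math


def wilson_interval(wins, samples, z=1.96):
    """Approximate confidence interval for a win rate (Wilson score method).

    ASSUMPTION: z = 1.96 (about 95%); the benchmark itself reports no intervals.
    """
    if samples <= 0:
        raise ValueError("samples must be positive")
    p = wins / samples
    denom = 1 + z**2 / samples
    center = (p + z**2 / (2 * samples)) / denom
    half = z * math.sqrt(p * (1 - p) / samples + z**2 / (4 * samples**2)) / denom
    return max(0.0, center - half), min(1.0, center + half)


# Claude Sonnet 4.6: 3 wins in 4 samples; GPT-5.4: 2 wins in 5 samples.
sonnet_low, sonnet_high = wilson_interval(3, 4)
gpt54_low, gpt54_high = wilson_interval(2, 5)

print(f"Claude Sonnet 4.6: {sonnet_low:.2f} to {sonnet_high:.2f}")
print(f"GPT-5.4:           {gpt54_low:.2f} to {gpt54_high:.2f}")
```

With 4 samples, a 75% win rate is compatible with a true rate anywhere from roughly 0.30 to 0.95, and that range overlaps the one for GPT-5.4 at 40%. This is why the fine ordering should be read as provisional.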
Bottom line
For empathetic responses, Claude Sonnet 4.6 is the best-evidenced pick (8.73, 75% win over 4 samples), with Claude Haiku 4.5 a strong value option at the same win rate. The floor is high, so most models perform acceptably here.
This analysis is derived from Orivel's measured benchmark scores for this genre and is updated periodically. Scores are condition-dependent measurements, not absolute truth.
Top Models in This Genre
This ranking covers this genre only. Ranks follow head-to-head win rate more closely than average score, so the Average Score column is not strictly descending.
Last updated: May 21, 2026 09:37
| Rank | Model | Provider | Win Rate | Average Score | Wins | Samples |
|---|---|---|---|---|---|---|
| #1 | GPT-5.5 | OpenAI | 100% | 90 | 1 | 1 |
| #2 | Claude Sonnet 4.6 | Anthropic | 75% | 87 | 3 | 4 |
| #3 | Claude Haiku 4.5 | Anthropic | 75% | 84 | 3 | 4 |
| #4 | GPT-5.4 | OpenAI | 40% | 85 | 2 | 5 |
| #5 | GPT-5 mini | OpenAI | 25% | 86 | 1 | 4 |
| #6 | Gemini 2.5 Pro | Google | 20% | 85 | 1 | 5 |
| #7 | Gemini 2.5 Flash | Google | 20% | 78 | 1 | 5 |
| #8 | Gemini 2.5 Flash-Lite | Google | 0% | 79 | 0 | 5 |
What Is Evaluated in Empathy
Scoring criteria and weights used for this genre ranking.
Empathy
35.0%
This criterion is included to check Empathy in the answer. It carries heavier weight because this part strongly shapes the overall result in this genre.
Appropriateness
25.0%
This criterion is included to check Appropriateness in the answer. It has meaningful weight because it affects quality in a visible way, even if it is not the only thing that matters.
Helpfulness
15.0%
This criterion is included to check Helpfulness in the answer. It is weighted more lightly because it supports the main goal rather than defining the genre by itself.
Clarity
15.0%
This criterion is included to check Clarity in the answer. It is weighted more lightly because it supports the main goal rather than defining the genre by itself.
Safety
10.0%
This criterion is included to check Safety in the answer. It is weighted more lightly because it supports the main goal rather than defining the genre by itself.
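As a rough illustration of how these weights could combine into one score, the sketch below takes a weighted sum of per-criterion scores. The linear aggregation, the 0 to 10 scale, and the example scores are all assumptions made for illustration. The page states the weights but not the formula Orivel applies.

```python
# Criterion weights as listed for this genre (they sum to 100%).
WEIGHTS = {
    "Empathy": 0.35,
    "Appropriateness": 0.25,
    "Helpfulness": 0.15,
    "Clarity": 0.15,
    "Safety": 0.10,
}


def weighted_score(criterion_scores, weights=WEIGHTS):
    """Combine per-criterion scores into one number.

    ASSUMPTION: a plain weighted sum; the real aggregation is not documented here.
    """
    missing = set(weights) - set(criterion_scores)
    if missing:
        raise ValueError(f"missing criterion scores: {sorted(missing)}")
    return sum(weights[name] * criterion_scores[name] for name in weights)


# Hypothetical per-criterion scores for a single answer (not measured data).
example_answer = {
    "Empathy": 9.0,
    "Appropriateness": 8.0,
    "Helpfulness": 8.0,
    "Clarity": 9.0,
    "Safety": 10.0,
}

overall = weighted_score(example_answer)
print(f"overall: {overall:.2f}")
```

Under this assumed formula, a one-point change in Empathy moves the overall score by 0.35, three and a half times the effect of the same change in Safety.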
Recent tasks
Empathy
Responding to Imposter Syndrome at a New Job
Imagine you are a supportive mentor. A person has sent you the following message. Write a compassionate and helpful response. 'I need some support. I started a new job a month ago, and I'm feeling completely overwhelmed. Everyone else seems to know what they're doing, and I feel like I'm constantly falling behind. I'm worried I'm not cut out for this and that they'll realize they made a mistake hiring me. I'm losing motivation and just feel anxious all the time. What should I do? How can I handle this feeling of being an imposter?'
Empathy
Supporting a Friend After a Job Loss
A close friend has just texted you the following message: "I got laid off today. They called it a 'restructuring.' I worked there for six years. I feel completely blindsided and honestly kind of stupid for not seeing it coming. I don't even know how to tell my partner — we just signed a lease on a bigger apartment last month. I don't want advice right now, I just needed to tell someone." Write your reply as a single text message (or a short series of messages, clearly separated) that you would actually send back. Your reply should: 1. Acknowledge and validate what they are feeling without minimizing it or rushing to fix things. 2. Respect their explicit request that they do not want advice right now. 3. Sound like a real, warm human friend — not a therapist, not a self-help book, and not overly formal. 4. Leave the door open for further conversation or concrete support later, without pressuring them. Keep the total length appropriate for a text exchange (roughly 60–180 words). Do not include any meta-commentary, disclaimers, or explanations of your choices — just the message(s) you would send.
Empathy
Respond to a Friend Overwhelmed by Caregiving and Work
A friend sends you this message: "I feel like I’m failing at everything. My dad’s health has gotten worse, I’m missing deadlines at work, and every time someone asks how I’m doing I want to disappear. I know other people handle more than this, so I shouldn’t be complaining, but I’m exhausted and numb." Write a reply that is empathetic, supportive, and practical without sounding robotic or overly intense. Keep it between 170 and 260 words. Do not diagnose any mental health condition. Do not promise to solve everything. Include: 1) emotional validation, 2) gentle encouragement to seek support, and 3) two realistic, near-term suggestions for the next 48 hours.
Empathy
Compassionate Response to Job Loss and Family Pressure
Write a reply to the following message from a person seeking emotional support. Your reply should sound human, warm, and respectful. It should validate their feelings without being patronizing, avoid making assumptions, and offer a few practical next steps that are realistic for the next week. Message: "I got laid off two weeks ago and I still haven’t told my parents. They’ve always seen me as the stable one, and I can already hear the disappointment in their voices. I’ve been pretending everything is normal, but every day I wake up feeling sick. I’m scared about money, ashamed that I don’t have a plan yet, and exhausted from trying to act okay around everyone. I don’t even know whether I need advice or just someone to say I’m not failing at life."
Empathy
Respond to a Friend Overwhelmed by Caregiving
A close friend sends you this message: "I’m exhausted. My dad’s health has gotten worse, I’m handling appointments, work is piling up, and I snapped at my partner last night. I feel guilty for not doing enough for anyone. Please don’t give me a cheesy motivational speech. I just need someone to talk to." Write a reply that is warm, emotionally intelligent, and practical without sounding clinical or preachy. Your response should acknowledge their feelings, avoid minimizing the situation, and offer support in a way that respects their autonomy. Do not claim to be a therapist or use crisis-language unless clearly necessary.
Empathy
Responding to an Upset Community Member
You are a volunteer moderator for an online hobbyist forum about vintage synthesizers. A user, "SynthWizard88," is very upset because you removed their post which contained a link to an external site selling their own custom-made synthesizer parts. The forum has a strict "no self-promotion" rule. SynthWizard88 has sent you a private message: "Why was my post deleted?! I spent hours writing it up to help people, and you just deleted it without any warning. This is unfair censorship. I thought this was a community, not a dictatorship." Draft an empathetic, clear, and firm private message back to SynthWizard88. Your response should aim to de-escalate the situation, explain the reasoning, and encourage them to continue participating in the community in a positive way.