Anthropic
High-performance model balancing intelligence and speed. Supports extended thinking and excels at coding, analysis, and complex instruction-following.
Quality Score: 49.1
Arena ELO: 1275
Parameters: Undisclosed
Context: 200K
Use this section to answer one simple question first: how much outside evidence do we have that this model performs well? Structured benchmark scores appear first, then official provider evidence, then live arena signal.
This model has normalized benchmark rows, so scores here are directly comparable across benchmark sources.
Downloads: 0
Likes: 0
Released: Dec 2025
These are recent benchmark or leaderboard claims from official provider sources. They are useful for freshness and context, but they are not treated the same as normalized independent benchmark rows.
Claude-Sonnet-4 - LiveCodeBench
LiveCodeBench pass@1 59.4 across 1055 tasks
Introducing Sonnet 4.6
It’s a full upgrade of the model’s skills across coding, computer use, long-context reasoning, agent planning, knowledge work, and design. Sonnet 4.6 brings much-improved coding skills to more of our users. Performance that would have previously required reaching for an Opus-class model, including on real-world, economically valuable office tasks, is now available with Sonnet 4.6.
Introducing Claude 4
May 22, 2025. Today, we’re introducing the next generation of Claude models: Claude Opus 4 and Claude Sonnet 4, setting new standards for coding, advanced reasoning, and AI agents. Claude Opus 4 is the world’s best coding model, with sustained performance on complex, long-running tasks and agent workflows. Claude Sonnet 4 is a significant upgrade
ELO Score: 1275
95% Confidence: 1267 - 1283 (+/-8 points)
Battles: 8.9K
Last Updated: May 21, 2026
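As a rough illustration of how the arena figures above relate, the sketch below rebuilds the 95% interval from the score and its +/-8 margin, then estimates what a 16-point gap would mean in head-to-head terms. The function names are hypothetical, and the win-probability formula is the classic logistic Elo expectation, which is an assumption here: the page does not say which rating model the arena actually uses.

```python
def confidence_bounds(score: int, margin: int) -> tuple[int, int]:
    """Return the (low, high) bounds for a score with a symmetric margin."""
    return score - margin, score + margin


def expected_score(rating_a: float, rating_b: float) -> float:
    """Classic Elo expectation for A against B (assumed model, 400-point scale)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


# Values taken from the arena block: ELO 1275 with a +/-8 point margin.
low, high = confidence_bounds(1275, 8)
print(low, high)  # 1267 1283

# Under the assumed Elo model, the top of the interval would beat the
# bottom only slightly more often than a coin flip.
print(round(expected_score(high, low), 3))
```

Under that assumption, the two ends of the interval are nearly indistinguishable in a single battle, which is why the interval width matters more than small differences in the headline score.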