Google's most capable Gemini 2.5 model with a 1 million token context window and deep thinking mode. Google lists it on the Gemini API deprecations schedule with Gemini 3 Pro as the recommended replacement.
Google's most capable thinking model with native multimodal understanding.
Quality Score: 62.0
Arena ELO: 1474
Parameters: Undisclosed
Context: 1M tokens
Use this section to answer one simple question first: how much outside evidence do we have that this model performs well? Structured benchmark scores appear first, then official provider evidence, then live arena signal.
This model has normalized benchmark rows, so scores here are directly comparable across benchmark sources.
Released: Mar 2025
These are recent benchmark or leaderboard claims from official provider sources. They are useful for freshness and context, but they are not treated the same as normalized independent benchmark rows.
Gemini 2.5 Pro Benchmark Update
Quality: 27/100 | Price: $3.438/M tokens | Output: 0 tok/s | MMLU: 86.2% | HumanEval: 80.1%
gemini-2.5 - Arena-Hard-Auto
Arena-Hard-Auto official Gemini-2.5 judged score 79.0 with CI -2.1/+1.8
gemini-2.5-pro - SWE-Bench Verified
SWE-Bench Verified resolved rate 53.6%
Gemini 2.5 Pro Benchmark Update
Quality: 27/100 | Price: $3.438/M tokens | Output: 120.586 tok/s | MMLU: 86.2% | HumanEval: 80.1%
Gemini 2.5 Pro Benchmark Update
Quality: 27/100 | Price: $3.438/M tokens | Output: 131.958 tok/s | MMLU: 86.2% | HumanEval: 80.1%
ELO Score: 1246
95% Confidence: 1241 - 1251 (+/-5 points)
Battles: 89.5K
Last Updated: Jun 23, 2026
ELO Score: 1474
95% Confidence: 1470 - 1479 (+/-5 points)
Battles: 19.2K
Last Updated: Jun 23, 2026
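To get a feel for how tight a +/-5 point interval is, the sketch below converts a rating gap into an expected head-to-head win rate. It assumes the classic Elo logistic formula with a 400-point scale. The page does not say how its arena scores are fitted, so treat this as an approximation and not as the site's own method. The function name `elo_expected_score` is illustrative.

```python
def elo_expected_score(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Expected score of A against B under the standard Elo logistic model.

    Assumption: the leaderboard behaves like classic Elo with a 400-point scale.
    """
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / scale))


# Interval for the first leaderboard entry listed on this page.
low, center, high = 1241, 1246, 1251

# Win rate implied by a 5-point gap (interval edge vs. point estimate).
edge_vs_center = elo_expected_score(high, center)

# Win rate implied by a 10-point gap (upper edge vs. lower edge).
edge_vs_edge = elo_expected_score(high, low)

print(f"5-point gap:  {edge_vs_center:.3f}")
print(f"10-point gap: {edge_vs_edge:.3f}")
```

Under this assumption, both gaps map to win rates only slightly above 50%, so uncertainty of this size barely changes how the model would be expected to fare in a single matchup.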