sonnet

Name: sonnet
Rating: 43.7 (1 review)
Author: Google

#340 · Specialized · Open Weights

sonnet is an open-weight specialized model from Google.

Running this yourself: this model can likely run on your own machine.

Model updates (news + changelog): refreshed 28m ago, Jul 5, 2026

Quality Score: 43.7
Arena ELO: 1216
Parameters: Unknown
Context: not listed

Benchmarks and Competitive Signal

Structured

Use this section to answer one simple question first: how much outside evidence do we have that this model performs well? Structured benchmark scores appear first, then official provider evidence, then live arena signal.

This model has normalized benchmark rows, so scores here are directly comparable across benchmark sources.

OSWorld (multimodal)

Similar Models

Phi-4-reasoning-vision-15B (#307) · Microsoft

Official Benchmark Evidence

These are recent benchmark or leaderboard claims from official provider sources. They are useful for freshness and context, but they are not treated the same as normalized independent benchmark rows.

claude-3-5-sonnet-20241022 - Arena-Hard-Auto

Benchmarks · arena-hard-auto · Jul 5, 2026

Arena-Hard-Auto official score, judged by Gemini-2.5: 33.0 (CI -2.3/+1.8)

OpenHands + CodeAct v2.1 (claude-3-5-sonnet-20241022) - SWE-Bench Verified

Benchmarks · swe-bench · Jul 5, 2026

SWE-Bench Verified resolved rate: 53.0

Claude-3.5-Sonnet-20241022 - LiveCodeBench

Benchmarks · livecodebench · Jul 5, 2026

LiveCodeBench pass@1: 48.7 across 1055 tasks

claude-3-5-sonnet - LiveBench Scores

Benchmarks · livebench · Jul 5, 2026

language: 0.5 | coding: 0.0 | instruction_following: 1.0 | Overall: 0.5

claude-3-5-sonnet - LiveBench Scores

Benchmarks · livebench · Jul 4, 2026

language: 0.5 | coding: 0.0 | instruction_following: 1.0 | Overall: 0.5

claude-3-5-sonnet - LiveBench Scores

Benchmarks · livebench · Jul 3, 2026

language: 0.5 | coding: 0.0 | instruction_following: 1.0 | Overall: 0.5
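
As a rough reading aid for the official figures listed above, the sketch below turns the Arena-Hard-Auto score and its confidence offsets into an explicit range, and converts the LiveCodeBench pass@1 rate into an approximate task count. It assumes the "-2.3/1.8" notation means 2.3 points below and 1.8 points above the reported score, and that pass@1 is a percentage of the 1055 tasks; the helper names are hypothetical and not part of any benchmark tooling. Because pass@1 can be averaged over several samples per task, the task count is only an estimate.

```python
def ci_bounds(score, minus, plus):
    """Return (low, high) for a score reported with asymmetric CI offsets.

    Assumption: `minus` and `plus` are absolute point offsets below and
    above the reported score.
    """
    return (round(score - minus, 1), round(score + plus, 1))


def approx_solved(pass_rate_pct, n_tasks):
    """Estimate how many tasks a percentage pass rate corresponds to."""
    return round(pass_rate_pct / 100 * n_tasks)


# Figures taken from the entries above.
arena_hard_low, arena_hard_high = ci_bounds(33.0, 2.3, 1.8)
livecodebench_solved = approx_solved(48.7, 1055)

print(f"Arena-Hard-Auto range: {arena_hard_low} to {arena_hard_high}")
print(f"LiveCodeBench tasks solved (approx.): {livecodebench_solved}")
```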

Arena ELO Ratings

Chatbot Arena

Arena Rank: #112 (107 snapshots)
ELO Score: 1216
95% Confidence: 1214 - 1218 (+/-2 points)
Battles: 113.1K
Last Updated: Jul 5, 2026

Vision Arena

Arena Rank: #120 (106 snapshots)
ELO Score: 1016
95% Confidence: 1005 - 1027 (+/-11 points)
Battles: 12.3K
Last Updated: Jul 5, 2026
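
To make the ELO figures above easier to interpret, here is a minimal sketch using the classic Elo expected-score formula (base 10, scale 400). Whether the arena computes its ratings exactly this way is an assumption; the function names and the 1200-rated opponent are hypothetical and chosen only for illustration. The half-width calculation simply checks the stated confidence ranges against the "+/-" values. Scores from the two arenas come from separate leaderboards, so the sketch does not compare them with each other.

```python
def expected_score(rating_a, rating_b):
    """Classic Elo expected score of A against B (base 10, scale 400)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))


def half_width(low, high):
    """Half the width of a confidence range, i.e. the '+/-' value."""
    return (high - low) / 2


# Confidence ranges as listed for each arena.
chatbot_half = half_width(1214, 1218)
vision_half = half_width(1005, 1027)

# Expected win share of a 1216-rated model against a hypothetical
# 1200-rated opponent on the same leaderboard.
win_share = expected_score(1216, 1200)

print(f"Chatbot Arena: +/-{chatbot_half:g} points")
print(f"Vision Arena: +/-{vision_half:g} points")
print(f"Expected score vs. a 1200-rated opponent: {win_share:.3f}")
```

The wider Vision Arena range is consistent with its smaller battle count (12.3K versus 113.1K): fewer comparisons generally leave more uncertainty in the estimate.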