Grok-3

#646 | Large Language Models | Proprietary

xAI

xAI's most powerful model, trained on real-time data, with strong reasoning capabilities.

Model updates refreshed Jul 5, 2026 | news + changelog

What changed

xAI flagship with real-time knowledge

Quality Score: 44.8 | Arena ELO: 1361 | Parameters: Undisclosed | Context: 131K

Benchmarks and Competitive Signal

Structured

Use this section to answer one simple question first: how much outside evidence do we have that this model performs well? Structured benchmark scores appear first, then official provider evidence, then live arena signal.

This model has normalized benchmark rows, so scores here are directly comparable across benchmark sources.
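As a rough illustration of what normalization buys, the sketch below rescales raw scores onto a shared 0-1 range. The page does not describe its actual normalization method, so the function `normalize_score`, the source names, and the assumed scale maxima are all hypothetical.

```python
def normalize_score(value: float, scale_max: float) -> float:
    """Map a raw score onto a 0-1 range given the top of its native scale."""
    if scale_max <= 0:
        raise ValueError("scale_max must be positive")
    return value / scale_max


# Hypothetical rows: (source name, raw score, assumed top of that source's scale).
rows = [
    ("source_a", 44.8, 100.0),  # assumed to be reported on a 0-100 scale
    ("source_b", 0.5, 1.0),     # assumed to be reported on a 0-1 scale
]

# After rescaling, both values sit on the same 0-1 range and can be compared.
normalized = {name: normalize_score(raw, top) for name, raw, top in rows}
```

Dividing by the scale maximum is the simplest possible choice; a real pipeline might also need to account for differences in task difficulty between sources, which this sketch ignores.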

LiveBench Language | language

Official Benchmark Evidence

These are recent benchmark or leaderboard claims from official provider sources. They are useful for freshness and context, but they are not treated the same as normalized independent benchmark rows.

grok-3 — LiveBench Scores

Benchmarks | livebench | Jul 5, 2026

language: 0.5 | coding: 0.3 | Overall: 0.4
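Score lines in this pipe-delimited form are straightforward to read programmatically. The following is a minimal sketch, assuming every field is a `name: number` pair with no units attached; `parse_metrics` is a hypothetical helper, not part of any leaderboard API.

```python
def parse_metrics(line: str) -> dict[str, float]:
    """Split a 'name: value | name: value' line into a dict of floats."""
    metrics: dict[str, float] = {}
    for field in line.split("|"):
        # partition keeps everything after the first colon as the raw value
        name, _, raw = field.partition(":")
        metrics[name.strip().lower()] = float(raw.strip())
    return metrics


scores = parse_metrics("language: 0.5 | coding: 0.3 | Overall: 0.4")
```

Keys are lowercased so that `Overall` and `overall` resolve to the same entry. Lines that carry units or symbols (for example a price or a throughput figure) would need extra handling before the `float` conversion.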

Grok 3 Benchmark Update

Benchmarks | artificial-analysis | Jul 4, 2026

Quality: 18.4/100 | Price: $8/M tokens | Output: 0 tok/s | MMLU: 0.799 | HumanEval: 0.425

Grok-3-Mini (High) - LiveCodeBench

Benchmarks | livecodebench | Jul 4, 2026

LiveCodeBench pass@1: 78.1 across 1055 tasks
