GPT-4.1 Mini

Name: GPT-4.1 Mini
Price: 20 USD
Availability: InStock
Rating: 42.9 (1 review)
Author: OpenAI

#150 | Large Language Models | Proprietary

OpenAI

Compact, cost-efficient version of GPT-4.1 that retains the 1-million-token context window. Ideal for high-throughput, latency-sensitive applications.

Model updates refreshed 11h ago | Jul 13, 2026 | news + changelog

Quality Score: 42.9

Arena ELO: 1337

Parameters: Undisclosed

Context: 1 million tokens

Benchmarks and Competitive Signal

Use this section to answer one simple question first: how much outside evidence do we have that this model performs well? Structured benchmark scores appear first, then official provider evidence, then live arena signal.

This model has normalized benchmark rows, so scores here are directly comparable across benchmark sources.

BigCodeBench | code

Official Benchmark Evidence

These are recent benchmark or leaderboard claims from official provider sources. They are useful for freshness and context, but they are not treated the same as normalized independent benchmark rows.

gpt-4.1-mini - Arena-Hard-Auto

Benchmarks | arena-hard-auto | Jul 13, 2026

Arena-Hard-Auto official score, judged by Gemini-2.5: 46.9, with CI -2.4/+2.1

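The Arena-Hard-Auto entry reports a score with asymmetric confidence offsets. As a minimal sketch, assuming the two CI figures are offsets below and above the reported score (the listing does not state this explicitly), the interval bounds can be worked out as follows. The function name `ci_bounds` is hypothetical and used only for illustration.

```python
def ci_bounds(score: float, minus: float, plus: float) -> tuple[float, float]:
    """Turn a score and its signed CI offsets into (lower, upper) bounds.

    Assumption: `minus` is the (negative) offset below the score and
    `plus` is the (positive) offset above it.
    """
    return (round(score + minus, 1), round(score + plus, 1))


# Figures taken from the Arena-Hard-Auto entry in this listing.
low, high = ci_bounds(46.9, -2.4, 2.1)
print(f"Arena-Hard-Auto score 46.9, interval [{low}, {high}]")
```

Under that assumption the interval spans roughly 44.5 to 49.0, so differences of a point or two against another model's score on this benchmark fall within the reported noise.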

gpt-4.1-mini-20250414 - SWE-Bench Verified

Benchmarks | swe-bench | Jul 13, 2026

SWE-Bench Verified resolved rate: 23.9

Introducing GPT‑5 for developers

Benchmarks | provider-benchmarks | Jul 9, 2026

OpenAI, August 7, 2025 (Product): Introducing GPT‑5 for developers. The best model for coding and agentic tasks.

We audited SWE-Bench Pro, one of the most widely used AI coding benchmarks, and found it no longer reliably measures frontier coding capability. We find 30% of SWE-Bench Pro tasks to be broken, and ar

Benchmarks | x-twitter | Jul 8, 2026

We audited SWE-Bench Pro, one of the most widely used AI coding benchmarks, and found it no longer reliably measures frontier coding capability. We find the eval to be saturated at a ~70% noise ceilin

Benchmarks | x-twitter | Jul 8, 2026

gpt-4.1-mini-20250414 - SWE-Bench Verified

Benchmarks | swe-bench | Apr 1, 2026

SWE-Bench Verified resolved rate: 23.9

Arena ELO Ratings

Vision Arena

Arena Rank: #61 (109 snapshots)

ELO Score: 1202

95% Confidence: 1194 - 1210 (+/-8 points)

Battles: 40.9K

Last Updated: Jul 13, 2026

Chatbot Arena

Arena Rank: #31 (109 snapshots)

ELO Score: 1337

95% Confidence: 1333 - 1341 (+/-4 points)

Battles: 19.4K

Last Updated: Jul 13, 2026
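To make the ELO figures above easier to interpret, here is a minimal sketch of two small calculations: the half-width of a reported 95% interval, and the expected head-to-head score implied by a rating gap under the classic Elo logistic formula. Whether the arena leaderboards use exactly this formula is an assumption (they may fit a related model), and the opponent rating of 1300 is a hypothetical value chosen only for illustration. Ratings from different arenas (Vision vs. Chatbot) are not compared here, since they come from separate pools.

```python
def half_width(lower: float, upper: float) -> float:
    """Half the width of a confidence interval, i.e. the +/- figure."""
    return (upper - lower) / 2.0


def elo_expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of A against B under the classic Elo logistic model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


# Intervals as reported in this listing.
print("Vision Arena +/-", half_width(1194, 1210))
print("Chatbot Arena +/-", half_width(1333, 1341))

# Hypothetical opponent rated 1300 in the same Chatbot Arena pool.
p = elo_expected_score(1337, 1300)
print(f"Expected score vs. a 1300-rated model: {p:.3f}")
```

The half-widths reproduce the +/-8 and +/-4 point figures shown for the two arenas. The expected-score line illustrates that a gap of a few dozen points corresponds to only a modest edge in pairwise preference.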