minimax-m2.5 - SWE-Bench Verified
SWE-Bench Verified resolved rate 75.8
View sourceMiniMax
MiniMax reasoning and coding model family tuned for agentic software workflows and long-context production runs, with open weights for private cluster deployment.
55.5
Quality Score
---
Arena ELO
Undisclosed
Parameters
197K
Context
Sign in to join the discussion
899.2K
Downloads
1.4K
Likes
Feb 2026
Released
Benchmarks
6
API
1
Open Source
2
Research
1
General
3
Recent launch, pricing, benchmark, and API signals linked to this model or its provider.
SWE-Bench Verified resolved rate 75.8
View sourceQuality: 41.9/100 | Price: $0.525/M tokens | Output: 56.156 tok/s | HumanEval: 0.426%
MiniMax 开源新评测集:定义Coding Agent 的生产级标准 - MiniMax News | MiniMax 模型 文本 MiniMax M2.7 MiniMax M2.5 MiniMax M2-Her MiniMax M2.1 MiniMax M2 语音 MiniMax Speech 2.8 MiniMax Speech 2.6 MiniMax Speech 2.5 视频 MiniMax Hailuo 2.3 / 2.3 Fast MiniMax Hailuo 02 音乐 MiniMax Music 2.6 MiniMax Music 2.5+ MiniMax Music 2.5 MiniMax Music 2.0 MiniMax Music 1.5 产品 AI原生应用 MiniMax 桌面版 Agent 海螺视频 语音 星野 开放平台 即刻接入AI能力 文档中心 Token Plan 产品定价 平台登录 新闻动态 关于我们 与所有人共创智能 公司介绍 投资者关系 加入我们 EN 登录 API 开放平台 MiniMax Agent
Quality: 41.9/100 | Price: $0.525/M tokens | Output: 56.156 tok/s | HumanEval: 0.426%
View sourceSWE-Bench Verified resolved rate 75.8
View sourceQuality: 41.9/100 | Price: $0.525/M tokens | Output: 47.255 tok/s | HumanEval: 0.426%
View sourceQuality: 41.9/100 | Price: $0.525/M tokens | Output: 47.255 tok/s | HumanEval: 0.426%
Quality: 36.1/100 | Price: $0.525/M tokens | Output: 54.955 tok/s | MMLU: 0.82% | HumanEval: 0.826%
We study parallel test-time scaling for long-horizon agentic tasks such as agentic search and deep research, where multiple rollouts are generated in parallel and aggregated into a final response. While such scaling has proven effective for chain-of-thought reasoning, agentic tasks pose unique challenges: trajectories are long, multi-turn, and tool-augmented, and outputs are often open-ended. Aggregating only final answers discards rich information from trajectories, while concatenating all trajectories exceeds the model's context window. To address this, we propose AggAgent, an aggregation agent that treats parallel trajectories as an environment. We equip it with lightweight tools to inspect candidate solutions and search across trajectories, enabling it to navigate and synthesize information on demand. Across six benchmarks and three model families (GLM-4.7, Qwen3.5, MiniMax-M2.5), AggAgent outperforms all existing aggregation methods-by up to 5.3% absolute on average and 10.3% on two deep research tasks-while adding minimal overhead, as the aggregation cost remains bounded by a single agentic rollout. Our findings establish agentic aggregation as an effective and cost-efficient approach to parallel test-time scaling.
MiniMax-M2.5 is now available through Ollama Cloud. 198K context window listed. MiniMax-M2.5 is a state-of-the-art large language model designed for real-world productivity and coding tasks.
Designed for high-throughput, low-latency production environments. M2.5 delivers industry-leading coding and reasoning capabilities at a fraction of the cost.
MiniMax says the M2 family is open-sourced with official self-host guidance for private deployment using runtimes like vLLM and SGLang.
SWE-Bench Verified resolved rate 75.8
MiniMax 开源新评测集:定义Coding Agent 的生产级标准 - MiniMax News | MiniMax 模型 文本 MiniMax M2.7 MiniMax M2.5 MiniMax M2-Her MiniMax M2.1 MiniMax M2 语音 MiniMax Speech 2.8 MiniMax Speech 2.6 MiniMax Speech 2.5 视频 MiniMax Hailuo 2.3 / 2.3 Fast MiniMax Hailuo 02 音乐 MiniMax Music 2.6 MiniMax Music 2.5+ MiniMax Music 2.5 MiniMax Music 2.0 MiniMax Music 1.5 产品 AI原生应用 MiniMax 桌面版 Agent 海螺视频 语音 星野 开放平台 即刻接入AI能力 文档中心 Token Plan 产品定价 平台登录 新闻动态 关于我们 与所有人共创智能 公司介绍 投资者关系 加入我们 EN 登录 API 开放平台 MiniMax Agent
SWE-Bench Verified resolved rate 75.8