minimax-m2.5 - SWE-Bench Verified
SWE-Bench Verified resolved rate 75.8
View sourceMiniMax
MiniMax reasoning and coding model family tuned for agentic software workflows and long-context production runs, with open weights for private cluster deployment.
Running this yourself: can likely run on your own machine.
45.9
Quality Score
---
Arena ELO
Unknown
Parameters
1M
Context
Sign in to join the discussion
0
Downloads
0
Likes
Mar 2026
Released
Benchmarks
3
API
1
Open Source
2
Research
1
General
3
Recent launch, pricing, benchmark, and API signals linked to this model or its provider.
SWE-Bench Verified resolved rate 75.8
View sourceMiniMax 开源新评测集:定义Coding Agent 的生产级标准 - MiniMax News | MiniMax 模型 文本 MiniMax M2.7 MiniMax M2.5 MiniMax M2-Her MiniMax M2.1 MiniMax M2 语音 MiniMax Speech 2.8 MiniMax Speech 2.6 MiniMax Speech 2.5 视频 MiniMax Hailuo 2.3 / 2.3 Fast MiniMax Hailuo 02 音乐 MiniMax Music 2.6 MiniMax Music 2.5+ MiniMax Music 2.5 MiniMax Music 2.0 MiniMax Music 1.5 产品 AI原生应用 MiniMax 桌面版 Agent 海螺视频 语音 星野 开放平台 即刻接入AI能力 文档中心 Token Plan 产品定价 平台登录 新闻动态 关于我们 与所有人共创智能 公司介绍 投资者关系 加入我们 EN 登录 API 开放平台 MiniMax Agent
MiniMax M2.5 - SOTA in Coding and Agent, Designed for Agent Universe | MiniMax Models TEXT MiniMax M2.7 MiniMax M2.5 MiniMax M2-Her MiniMax M2.1 MiniMax M2 SPEECH MiniMax Speech 2.8 MiniMax Speech 2.6 MiniMax Speech 2.5 VIDEO MiniMax Hailuo 2.3 / 2.3 Fast MiniMax Hailuo 02 MUSIC MiniMax Music 2.6 MiniMax Music 2.5+ MiniMax Music 2.5 MiniMax Music 2.0 MiniMax Music 1.5 Product AI-native Applications MiniMax Desktop Agent Video Hailuo Audio Talkie API Develop On MiniMax Develop
View sourceDesigned for high-throughput, low-latency production environments. M2.5 delivers industry-leading coding and reasoning capabilities at a fraction of the cost.
View sourceMiniMax says the M2 family is open-sourced with official self-host guidance for private deployment using runtimes like vLLM and SGLang.
View sourceAgentic modeling aims to transform LLMs into autonomous agents capable of solving complex tasks through planning, reasoning, tool use, and multi-turn interaction with environments. Despite major investment, open research remains constrained by infrastructure and training gaps. Many high-performing systems rely on proprietary codebases, models, or services, while most open-source frameworks focus on orchestration and evaluation rather than scalable agent training. We present Orchard, an open-source framework for scalable agentic modeling. At its core is Orchard Env, a lightweight environment service providing reusable primitives for sandbox lifecycle management across task domains, agent harnesses, and pipeline stages. On top of Orchard Env, we build three agentic modeling recipes. Orchard-SWE targets coding agents. We distill 107K trajectories from MiniMax-M2.5 and Qwen3.5-397B, introduce credit-assignment SFT to learn from productive segments of unresolved trajectories, and apply Balanced Adaptive Rollout for RL. Starting from Qwen3-30B-A3B-Thinking, Orchard-SWE achieves 64.3% on SWE-bench Verified after SFT and 67.5% after SFT+RL, setting a new state of the art among open-source models of comparable size. Orchard-GUI trains a 4B vision-language computer-use agent using only 0.4K distilled trajectories and 2.2K open-ended tasks. It achieves 74.1%, 67.0%, and 64.0% success rates on WebVoyager, Online-Mind2Web, and DeepShop, respectively, making it the strongest open-source model while remaining competitive with proprietary systems. Orchard-Claw targets personal assistant agents. Trained with only 0.2K synthetic tasks, it achieves 59.6% pass@3 on Claw-Eval and 73.9% when paired with a stronger ZeroClaw harness. Collectively, these results show that a lightweight, open, harness-agnostic environment layer enables reusable agentic data, training recipes, and evaluations across domains.
MiniMax M2.5 is now available through Ollama Cloud. 198K context window listed. MiniMax-M2.5 is a state-of-the-art large language model designed for real-world productivity and coding tasks.
Designed for high-throughput, low-latency production environments. M2.5 delivers industry-leading coding and reasoning capabilities at a fraction of the cost.
MiniMax says the M2 family is open-sourced with official self-host guidance for private deployment using runtimes like vLLM and SGLang.
SWE-Bench Verified resolved rate 75.8
MiniMax 开源新评测集:定义Coding Agent 的生产级标准 - MiniMax News | MiniMax 模型 文本 MiniMax M2.7 MiniMax M2.5 MiniMax M2-Her MiniMax M2.1 MiniMax M2 语音 MiniMax Speech 2.8 MiniMax Speech 2.6 MiniMax Speech 2.5 视频 MiniMax Hailuo 2.3 / 2.3 Fast MiniMax Hailuo 02 音乐 MiniMax Music 2.6 MiniMax Music 2.5+ MiniMax Music 2.5 MiniMax Music 2.0 MiniMax Music 1.5 产品 AI原生应用 MiniMax 桌面版 Agent 海螺视频 语音 星野 开放平台 即刻接入AI能力 文档中心 Token Plan 产品定价 平台登录 新闻动态 关于我们 与所有人共创智能 公司介绍 投资者关系 加入我们 EN 登录 API 开放平台 MiniMax Agent
MiniMax M2.5 - SOTA in Coding and Agent, Designed for Agent Universe | MiniMax Models TEXT MiniMax M2.7 MiniMax M2.5 MiniMax M2-Her MiniMax M2.1 MiniMax M2 SPEECH MiniMax Speech 2.8 MiniMax Speech 2.6 MiniMax Speech 2.5 VIDEO MiniMax Hailuo 2.3 / 2.3 Fast MiniMax Hailuo 02 MUSIC MiniMax Music 2.6 MiniMax Music 2.5+ MiniMax Music 2.5 MiniMax Music 2.0 MiniMax Music 1.5 Product AI-native Applications MiniMax Desktop Agent Video Hailuo Audio Talkie API Develop On MiniMax Develop