Mistral AI
Mistral Medium 3.1 is an updated version of Mistral Medium 3, a high-performance, enterprise-grade language model designed to deliver frontier-level capabilities at significantly reduced operational cost. It balances...
Running this yourself: can likely run on your own machine.
Quality Score: 38.0
Arena ELO: 1163
Parameters: Unknown
Context: 131K
Downloads: 0
Likes: 0
Released: Aug 2025
Launches: 3
Benchmarks: 20
Research: 2
General: 5
Recent launch, pricing, benchmark, and API signals linked to this model or its provider.
Quality: 14.7/100 | Price: $0.8/M tokens | Output: 85.2 tok/s | MMLU: 0.683 | HumanEval: 0.406
We're taking on the hardest problems in the real world 🏗️🚚 🛫⚛️ Today at The AI Now Summit, held at the Louvre, we announced AI solutions for aerospace, automotive, energy, and physics. Deployed in production at @Airbus, @BMW, @EDFofficiel, and more. More below: https://t.co/L6Vq7RdABg
Quality: 14.7/100 | Price: $0.8/M tokens | Output: 84.541 tok/s | MMLU: 0.683 | HumanEval: 0.406
Customers are not an abstraction for us: we exist to help enterprises, public institutions, and industries build their own intelligence, so the value created from their data, workflows, feedback, and models accrues to them rather than to model providers.


Mistral officially has 1,000 team members around the world! Thank you to our incredible talent for believing in the mission and contributing to what we are building each and every day. You too can be a part of the next chapter in our journey as we continue to grow and serve our https://t.co/nkxE0IQJi2

Quality: 14.7/100 | Price: $0.8/M tokens | Output: 82.436 tok/s | MMLU: 0.683 | HumanEval: 0.406
Quality: 14.7/100 | Price: $0.8/M tokens | Output: 81.527 tok/s | MMLU: 0.683 | HumanEval: 0.406
Quality: 14.7/100 | Price: $0.8/M tokens | Output: 86.923 tok/s | MMLU: 0.683 | HumanEval: 0.406
Quality: 14.7/100 | Price: $0.8/M tokens | Output: 86.774 tok/s | MMLU: 0.683 | HumanEval: 0.406
Quality: 14.8/100 | Price: $0.8/M tokens | Output: 92.461 tok/s | MMLU: 0.683 | HumanEval: 0.406
Quality: 14.8/100 | Price: $0.8/M tokens | Output: 86.538 tok/s | MMLU: 0.683 | HumanEval: 0.406
Quality: 14.8/100 | Price: $0.8/M tokens | Output: 85.25 tok/s | MMLU: 0.683 | HumanEval: 0.406
Quality: 14.8/100 | Price: $0.8/M tokens | Output: 85.05 tok/s | MMLU: 0.683 | HumanEval: 0.406
Quality: 14.8/100 | Price: $0.8/M tokens | Output: 89.167 tok/s | MMLU: 0.683 | HumanEval: 0.406
Quality: 14.8/100 | Price: $0.8/M tokens | Output: 90.572 tok/s | MMLU: 0.683 | HumanEval: 0.406
Quality: 14.8/100 | Price: $0.8/M tokens | Output: 91.177 tok/s | MMLU: 0.683 | HumanEval: 0.406
Quality: 14.8/100 | Price: $0.8/M tokens | Output: 91.773 tok/s | MMLU: 0.683 | HumanEval: 0.406
Quality: 14.8/100 | Price: $0.8/M tokens | Output: 91.255 tok/s | MMLU: 0.683 | HumanEval: 0.406
Quality: 14.8/100 | Price: $0.8/M tokens | Output: 88.972 tok/s | MMLU: 0.683 | HumanEval: 0.406
Quality: 14.8/100 | Price: $0.8/M tokens | Output: 86.415 tok/s | MMLU: 0.683 | HumanEval: 0.406
Quality: 14.8/100 | Price: $0.8/M tokens | Output: 82.129 tok/s | MMLU: 0.683 | HumanEval: 0.406
Quality: 14.8/100 | Price: $0.8/M tokens | Output: 78.205 tok/s | MMLU: 0.683 | HumanEval: 0.406
Quality: 14.8/100 | Price: $0.8/M tokens | Output: 76.269 tok/s | MMLU: 0.683 | HumanEval: 0.406
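The price and output-speed fields in the entries above allow a rough estimate of what a generation would cost and how long it would take. The sketch below is illustrative only. It assumes the listed $0.8/M tokens applies to generated tokens (the listing does not say whether the price covers input, output, or a blend), and `estimate_generation` is a hypothetical helper, not part of any Mistral API.

```python
from statistics import mean

# Distinct output-speed readings (tok/s) copied from the signal entries above.
OUTPUT_TOK_S = [
    85.2, 84.541, 82.436, 81.527, 86.923, 86.774, 92.461, 86.538, 85.25, 85.05,
    89.167, 90.572, 91.177, 91.773, 91.255, 88.972, 86.415, 82.129, 78.205, 76.269,
]

# Listed price. ASSUMPTION: treated here as the price per million output tokens.
PRICE_USD_PER_M_TOKENS = 0.8


def estimate_generation(tokens, tok_per_s, price_per_m=PRICE_USD_PER_M_TOKENS):
    """Return (seconds, usd) for generating `tokens` tokens at a steady rate."""
    if tok_per_s <= 0:
        raise ValueError("tok_per_s must be positive")
    seconds = tokens / tok_per_s
    usd = tokens / 1_000_000 * price_per_m
    return seconds, usd


if __name__ == "__main__":
    # Compare the slowest, average, and fastest readings for a 2,000-token reply.
    for label, rate in [
        ("slowest", min(OUTPUT_TOK_S)),
        ("mean", mean(OUTPUT_TOK_S)),
        ("fastest", max(OUTPUT_TOK_S)),
    ]:
        seconds, usd = estimate_generation(2_000, rate)
        print(f"{label:8s} {rate:7.3f} tok/s -> {seconds:6.1f} s, ${usd:.4f}")
```

The cost does not depend on the speed reading, so the spread in the readings affects only latency. Time to first token and network overhead are ignored.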
We show the standard basis of transformer hidden states already provides a training-free, architecture-general feature basis. Individual dimensions encode semantic content via their signs (+/-1) and confidence via their magnitudes, acting as independent binary registers; a feature is a subset of dimensions with a consistent sign pattern, read by counting sign agreements with no learned rotation. We validate this Bag of Dims framework across seven models spanning language (Qwen 3.5-4B, Gemma 3-4B, Mistral 7B, Qwen3-32B), vision (DINOv2, ViT-Base), and audio (AST). Signs alone carry predictive content: unit-magnitude sign patterns preserve 60-93% top-5 next-token accuracy through the LM head, and decoder-free Hamming scoring reaches 80-90% top-4096. From a single-token cache (one forward pass per token, no context, no labels), we detect 175 categories at AUC 0.97-0.99 by sign agreement; a trained probe adds only +0.018 AUC and converges to axis-aligned weights. These features are causally operative: they survive the K/V attention projections, trace to the FFN neuron coalitions that write them (random-weight controls never reproduce this), and flipping a feature's signs during the live forward pass suppresses its concept across four language models, magnitude-matched and concept-specific. Dimensions stay independent throughout (pairwise mutual information below 0.006 bits). The structure is not specific to language: the same per-dimension signs appear in self-supervised vision (DINOv2, 9/12 ImageNet superclasses), supervised vision (ViT-Base, 11/12), and audio (AST, 50/50 ESC-50 categories), so it reflects transformer training in general, not the language-modeling objective. The standard basis already suffices for feature reading at one forward pass, no optimization, no GPU-days. The open problem shifts from finding the right rotation to cataloging what each dimension encodes.
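The sign-agreement readout described in the abstract can be sketched in a few lines. This is a toy illustration on a plain list, not the paper's implementation. The hidden-state values, the feature definition (a mapping from dimension index to expected sign), and the function names are all hypothetical, and treating zero as positive is an arbitrary convention.

```python
from typing import Dict, List


def sign(x: float) -> int:
    """Sign of x as +1 or -1; zero counts as +1 (arbitrary convention)."""
    return 1 if x >= 0 else -1


def sign_agreement(hidden: List[float], feature: Dict[int, int]) -> float:
    """Fraction of the feature's dimensions whose sign in `hidden` matches."""
    if not feature:
        raise ValueError("feature must name at least one dimension")
    hits = sum(1 for dim, expected in feature.items() if sign(hidden[dim]) == expected)
    return hits / len(feature)


def flip_feature(hidden: List[float], feature: Dict[int, int]) -> List[float]:
    """Copy of `hidden` with each feature dimension set to the opposite of its
    expected sign while keeping its magnitude. Other dimensions are untouched."""
    out = list(hidden)
    for dim, expected in feature.items():
        out[dim] = -expected * abs(hidden[dim])
    return out


if __name__ == "__main__":
    hidden = [0.9, -0.2, 1.5, -3.0, 0.1]   # made-up hidden state
    feature = {0: +1, 2: +1, 3: -1}        # made-up feature: dims 0, 2, 3
    print(sign_agreement(hidden, feature))
    print(sign_agreement(flip_feature(hidden, feature), feature))
```

Reading a feature this way needs only comparisons and a count, which is the sense in which the abstract calls the approach training-free. The flip helper is a loose stand-in for the intervention the abstract describes, which is applied during a live forward pass.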