Qwen3 VL 32B Instruct Benchmark Update
Quality: 11.1/100 | Price: $1.225/M tokens | Output: 72.045 tok/s | MMLU: 0.791% | HumanEval: 0.514%
View sourceQwen
Qwen3-VL-32B-Instruct is a large-scale multimodal vision-language model designed for high-precision understanding and reasoning across text, images, and video. With 32 billion parameters, it combines deep visual perception with advanced text...
Running this yourself: likely needs a rented cloud gpu.
26.1
Quality Score
---
Arena ELO
32B
Parameters
262K
Context
Sign in to join the discussion
0
Downloads
0
Likes
Oct 2025
Released
Benchmarks
19
Open Source
1
Recent launch, pricing, benchmark, and API signals linked to this model or its provider.
Quality: 11.1/100 | Price: $1.225/M tokens | Output: 72.045 tok/s | MMLU: 0.791% | HumanEval: 0.514%
View sourceQuality: 11.1/100 | Price: $1.225/M tokens | Output: 72.045 tok/s | MMLU: 0.791% | HumanEval: 0.514%
Quality: 11.1/100 | Price: $1.225/M tokens | Output: 72.597 tok/s | MMLU: 0.791% | HumanEval: 0.514%
Quality: 11.1/100 | Price: $1.225/M tokens | Output: 75.265 tok/s | MMLU: 0.791% | HumanEval: 0.514%
View sourceQuality: 11.1/100 | Price: $1.225/M tokens | Output: 72.597 tok/s | MMLU: 0.791% | HumanEval: 0.514%
View sourceQuality: 11.1/100 | Price: $1.225/M tokens | Output: 68.609 tok/s | MMLU: 0.791% | HumanEval: 0.514%
View sourceQuality: 11.1/100 | Price: $1.225/M tokens | Output: 72.597 tok/s | MMLU: 0.791% | HumanEval: 0.514%
Quality: 11.1/100 | Price: $1.225/M tokens | Output: 75.265 tok/s | MMLU: 0.791% | HumanEval: 0.514%
Quality: 11.1/100 | Price: $1.225/M tokens | Output: 72.597 tok/s | MMLU: 0.791% | HumanEval: 0.514%
Quality: 11.1/100 | Price: $1.225/M tokens | Output: 68.609 tok/s | MMLU: 0.791% | HumanEval: 0.514%
Quality: 11.1/100 | Price: $1.225/M tokens | Output: 68.511 tok/s | MMLU: 0.791% | HumanEval: 0.514%
Quality: 11.1/100 | Price: $1.225/M tokens | Output: 72.309 tok/s | MMLU: 0.791% | HumanEval: 0.514%
Quality: 11.1/100 | Price: $1.225/M tokens | Output: 71.745 tok/s | MMLU: 0.791% | HumanEval: 0.514%
Quality: 11.1/100 | Price: $1.225/M tokens | Output: 69.237 tok/s | MMLU: 0.791% | HumanEval: 0.514%
Quality: 11.1/100 | Price: $1.225/M tokens | Output: 68.65 tok/s | MMLU: 0.791% | HumanEval: 0.514%
Quality: 11.1/100 | Price: $1.225/M tokens | Output: 69.089 tok/s | MMLU: 0.791% | HumanEval: 0.514%
Quality: 11.1/100 | Price: $1.225/M tokens | Output: 70.179 tok/s | MMLU: 0.791% | HumanEval: 0.514%
Quality: 11.1/100 | Price: $1.225/M tokens | Output: 70.88 tok/s | MMLU: 0.791% | HumanEval: 0.514%
Quality: 11.1/100 | Price: $1.225/M tokens | Output: 72.073 tok/s | MMLU: 0.791% | HumanEval: 0.514%
Quality: 11.1/100 | Price: $1.225/M tokens | Output: 74.08 tok/s | MMLU: 0.791% | HumanEval: 0.514%
Quality: 11.1/100 | Price: $1.225/M tokens | Output: 76.14 tok/s | MMLU: 0.791% | HumanEval: 0.514%
Quality: 11.1/100 | Price: $1.225/M tokens | Output: 72.251 tok/s | MMLU: 0.791% | HumanEval: 0.514%
Quality: 11.1/100 | Price: $1.225/M tokens | Output: 72.878 tok/s | MMLU: 0.791% | HumanEval: 0.514%
Quality: 11.1/100 | Price: $1.225/M tokens | Output: 72.878 tok/s | MMLU: 0.791% | HumanEval: 0.514%
Qwen3 VL 32B Instruct is now available through local Ollama runtime and Ollama Cloud. 256K context window listed. The most powerful vision-language model in the Qwen model family to date.