Qwen3 VL 32B Instruct Benchmark Update
Quality: 17.2/100 | Price: $1.225/M tokens | Output: 70.6 tok/s | MMLU: 0.791% | HumanEval: 0.514%
View sourceQwen
Qwen3-VL-32B-Instruct is a large-scale multimodal vision-language model designed for high-precision understanding and reasoning across text, images, and video. With 32 billion parameters, it combines deep visual perception with advanced text...
Running this yourself: likely needs a rented cloud gpu.
41.5
Quality Score
---
Arena ELO
32B
Parameters
262K
Context
Sign in to join the discussion
0
Downloads
0
Likes
Oct 2025
Released
Benchmarks
19
Open Source
1
Recent launch, pricing, benchmark, and API signals linked to this model or its provider.
Quality: 17.2/100 | Price: $1.225/M tokens | Output: 70.6 tok/s | MMLU: 0.791% | HumanEval: 0.514%
View sourceQuality: 17.2/100 | Price: $1.225/M tokens | Output: 70.6 tok/s | MMLU: 0.791% | HumanEval: 0.514%
Quality: 17.2/100 | Price: $1.225/M tokens | Output: 68.594 tok/s | MMLU: 0.791% | HumanEval: 0.514%
Quality: 17.2/100 | Price: $1.225/M tokens | Output: 54.689 tok/s | MMLU: 0.791% | HumanEval: 0.514%
View sourceQuality: 17.2/100 | Price: $1.225/M tokens | Output: 54.933 tok/s | MMLU: 0.791% | HumanEval: 0.514%
View sourceQuality: 17.2/100 | Price: $1.225/M tokens | Output: 55.23 tok/s | MMLU: 0.791% | HumanEval: 0.514%
View sourceQuality: 17.2/100 | Price: $1.225/M tokens | Output: 68.594 tok/s | MMLU: 0.791% | HumanEval: 0.514%
Quality: 17.2/100 | Price: $1.225/M tokens | Output: 54.689 tok/s | MMLU: 0.791% | HumanEval: 0.514%
Quality: 17.2/100 | Price: $1.225/M tokens | Output: 54.933 tok/s | MMLU: 0.791% | HumanEval: 0.514%
Quality: 17.2/100 | Price: $1.225/M tokens | Output: 55.23 tok/s | MMLU: 0.791% | HumanEval: 0.514%
Quality: 17.2/100 | Price: $1.225/M tokens | Output: 54.882 tok/s | MMLU: 0.791% | HumanEval: 0.514%
Quality: 17.2/100 | Price: $1.225/M tokens | Output: 57.97 tok/s | MMLU: 0.791% | HumanEval: 0.514%
Quality: 17.2/100 | Price: $1.225/M tokens | Output: 64.629 tok/s | MMLU: 0.791% | HumanEval: 0.514%
Quality: 17.2/100 | Price: $1.225/M tokens | Output: 74.299 tok/s | MMLU: 0.791% | HumanEval: 0.514%
Quality: 17.2/100 | Price: $1.225/M tokens | Output: 76.098 tok/s | MMLU: 0.791% | HumanEval: 0.514%
Quality: 17.2/100 | Price: $1.225/M tokens | Output: 77.205 tok/s | MMLU: 0.791% | HumanEval: 0.514%
Quality: 17.2/100 | Price: $1.225/M tokens | Output: 77.273 tok/s | MMLU: 0.791% | HumanEval: 0.514%
Quality: 17.2/100 | Price: $1.225/M tokens | Output: 75.518 tok/s | MMLU: 0.791% | HumanEval: 0.514%
Quality: 17.2/100 | Price: $1.225/M tokens | Output: 84.352 tok/s | MMLU: 0.791% | HumanEval: 0.514%
Quality: 17.2/100 | Price: $1.225/M tokens | Output: 83.329 tok/s | MMLU: 0.791% | HumanEval: 0.514%
Quality: 17.2/100 | Price: $1.225/M tokens | Output: 80.757 tok/s | MMLU: 0.791% | HumanEval: 0.514%
Quality: 17.2/100 | Price: $1.225/M tokens | Output: 79.568 tok/s | MMLU: 0.791% | HumanEval: 0.514%
Quality: 17.2/100 | Price: $1.225/M tokens | Output: 78.61 tok/s | MMLU: 0.791% | HumanEval: 0.514%
Quality: 17.2/100 | Price: $1.225/M tokens | Output: 82.685 tok/s | MMLU: 0.791% | HumanEval: 0.514%
Qwen3 VL 32B Instruct is now available through local Ollama runtime and Ollama Cloud. 256K context window listed. The most powerful vision-language model in the Qwen model family to date.