Qwen3.5 - GAIA
GAIA score 32.6 from TJ-0405
Qwen
Qwen3.5-0.8B is an open-weight specialized model from Qwen.
Running this yourself: the model can likely run on your own machine.
Quality Score: 47.5
Arena ELO: ---
Parameters: 800M
Context: ---
Downloads: 2.1M
Likes: 466
Released: Feb 2026
Benchmarks: 4
Open Source: 1
Research: 1
Recent launch, pricing, benchmark, and API signals linked to this model or its provider.
GAIA score 32.6 from TJ-0405
SWE-Bench Verified resolved rate 69.6
SWE-Bench Verified resolved rate 69.6
GAIA score 44.2 from WA0824
Qwen3.5-0.8B is now available through the local Ollama runtime, with a 256K context window listed. Qwen 3.5 is a family of open-source multimodal models that delivers exceptional utility and performance.
We introduce WriteSAE, the first sparse autoencoder that decomposes and edits the matrix cache write of state-space and hybrid recurrent language models, where residual SAEs cannot reach. Existing SAEs read residual streams, but Gated DeltaNet, Mamba-2, and RWKV-7 write to a $d_k \times d_v$ cache through rank-1 updates $k_t v_t^\top$ that no vector atom can replace. WriteSAE factors each decoder atom into the native write shape, exposes a closed form for the per-token logit shift, and trains under matched Frobenius norm so atoms swap one cache slot at a time. Atom substitution beats matched-norm ablation on 92.4% of $n = 4{,}851$ firings at Qwen3.5-0.8B L9 H4, the 87-atom population test holds at 89.8%, the closed form predicts measured effects at $R^2 = 0.98$, and Mamba-2-370M substitutes at 88.1% over 2,500 firings. Sustained three-position installs at $3\times$ lift midrank target-in-continuation from 33.3% to 100% under greedy decoding, the first behavioral install at the matrix-recurrent write site.
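The rank-1 cache write and the matched Frobenius norm mentioned in the abstract can be illustrated with a toy sketch. This is not the WriteSAE implementation, whose details the abstract does not give. The dimensions, vectors, and function names (`outer`, `frobenius`, `rank1_write`, `match_norm`) are all hypothetical. It only shows that a write of the form $k_t v_t^\top$ is a matrix-shaped update, and that a substitute rank-1 atom can be rescaled to carry the same Frobenius norm as the write it replaces.

```python
import math

def outer(k, v):
    """Rank-1 matrix k v^T as a list of rows (shape d_k x d_v)."""
    return [[ki * vj for vj in v] for ki in k]

def frobenius(m):
    """Frobenius norm: square root of the sum of squared entries."""
    return math.sqrt(sum(x * x for row in m for x in row))

def rank1_write(cache, k, v):
    """Return cache + k v^T without mutating the input cache."""
    update = outer(k, v)
    return [[c + u for c, u in zip(c_row, u_row)]
            for c_row, u_row in zip(cache, update)]

def match_norm(atom_k, atom_v, target_norm):
    """Rescale atom_k so that the atom's outer product has the target norm."""
    scale = target_norm / frobenius(outer(atom_k, atom_v))
    return [scale * x for x in atom_k], atom_v

# Toy sizes and values, chosen only for illustration.
d_k, d_v = 3, 2
cache = [[0.0] * d_v for _ in range(d_k)]

k = [1.0, 2.0, 2.0]
v = [3.0, 4.0]
original_norm = frobenius(outer(k, v))

# Hypothetical substitute atom, rescaled to the norm of the original write.
atom_k, atom_v = match_norm([0.0, 1.0, 0.0], [1.0, 0.0], original_norm)
new_cache = rank1_write(cache, atom_k, atom_v)

print(original_norm, frobenius(new_cache))
```

Because the Frobenius norm of an outer product equals the product of the two vector norms, scaling only the key side is enough to match norms in this sketch. A real substitution would operate on a model's actual cache state rather than a zero matrix.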