Alibaba/Mixture of Experts

Qwen 3.5 35B A3B

chat, coding, reasoning, multilingual, vision, math
Parameters: 35B (3B active)
Context length: 256K
Benchmarks: 14
Quantizations: 4
HF downloads: 300K
Architecture: MoE
Released: 2026-02-01
Layers: 64
KV Heads: 4
Head Dim: 128
Family: qwen
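
The Layers, KV Heads, and Head Dim figures above are the inputs to a KV-cache size estimate. The sketch below is a rough upper bound, not a measurement. It assumes every one of the 64 layers keeps a standard key/value cache, that the cache is stored in FP16 (2 bytes per value), and that "256K" means 256 × 1024 tokens. None of these is stated on the card, and a model that mixes in other attention variants would need less.

```python
# Rough KV-cache size estimate from the card's architecture figures.
LAYERS = 64          # from the card
KV_HEADS = 4         # from the card
HEAD_DIM = 128       # from the card
BYTES_PER_VALUE = 2  # assumption: FP16 cache, not stated on the card


def kv_bytes_per_token(layers: int = LAYERS, kv_heads: int = KV_HEADS,
                       head_dim: int = HEAD_DIM,
                       bytes_per_value: int = BYTES_PER_VALUE) -> int:
    """Bytes of cache per token: one key and one value vector per KV head per layer."""
    return 2 * layers * kv_heads * head_dim * bytes_per_value


def kv_cache_gib(tokens: int) -> float:
    """Total cache size in GiB for a given number of cached tokens."""
    return tokens * kv_bytes_per_token() / 2**30


if __name__ == "__main__":
    print(f"per token: {kv_bytes_per_token() // 1024} KiB")
    for tokens in (8 * 1024, 32 * 1024, 256 * 1024):
        print(f"{tokens:>7} tokens: {kv_cache_gib(tokens):.1f} GiB")
```

Under these assumptions the cache costs 128 KiB per token, so a full 256K-token context would add about 32 GiB on top of the weights. Shorter contexts scale down linearly.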

Quantizations & VRAM

Q4_K_M (4.5 bpw): 20.2 GB VRAM required, 94% quality
Q6_K (6.5 bpw): 28.9 GB VRAM required, 97% quality
Q8_0 (8 bpw): 35.5 GB VRAM required, 100% quality
FP16 (16 bpw): 70.5 GB VRAM required, 100% quality
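
The VRAM figures above track a simple rule of thumb: weight memory is roughly parameters × bits per weight ÷ 8. The sketch below applies that rule to the 35B total parameter count and compares the result with the listed figures. It is an approximation, not the method the page uses. The function name is illustrative, and the difference is reported as an unexplained gap, since the card does not say what overhead it includes.

```python
# Rough weight-memory estimate: parameters * bits-per-weight / 8.
PARAMS_BILLIONS = 35.0  # total parameters, from the card

# Bits per weight and listed VRAM (GB), both copied from the card.
BPW = {"Q4_K_M": 4.5, "Q6_K": 6.5, "Q8_0": 8.0, "FP16": 16.0}
LISTED_GB = {"Q4_K_M": 20.2, "Q6_K": 28.9, "Q8_0": 35.5, "FP16": 70.5}


def weight_gb(params_billions: float, bpw: float) -> float:
    """Weights only, in decimal GB: (params * 1e9 * bpw / 8 bytes) / 1e9."""
    return params_billions * bpw / 8.0


if __name__ == "__main__":
    for name, bpw in BPW.items():
        est = weight_gb(PARAMS_BILLIONS, bpw)
        gap = LISTED_GB[name] - est
        print(f"{name}: weights ~{est:.1f} GB, "
              f"listed {LISTED_GB[name]} GB, gap ~{gap:.1f} GB")
```

For all four quantizations the listed figure sits about 0.5 GB above the weights-only estimate, which suggests the page adds a small fixed allowance rather than a context-dependent one.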

Benchmarks (14)

Arena Elo: 1485
IFEval: 91.9
MMBench: 91.5
MMLU-PRO: 85.3
GPQA Diamond: 81.9
MMMU: 75.1
MATH: 59.7
BBH: 58.3
BigCodeBench: 32.3
AA Intelligence: 30.7
MUSR: 19.1
AA Coding: 16.8
GPQA: 15.2
HLE: 12.8

Run with Ollama

$ ollama run qwen3.5:35b-a3b
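
Once the model is pulled, it can also be queried from a script through Ollama's local HTTP API instead of the interactive prompt. The sketch below assumes an Ollama server is running on its default address (`http://localhost:11434`) and that the model is available under the same tag as in the command above. The helper names `build_request` and `generate` are illustrative, not part of any library.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # assumption: default local server
MODEL_TAG = "qwen3.5:35b-a3b"                       # tag from the command above


def build_request(prompt: str, model: str = MODEL_TAG) -> urllib.request.Request:
    """Build a non-streaming POST request for Ollama's /api/generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str, timeout: float = 300.0) -> str:
    """Send the prompt and return the generated text from the JSON reply."""
    with urllib.request.urlopen(build_request(prompt), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))["response"]


if __name__ == "__main__":
    try:
        print(generate("Explain mixture-of-experts routing in two sentences."))
    except OSError as exc:  # covers connection refused and timeouts
        print(f"Could not reach Ollama at {OLLAMA_URL}: {exc}")
```

Setting `"stream": False` returns one JSON object rather than a stream of chunks, which keeps the example short. A long-running client would more likely stream.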

GPUs that can run this model

At Q4_K_M quantization. Sorted by minimum VRAM.
