Alibaba/Mixture of Experts

Qwen3-Coder-480B-A35B

chat, tool_use, coding

Parameters: 480.2B (35B active)
Context length: 256K
Benchmarks: 1
Quantizations: 17
HF downloads: 87K
Architecture: MoE
Released: 2025-11-01
Layers: 62
KV Heads: 8
Head Dim: 128
Family: qwen
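The layer count, KV head count, and head dimension listed above are enough for a rough estimate of KV cache size, which comes on top of the weights. The sketch below assumes a standard attention layout where every layer caches one key and one value tensor, 16-bit cache entries, and that "256K" means 262,144 tokens; none of these assumptions is stated on this page, and the function name is illustrative.

```python
def kv_cache_bytes(layers, kv_heads, head_dim, context_tokens, bytes_per_value=2):
    """Rough KV cache size in bytes for one sequence."""
    # Factor 2: each layer stores both a key and a value tensor.
    return 2 * layers * kv_heads * head_dim * context_tokens * bytes_per_value


# Values from the spec listing above.
LAYERS, KV_HEADS, HEAD_DIM = 62, 8, 128
# Assumption: "256K" context means 256 * 1024 tokens.
CONTEXT_TOKENS = 256 * 1024

per_token = kv_cache_bytes(LAYERS, KV_HEADS, HEAD_DIM, 1)
full_context = kv_cache_bytes(LAYERS, KV_HEADS, HEAD_DIM, CONTEXT_TOKENS)

print(f"KV cache per token: {per_token / 1024:.0f} KiB")
print(f"KV cache at full context: {full_context / 1e9:.1f} GB")
```

Under these assumptions a full-length context needs tens of gigabytes beyond the weights, so a quantization that barely fits may leave too little room for long prompts.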

Quantization Options

Quant	Bits	VRAM	Quality
IQ2_XXS	2.38	143.3 GB	low
IQ2_M	2.93	176.4 GB	low
Q2_K	3.16	190.2 GB	low
IQ3_XXS	3.25	195.6 GB	low
IQ3_XS	3.5	210.6 GB	low
Q3_K_S	3.64	219.0 GB	low
IQ3_M	3.76	226.2 GB	low
Q3_K_M	4	240.6 GB	low
Q3_K_L	4.3	258.6 GB	moderate
IQ4_XS	4.46	268.2 GB	moderate
Q4_K_S	4.67	280.8 GB	moderate
Q4_K_M	4.89	294.0 GB	good
Q5_K_S	5.57	334.8 GB	good
Q5_K_M	5.7	342.6 GB	good
Q6_K	6.56	394.3 GB	excellent
Q8_0	8.5	510.7 GB	lossless
FP16	16	960.9 GB	lossless
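The VRAM column closely follows a weights-only estimate: parameter count times bits per weight, divided by 8. The sketch below checks a few rows against that formula. The small remaining gap (roughly half a gigabyte) looks like a fixed overhead, but that is an inference from the numbers, not something this page states.

```python
PARAMS = 480.2e9  # total parameter count from the listing


def weight_gb(bits_per_weight, params=PARAMS):
    """Weights-only size in decimal gigabytes."""
    return params * bits_per_weight / 8 / 1e9


# (bits per weight, listed VRAM in GB) for a few rows of the table above.
listed = {
    "IQ2_XXS": (2.38, 143.3),
    "Q4_K_M": (4.89, 294.0),
    "Q8_0": (8.5, 510.7),
    "FP16": (16, 960.9),
}

for name, (bpw, vram) in listed.items():
    est = weight_gb(bpw)
    print(f"{name}: estimated {est:.1f} GB, listed {vram} GB, gap {vram - est:.2f} GB")
```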

Benchmarks (1)

SWE-bench: 69.6

Run this model

Easiest way to get started

curl -fsSL https://ollama.com/install.sh | sh

ollama run qwen3:480.2b-instruct-q4_k_m

Downloads and runs automatically. Add --verbose for speed stats.

GPUs that can run this model

At Q4_K_M quantization. Sorted by minimum VRAM.

Apple M4 Ultra (384GB)

384 GB VRAM • 1092 GB/s

Apple M5 Ultra (384GB)

384 GB VRAM • 1228 GB/s

Qwen3-Coder-480B-A35B — 480.2B Parameter Mixture of Experts LLM

Model Specifications

Parameters: 480.2B (35B active)
Architecture: Mixture of Experts
Context Length: 256K tokens
Capabilities: chat, tool_use, coding
Release Date: 2025-11-01
Provider: Alibaba
Family: qwen

VRAM Requirements

Quantization	BPW	VRAM	Quality
IQ2_XXS	2.38	143.3 GB	65%
IQ2_M	2.93	176.4 GB	75%
Q2_K	3.16	190.2 GB	78%
IQ3_XXS	3.25	195.6 GB	82%
IQ3_XS	3.5	210.6 GB	84%
Q3_K_S	3.64	219.0 GB	85%
IQ3_M	3.76	226.2 GB	86%
Q3_K_M	4	240.6 GB	88%
Q3_K_L	4.3	258.6 GB	90%
IQ4_XS	4.46	268.2 GB	92%
Q4_K_S	4.67	280.8 GB	93%
Q4_K_M	4.89	294.0 GB	94%
Q5_K_S	5.57	334.8 GB	96%
Q5_K_M	5.7	342.6 GB	96%
Q6_K	6.56	394.3 GB	97%
Q8_0	8.5	510.7 GB	100%
FP16	16	960.9 GB	100%
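Given a memory budget, the table above can be used to pick the highest-precision quantization that fits. The following is a minimal sketch: `best_fit` and `reserve_gb` are hypothetical names, and the reserve is an illustrative headroom for KV cache and runtime buffers, which the table's VRAM figures may not include.

```python
# (name, bits per weight, listed VRAM in GB), copied from the table above.
QUANTS = [
    ("IQ2_XXS", 2.38, 143.3), ("IQ2_M", 2.93, 176.4), ("Q2_K", 3.16, 190.2),
    ("IQ3_XXS", 3.25, 195.6), ("IQ3_XS", 3.5, 210.6), ("Q3_K_S", 3.64, 219.0),
    ("IQ3_M", 3.76, 226.2), ("Q3_K_M", 4.0, 240.6), ("Q3_K_L", 4.3, 258.6),
    ("IQ4_XS", 4.46, 268.2), ("Q4_K_S", 4.67, 280.8), ("Q4_K_M", 4.89, 294.0),
    ("Q5_K_S", 5.57, 334.8), ("Q5_K_M", 5.7, 342.6), ("Q6_K", 6.56, 394.3),
    ("Q8_0", 8.5, 510.7), ("FP16", 16.0, 960.9),
]


def best_fit(budget_gb, reserve_gb=0.0):
    """Return the highest-BPW quantization whose VRAM plus reserve fits, or None."""
    fitting = [q for q in QUANTS if q[2] + reserve_gb <= budget_gb]
    return max(fitting, key=lambda q: q[1]) if fitting else None


print(best_fit(384))                 # 384 GB budget, no headroom
print(best_fit(384, reserve_gb=60))  # same budget with 60 GB held back
print(best_fit(100))                 # nothing in the table fits
```

Holding back headroom can change the answer by a full quantization tier, which is why a budget that exactly matches a table row is rarely enough in practice.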

Benchmark Scores

SWE-bench: 69.6

How to Run Qwen3-Coder-480B-A35B

Run Qwen3-Coder-480B-A35B locally with Ollama (needs 294.0 GB VRAM at Q4_K_M):

ollama run qwen3:480.2b

Compatible GPUs (2)

GPUs that can run Qwen3-Coder-480B-A35B at Q4_K_M quantization:

Apple M4 Ultra (384GB): 384 GB, 1092 GB/s
Apple M5 Ultra (384GB): 384 GB, 1228 GB/s