
Qwen3-Next 80B A3B

Author: Alibaba

Over the past few months, we have observed increasingly clear trends toward scaling both total parameters and context lengths in the pursuit of more powerful and agentic artificial intelligence (AI).

Tags: chat, coding, reasoning, multilingual, tool_use

Parameters: 80B (3B active)

Context length: 256K


Architecture

MoE

Released

2025-09-12

Layers

KV Heads

Head Dim

256

Family

qwen

Quantization Options

Quant	Bits	VRAM	Quality
IQ2_M	2.93	29.8 GB	low
Q2_K	3.16	32.1 GB	low
IQ3_XXS	3.25	33.0 GB	low
IQ3_XS	3.5	35.5 GB	low
Q3_K_S	3.64	36.9 GB	low
IQ3_M	3.76	38.1 GB	low
Q3_K_M	4	40.5 GB	low
Q3_K_L	4.3	43.5 GB	moderate
IQ4_XS	4.46	45.1 GB	moderate
Q4_K_S	4.67	47.2 GB	moderate
Q4_K_M	4.89	49.4 GB	good
Q5_K_S	5.57	56.2 GB	good
Q5_K_M	5.7	57.5 GB	good
Q6_K	6.56	66.1 GB	excellent
Q8_0	8.5	85.5 GB	lossless
FP16	16	160.5 GB	lossless
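
The VRAM column above is consistent with a simple rule of thumb: weight storage for 80B parameters at the listed bits per weight, plus roughly 0.5 GB of fixed overhead. The sketch below reproduces the table under that assumption. The 0.5 GB constant is inferred from the table's numbers rather than documented anywhere, and the function names are illustrative. The estimate does not vary with context length, so real usage at long contexts is likely higher.

```python
TOTAL_PARAMS_B = 80.0  # total parameters in billions (from the model card)
OVERHEAD_GB = 0.5      # assumption: constant inferred from the table, not documented

# (quant name, bits per weight) pairs copied from the table above
QUANTS = [
    ("IQ2_M", 2.93), ("Q2_K", 3.16), ("IQ3_XXS", 3.25), ("IQ3_XS", 3.5),
    ("Q3_K_S", 3.64), ("IQ3_M", 3.76), ("Q3_K_M", 4.0), ("Q3_K_L", 4.3),
    ("IQ4_XS", 4.46), ("Q4_K_S", 4.67), ("Q4_K_M", 4.89), ("Q5_K_S", 5.57),
    ("Q5_K_M", 5.7), ("Q6_K", 6.56), ("Q8_0", 8.5), ("FP16", 16.0),
]


def estimate_vram_gb(bits_per_weight, params_b=TOTAL_PARAMS_B, overhead_gb=OVERHEAD_GB):
    """Weights in GB (billions of params * bits / 8) plus a fixed overhead."""
    return params_b * bits_per_weight / 8 + overhead_gb


def largest_fitting_quant(vram_gb):
    """Return the highest-bit quant whose estimate fits in vram_gb, or None."""
    fitting = [(name, bits) for name, bits in QUANTS if estimate_vram_gb(bits) <= vram_gb]
    return max(fitting, key=lambda q: q[1])[0] if fitting else None


print(round(estimate_vram_gb(4.89), 1))  # 49.4, matching the Q4_K_M row
print(largest_fitting_quant(48))         # Q4_K_S
```

Under this estimate, no listed quantization fits on a single 24 GB card, since even IQ2_M needs about 29.8 GB.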


Benchmarks (20)

Benchmark	Score
IFEval	85.9
AIME	84.3
AA Math	84.3
LiveCodeBench	78.4
GPQA Diamond	75.9
MATH-500	66.3
IFBench	60.7
BBH	60.5
AA Long Context	60.3
MATH	60.1
MMLU-PRO	50.4
τ²-Bench	41.5
SciCode	38.8
BigCodeBench	33.2
AA Intelligence	26.7
AA Coding	19.5
GPQA	19.4
MUSR	12.3
HLE	11.7
Terminal-Bench	9.8

Run this model

Easiest way to get started (beginners)

curl -fsSL https://ollama.com/install.sh | sh

ollama run qwen:80b-q4_K_M

The tag may need adjustment; check ollama.com/library/qwen for available tags.
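
Once the model is running under Ollama, it can also be queried programmatically through Ollama's local HTTP API, which listens on port 11434 by default. The following is a minimal sketch using only the Python standard library. The model tag is copied from the command above and may need the same adjustment, and the helper names `build_request` and `generate` are illustrative, not part of any library.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint
MODEL_TAG = "qwen:80b-q4_K_M"  # assumption: same tag as above; may need adjustment


def build_request(prompt, model=MODEL_TAG):
    """Build a non-streaming POST request for Ollama's generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt, model=MODEL_TAG, timeout=600):
    """Send the prompt and return the generated text from the JSON reply."""
    with urllib.request.urlopen(build_request(prompt, model), timeout=timeout) as resp:
        return json.loads(resp.read())["response"]


# Example (requires a running Ollama server with the model already pulled):
# print(generate("Summarize mixture-of-experts routing in one sentence."))
```

Setting `stream` to false returns a single JSON object instead of a stream of partial chunks, which keeps the client code short at the cost of waiting for the full completion.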

Setup guide

Auto-setup with fitmyllm CLI

Detects your GPU, recommends the best model, downloads it, and starts a chat session with zero configuration. It also benchmarks your speed and contributes anonymous data to improve predictions.

pip install fitmyllm

fitmyllm

Features: GPU auto-detection, live tok/s in chat, speed benchmarks, 9 inference engines.
