# Qwen3-8B — 8.2B Parameter Dense LLM

## Model Specifications
- **Parameters:** 8.2B
- **Architecture:** Dense Transformer
- **Context Length:** 40K tokens
- **Capabilities:** chat
- **Release Date:** 2025-04-28
- **Provider:** Alibaba
- **Family:** qwen
## VRAM Requirements
| Quantization | Bits per Weight (BPW) | VRAM | Quality |
|---|---|---|---|
| Q4_K_M | 4.89 | 5.5 GB | 94% |
| Q5_K_S | 5.57 | 6.2 GB | 96% |
| Q5_K_M | 5.7 | 6.3 GB | 96% |
| Q6_K | 6.56 | 7.2 GB | 97% |
| Q8_0 | 8.5 | 9.2 GB | 100% |
| FP16 | 16 | 16.9 GB | 100% |
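The VRAM column follows directly from the parameter count and the bits-per-weight of each quantization: weight size is parameters × BPW / 8, plus a small fixed margin for activations and the KV cache. A minimal sketch (the ~0.5 GB overhead term is an assumption fitted to this table, not an official figure):

```python
def estimate_vram_gb(params_b: float, bpw: float, overhead_gb: float = 0.5) -> float:
    """Estimate VRAM (GB) to load a model: weights (params × BPW / 8) plus a
    fixed overhead for activations/KV cache. params_b is in billions."""
    return round(params_b * bpw / 8 + overhead_gb, 1)

# Reproduce the Q4_K_M row for the 8.2B model:
print(estimate_vram_gb(8.2, 4.89))  # → 5.5
```

The same formula reproduces the other rows (e.g. Q8_0: 8.2 × 8.5 / 8 + 0.5 ≈ 9.2 GB; FP16: 8.2 × 16 / 8 + 0.5 ≈ 16.9 GB), so it is a reasonable first check for GPUs not listed below.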
## Benchmark Scores
| Benchmark | Score |
|---|---|
| HumanEval | 85.0 |
| MMLU-Pro | 55.0 |
| MATH | 89.0 |
| IFEval | 78.0 |
| BBH | 36.6 |
| GPQA | 45.0 |
| MuSR | 15.5 |
| Arena Elo | 1462.0 |
| GPQA Diamond | 61.2 |
| LiveCodeBench | 51.3 |
| AIME | 63.7 |
| MATH-500 | 93.2 |
| HLE | 5.6 |
| AA Intelligence | 16.4 |
| AA Coding | 7.8 |
| AA Math | 63.7 |
## How to Run Qwen3-8B
Run Qwen3-8B locally with Ollama (needs 5.5 GB of VRAM at Q4_K_M):

```shell
ollama run qwen3:8b
```

## Compatible GPUs (30)
GPUs that can run Qwen3-8B at Q4_K_M quantization:
| GPU | VRAM | Memory Bandwidth |
|---|---|---|
| NVIDIA RTX 3050 6GB | 6 GB | 168 GB/s |
| Intel Arc A380 | 6 GB | 186 GB/s |
| NVIDIA RTX 2060 6GB | 6 GB | 336 GB/s |
| NVIDIA GTX 1660 SUPER | 6 GB | 336 GB/s |
| NVIDIA GTX 1660 Ti | 6 GB | 288 GB/s |
| NVIDIA GTX 1060 6GB | 6 GB | 192 GB/s |
| NVIDIA Tesla C2070 | 6 GB | 143 GB/s |
| NVIDIA Tesla C2075 | 6 GB | 150 GB/s |
| NVIDIA Tesla C2090 | 6 GB | 177 GB/s |
| NVIDIA Tesla M2070 | 6 GB | 150 GB/s |
| NVIDIA Tesla M2070-Q | 6 GB | 150 GB/s |
| NVIDIA Tesla M2075 | 6 GB | 150 GB/s |
| NVIDIA Tesla M2090 | 6 GB | 177 GB/s |
| NVIDIA Tesla X2070 | 6 GB | 177 GB/s |
| NVIDIA Tesla X2090 | 6 GB | 177 GB/s |
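Once the model is pulled, Ollama also exposes a local HTTP API (by default on `localhost:11434`), so you can query Qwen3-8B programmatically. A minimal stdlib-only sketch, assuming the server is running and `qwen3:8b` has been pulled:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint


def build_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a non-streaming /api/generate request for the local Ollama server."""
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()
    return urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )


def generate(model: str, prompt: str) -> str:
    # Requires `ollama serve` to be running with the model already pulled.
    with urllib.request.urlopen(build_request(model, prompt)) as resp:
        return json.loads(resp.read())["response"]


if __name__ == "__main__":
    print(generate("qwen3:8b", "Why is the sky blue?"))
```

The `stream: False` flag makes the server return one complete JSON object instead of a stream of partial responses, which keeps the client trivial.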