Qwen 3.5 4B — 4B Parameter Dense LLM
Model Specifications
- Parameters: 4B
- Architecture: Dense Transformer
- Context Length: 256K tokens
- Capabilities: chat, coding, reasoning, multilingual, vision, math
- Release Date: 2026-03-01
- Provider: Alibaba
- Family: qwen
VRAM Requirements
| Quantization | BPW | VRAM | Quality |
|---|---|---|---|
| Q4_K_M | 4.89 | 2.9 GB | 94% |
| Q5_K_S | 5.57 | 3.3 GB | 96% |
| Q5_K_M | 5.7 | 3.3 GB | 96% |
| Q6_K | 6.56 | 3.8 GB | 97% |
| Q8_0 | 8.5 | 4.7 GB | 100% |
| FP16 | 16 | 8.5 GB | 100% |
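The VRAM column appears to follow a simple pattern: the size of the quantized weights (parameters × bits per weight ÷ 8) plus a roughly constant allowance for runtime buffers. The sketch below reproduces that arithmetic. It assumes exactly 4.0 billion parameters, 1 GB = 10^9 bytes, and a flat 0.5 GB allowance; none of these are stated on this page, and the function names are illustrative only.

```python
# Rough VRAM estimate: quantized weight size plus a fixed runtime allowance.
# Assumptions (not from the source page): exactly 4.0e9 parameters,
# 1 GB = 1e9 bytes, and a flat 0.5 GB allowance for KV cache and buffers.

PARAMS_BILLION = 4.0
OVERHEAD_GB = 0.5  # assumed value; chosen because it roughly matches the table

# Bits per weight, copied from the table above.
BPW = {
    "Q4_K_M": 4.89,
    "Q5_K_S": 5.57,
    "Q5_K_M": 5.7,
    "Q6_K": 6.56,
    "Q8_0": 8.5,
    "FP16": 16,
}


def weight_gb(params_billion: float, bpw: float) -> float:
    """Size of the weights alone, in GB: parameters * bits / 8."""
    return params_billion * bpw / 8


def estimate_vram_gb(params_billion: float, bpw: float,
                     overhead_gb: float = OVERHEAD_GB) -> float:
    """Weights plus the assumed fixed allowance, in GB."""
    return weight_gb(params_billion, bpw) + overhead_gb


if __name__ == "__main__":
    for name, bpw in BPW.items():
        w = weight_gb(PARAMS_BILLION, bpw)
        v = estimate_vram_gb(PARAMS_BILLION, bpw)
        print(f"{name}: weights {w:.2f} GB, estimated VRAM {v:.2f} GB")
```

Under these assumptions the estimate lands within about 0.1 GB of every row in the table. Real usage grows with context length, since the KV cache scales with the number of tokens held in memory, so treat the table as a baseline at short contexts.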
Benchmark Scores
| Benchmark | Score |
|---|---|
| MMLU-PRO | 79.1 |
| IFEval | 89.8 |
| MMMU | 77.6 |
| MMBench | 89.4 |
| GPQA Diamond | 76.2 |
| LiveCodeBench | 55.8 |
How to Run Qwen 3.5 4B
Run Qwen 3.5 4B locally with Ollama (needs 2.9 GB VRAM at Q4_K_M):
ollama run qwen3.5:4b
Compatible GPUs (30)
GPUs that can run Qwen 3.5 4B at Q4_K_M quantization:
- NVIDIA Tesla C2050 (3 GB, 144 GB/s)
- NVIDIA Tesla M2050 (3 GB, 148 GB/s)
- NVIDIA Tesla S2050 (3 GB, 148 GB/s)
- NVIDIA GeForce GTX 670MX (3 GB, 67 GB/s)
- AMD Radeon HD 7950 (3 GB, 240 GB/s)
- AMD Radeon HD 7950 Boost (3 GB, 240 GB/s)
- AMD Radeon HD 7950 Monica BIOS 1 (3 GB, 240 GB/s)
- AMD Radeon HD 7950 Monica BIOS 2 (3 GB, 240 GB/s)
- AMD Radeon HD 7970 (3 GB, 264 GB/s)
- AMD Radeon HD 7970 GHz Edition (3 GB, 288 GB/s)
- AMD Radeon HD 7970 X2 (3 GB, 264 GB/s)
- NVIDIA GeForce GTX 770M (3 GB, 96 GB/s)
- NVIDIA GeForce GTX 780 (3 GB, 288 GB/s)
- NVIDIA GeForce GTX 780 Rev. 2 (3 GB, 288 GB/s)
- NVIDIA GeForce GTX 780 Ti (3 GB, 337 GB/s)
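The two numbers listed per GPU (memory size and memory bandwidth) can be turned into a quick comparison. The sketch below checks whether a card meets the 2.9 GB Q4_K_M requirement and computes a rough ceiling on decode speed. It uses the common rule of thumb that generating one token reads the whole model from memory once, so tokens per second cannot exceed bandwidth divided by model size. That rule, the use of 2.9 GB as the model size, and the helper names are assumptions rather than anything stated on this page. This is a memory-only check and says nothing about driver or runtime support.

```python
# Memory-fit check and a bandwidth-based ceiling on decode speed.
# Assumptions (not from the source page): decoding is memory-bound, and the
# 2.9 GB Q4_K_M figure is used as the amount of data read per token.

REQUIRED_GB = 2.9  # Q4_K_M requirement stated on this page

# (name, VRAM in GB, memory bandwidth in GB/s), copied from the list above.
GPUS = [
    ("NVIDIA Tesla C2050", 3, 144),
    ("NVIDIA GeForce GTX 670MX", 3, 67),
    ("AMD Radeon HD 7970", 3, 264),
    ("NVIDIA GeForce GTX 780 Ti", 3, 337),
]


def fits(vram_gb: float, required_gb: float = REQUIRED_GB) -> bool:
    """True if the card has at least the required memory."""
    return vram_gb >= required_gb


def decode_ceiling_tok_s(bandwidth_gb_s: float,
                         model_gb: float = REQUIRED_GB) -> float:
    """Upper bound on tokens/s if every token reads the full model once."""
    return bandwidth_gb_s / model_gb


if __name__ == "__main__":
    for name, vram, bw in GPUS:
        status = "fits" if fits(vram) else "does not fit"
        print(f"{name}: {status}, ceiling ~{decode_ceiling_tok_s(bw):.0f} tok/s")
```

Measured throughput is normally well below this ceiling, so the figure is more useful for ranking cards against each other than for predicting actual speed.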