
Alibaba Qwen3.5-4B

chat · Thinking · Tool Use
Parameters: 4.7B
Context length: 256K
Benchmarks: 15
Quantizations: 6
HF downloads: 1.8M
Architecture: Dense
Released: 2025-06-01
Layers: 40
KV Heads: 8
Head Dim: 128
Family: qwen
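
The layer count, KV head count, and head dimension above are enough for a rough KV-cache estimate. The sketch below assumes every layer uses standard attention with a full key/value cache stored at 2 bytes per value (FP16), and that 256K means 262,144 tokens; neither assumption is confirmed by this page, and the function name is illustrative.

```python
def kv_cache_bytes(tokens, layers=40, kv_heads=8, head_dim=128, bytes_per_value=2):
    """Bytes needed to cache keys and values for `tokens` positions.

    Assumes a full KV cache in every layer (an assumption, not confirmed here).
    """
    # Factor 2: one key vector and one value vector per KV head per layer.
    per_token = 2 * layers * kv_heads * head_dim * bytes_per_value
    return per_token * tokens


GIB = 1024 ** 3
for tokens in (8_192, 32_768, 262_144):
    print(f"{tokens:>7} tokens: {kv_cache_bytes(tokens) / GIB:.2f} GiB")
```

Under these assumptions the cache grows linearly with context, so long contexts can need far more memory than the weights themselves.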

Quantization Options

Quant    Bits   VRAM     Quality
Q4_K_M   4.89   3.4 GB   good
Q5_K_S   5.57   3.8 GB   good
Q5_K_M   5.7    3.8 GB   good
Q6_K     6.56   4.3 GB   excellent
Q8_0     8.5    5.5 GB   lossless
FP16     16     9.9 GB   lossless
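
As a sanity check on the table, weight memory can be approximated as parameters × bits per weight ÷ 8. The sketch below applies that to the 4.7B parameter count and compares the result with the listed VRAM. The gap of roughly 0.5 GB per row presumably covers runtime overhead, but how the page derives its figures is an assumption here.

```python
PARAMS = 4.7e9  # parameter count listed for this model

# (bits per weight, listed VRAM in GB), copied from the table above
QUANTS = {
    "Q4_K_M": (4.89, 3.4),
    "Q5_K_S": (5.57, 3.8),
    "Q5_K_M": (5.7, 3.8),
    "Q6_K": (6.56, 4.3),
    "Q8_0": (8.5, 5.5),
    "FP16": (16, 9.9),
}


def weight_gb(bits_per_weight, params=PARAMS):
    """Approximate size of the weights alone, in GB (1 GB = 1e9 bytes)."""
    return params * bits_per_weight / 8 / 1e9


for name, (bits, listed) in QUANTS.items():
    est = weight_gb(bits)
    print(f"{name:7} weights ~{est:.2f} GB, listed {listed} GB, gap {listed - est:.2f} GB")
```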


Benchmarks (15)

Benchmark         Score
τ²-Bench          92.1
GPQA Diamond      77.1
AA Long Context   55.7
IFBench           52.0
IFEval            31.6
AA Intelligence   27.1
Terminal-Bench    18.2
AA Coding         17.5
BBH               16.3
SciCode           16.1
MMLU-PRO          15.5
HLE               7.8
MUSR              7.4
MATH              2.8
GPQA              2.2

Run this model

Easiest way to get started · Beginners
$ curl -fsSL https://ollama.com/install.sh | sh
$ ollama run qwen3:4.7b-instruct-q4_K_M

Downloads and runs automatically. Add --verbose for speed stats.


Auto-setup with fitmyllm CLI

Detects your GPU, recommends the best model, downloads it, and starts chatting with zero config. It also benchmarks your speed and contributes anonymous data to improve predictions.

Run pip install fitmyllm, then run fitmyllm.
Auto-detect GPU · Live tok/s in chat · Speed benchmarks · 9 inference engines

GPUs that can run this model

At Q4_K_M quantization. Sorted by minimum VRAM.

GPU                         VRAM   Bandwidth   Vendor   Price
NVIDIA Tesla C1080          4 GB   102 GB/s    NVIDIA
NVIDIA Tesla K10            4 GB   160 GB/s    NVIDIA
NVIDIA Tesla M4             4 GB   88 GB/s     NVIDIA
AMD Radeon Instinct MI8     4 GB   512 GB/s    AMD
NVIDIA RTX A1000 Embedded   4 GB   224 GB/s    NVIDIA
NVIDIA RTX A2000 Embedded   4 GB   192 GB/s    NVIDIA
Intel Arc A310              4 GB   124 GB/s    INTEL    $79
Intel Arc A350              4 GB   124 GB/s    INTEL    $99
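
The bandwidth figures above matter because single-stream token generation is usually limited by how fast the weights can be read from memory. A common rule of thumb puts the ceiling at bandwidth ÷ weight size. The sketch below applies it to three of the listed GPUs at Q4_K_M; it is a rough upper bound under that assumption, not a measured speed, and it ignores compute limits, KV-cache reads, and driver support.

```python
# Q4_K_M weight size estimated as 4.7B params x 4.89 bits / 8, in GB (about 2.87 GB)
WEIGHTS_GB = 4.7e9 * 4.89 / 8 / 1e9

# Memory bandwidth in GB/s, copied from the list above
GPUS = {
    "Intel Arc A310": 124,
    "NVIDIA RTX A1000 Embedded": 224,
    "AMD Radeon Instinct MI8": 512,
}


def max_tokens_per_second(bandwidth_gb_s, weights_gb):
    """Bandwidth-bound ceiling: each generated token reads all weights once."""
    return bandwidth_gb_s / weights_gb


for name, bandwidth in GPUS.items():
    print(f"{name}: at most ~{max_tokens_per_second(bandwidth, WEIGHTS_GB):.0f} tok/s")
```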


▸ SPEC SHEET

Qwen3.5-4B · 4.7B Dense.

▸ SPECIFICATIONS
PARAMETERS: 4.7B
ARCHITECTURE: Dense Transformer
CONTEXT LENGTH: 256K tokens
CAPABILITIES: chat
RELEASE DATE: 2025-06-01
PROVIDER: Alibaba
FAMILY: qwen
▸ VRAM REQUIREMENTS
QUANT    BPW    VRAM     QUALITY
Q4_K_M   4.89   3.4 GB   94%
Q5_K_S   5.57   3.8 GB   96%
Q5_K_M   5.7    3.8 GB   96%
Q6_K     6.56   4.3 GB   97%
Q8_0     8.5    5.5 GB   100%
FP16     16     9.9 GB   100%
§ 01 BENCHMARK SCORES
MMLU-PRO          15.5
MATH              2.8
IFEval            31.6
BBH               16.3
GPQA              2.2
MUSR              7.4
GPQA Diamond      77.1
HLE               7.8
AA Intelligence   27.1
AA Coding         17.5
IFBench           52.0
Terminal-Bench    18.2
τ²-Bench          92.1
SciCode           16.1
AA Long Context   55.7
§ 02 RUN COMMAND

Run Qwen3.5-4B locally with Ollama (needs 3.4 GB VRAM at Q4_K_M):

$ ollama run qwen3:4.7b
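
Once the model is running, it can also be queried from a script through Ollama's local HTTP API. The sketch below assumes Ollama is serving on its default port 11434 and that the model tag from the command above has been pulled. The helper names are illustrative, and the tag is copied from this page rather than verified against the Ollama library.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint


def build_request(prompt, model="qwen3:4.7b"):
    """Build a non-streaming generate request for a local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt, model="qwen3:4.7b"):
    """Send the prompt and return the generated text (requires a running server)."""
    with urllib.request.urlopen(build_request(prompt, model)) as resp:
        return json.loads(resp.read())["response"]


# Example call, assuming the server is up:
# print(generate("Explain what a KV cache is in one sentence."))
```

Setting "stream" to False returns a single JSON object instead of a stream of chunks, which keeps the example short.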