▸ DEVICE UNDER TEST
AMD Radeon Pro Vega 16 — 4 GB VRAM.
▸ RADEON PRO VEGA 16 SPEC
- BRAND: AMD
- VRAM: 4 GB HBM2
- BANDWIDTH: 307 GB/s
- FP16 COMPUTE: 4.9 TFLOPS
- FP32 COMPUTE: 2.4 TFLOPS
- STREAM PROCESSORS: 1,024
- TDP: 75 W
- ARCHITECTURE: GCN 5.0
▸ AI CAPABILITY
74 / 331 models @ Q4
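With 4 GB of VRAM and 307 GB/s of bandwidth, this GPU handles models up to 4.7B parameters at Q4.
Speed ≈ bandwidth / model_size × efficiency, with bandwidth in GB/s and model_size taken as the quantized model's size in GB. For reference, a 7B model at Q4 would run at ~35 tok/s by this estimate, although 7B is above the 4.7B limit for this card.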
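
The speed formula above can be sketched in a few lines of Python. This is a minimal illustration, not the site's actual calculation: the function name `estimate_tokens_per_second` is hypothetical, and the efficiency value of 0.64 is an assumption (the source does not state one), picked because it roughly reproduces the 68 tok/s listed for the 2.9 GB models in the table.

```python
def estimate_tokens_per_second(bandwidth_gb_s: float,
                               model_size_gb: float,
                               efficiency: float) -> float:
    """Speed ≈ bandwidth / model_size × efficiency (units: GB/s, GB, unitless)."""
    if model_size_gb <= 0:
        raise ValueError("model_size_gb must be positive")
    return bandwidth_gb_s / model_size_gb * efficiency


BANDWIDTH_GB_S = 307.0     # from the Radeon Pro Vega 16 spec above
ASSUMED_EFFICIENCY = 0.64  # assumption: not given by the source

# A model occupying 2.9 GB at Q4 (the size listed for several 4B models).
speed = estimate_tokens_per_second(BANDWIDTH_GB_S, 2.9, ASSUMED_EFFICIENCY)
print(f"{speed:.0f} tok/s")  # prints "68 tok/s"
```

Keeping efficiency as an explicit parameter makes it easy to see how sensitive the estimate is to that one unknown.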
§ 01 TOP MODELS FOR RADEON PRO VEGA 16
74 FIT · SHOWING 20

| MODEL | SIZE | VRAM Q4 | TOK/S | AVG |
|---|---|---|---|---|
| Qwen3.5-4B | 4.7B | 3.4 GB | 58 | 29.3 |
| InternLM2 5B | 4.5B | 3.2 GB | 61 | 47.6 |
| Gemma 3 4B | 4.3B | 3.1 GB | 63 | 22.8 |
| TranslateGemma 4B | 4B | 2.9 GB | 68 | 5.5 |
| MedGemma 1.5 4B | 4B | 2.9 GB | 68 | 5.5 |
| Qwen 1.5 4B | 4B | 2.9 GB | 68 | 12.6 |
| Qwen3 4B | 4B | 2.9 GB | 68 | 40.7 |
| Qwen 3.5 4B | 4B | 2.9 GB | 68 | 47.4 |
| Nemotron 3 Nano 4B | 3.97B | 2.9 GB | 69 | 32.0 |
| Phi-3.5 Mini 3.8B | 3.82B | 2.8 GB | 71 | 46.6 |
| phi-3-mini-4k 3.8B | 3.8B | 2.8 GB | 72 | 30.5 |
| Phi-4-mini 3.8B | 3.8B | 2.8 GB | 72 | 49.0 |
| Qwen2.5-VL-3B | 3.8B | 2.8 GB | 72 | 29.9 |
| granite-4.0-h-micro 3.2B | 3.2B | 2.4 GB | 85 | 18.4 |
| Llama-3.2-3B | 3.2B | 2.4 GB | 85 | 17.9 |
| Falcon3-3B | 3.1B | 2.4 GB | 88 | 25.7 |
| Qwen 2.5 3B | 3.1B | 2.4 GB | 88 | 37.2 |
| SmolLM3-3B | 3.1B | 2.4 GB | 88 | 30.5 |
| Cogito 3B | 3B | 2.3 GB | 91 | 22.1 |
| Falcon-H1 3B | 3B | 2.3 GB | 91 | 49.5 |
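
As a consistency check, the table's TOK/S column can be compared against the formula Speed ≈ bandwidth / model_size × efficiency by solving for the efficiency each row implies. The sketch below assumes the VRAM Q4 figure stands in for model_size, which may not be exactly how the page computes it; the helper name `implied_efficiency` is hypothetical, and the rows are copied from the table above.

```python
BANDWIDTH_GB_S = 307.0  # from the Radeon Pro Vega 16 spec

# (model, VRAM at Q4 in GB, listed tok/s), copied from the table above
ROWS = [
    ("Qwen3.5-4B", 3.4, 58),
    ("Qwen3 4B", 2.9, 68),
    ("Llama-3.2-3B", 2.4, 85),
    ("Falcon-H1 3B", 2.3, 91),
]


def implied_efficiency(vram_gb: float, tok_s: float,
                       bandwidth_gb_s: float = BANDWIDTH_GB_S) -> float:
    """Rearranged formula: efficiency = tok/s × model_size / bandwidth."""
    return tok_s * vram_gb / bandwidth_gb_s


efficiencies = {name: implied_efficiency(vram, tok) for name, vram, tok in ROWS}
for name, eff in efficiencies.items():
    print(f"{name}: {eff:.2f}")
```

For these four rows the implied efficiency lands between roughly 0.64 and 0.68. The small spread is consistent with rounding in the VRAM column, or with the page using a slightly different size figure than the one assumed here.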