▸ DEVICE UNDER TEST
NVIDIA RTX A5000 Max-Q — 16 GB VRAM.
▸ RTX A5000 MAX-Q SPEC
- BRAND: NVIDIA
- VRAM: 16 GB GDDR6
- BANDWIDTH: 384 GB/s
- FP16 COMPUTE: 16.6 TFLOPS
- FP32 COMPUTE: 16.6 TFLOPS
- CUDA CORES: 6,144
- TENSOR CORES: 192
- TDP: 80 W
- ARCHITECTURE: Ampere
▸ AI CAPABILITY
202 / 331 models @ Q4
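With 16 GB of VRAM and 384 GB/s of memory bandwidth, this GPU fits models of up to 22.2B parameters at Q4 quantization (about 14.1 GB of VRAM).
Speed ≈ bandwidth / model_size × efficiency, where model_size is the quantized model's size in memory and efficiency is the fraction of peak bandwidth reached in practice. A 7B model at Q4 runs at ~44 tok/s.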
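The speed rule above can be sketched in a few lines. This is a minimal illustration, not the page's actual calculator: the function name `estimate_tok_per_s` is hypothetical, and the default `efficiency` of 0.5 is an assumption (the page does not state the value it uses). The bandwidth and the 14.1 GB Q4 footprint of Codestral 22B are taken from the figures on this page.

```python
def estimate_tok_per_s(bandwidth_gb_s: float,
                       model_size_gb: float,
                       efficiency: float = 0.5) -> float:
    """Rough decode speed: bandwidth / model_size * efficiency.

    efficiency=0.5 is an assumed value, not one published on the page.
    """
    if model_size_gb <= 0:
        raise ValueError("model_size_gb must be positive")
    return bandwidth_gb_s / model_size_gb * efficiency


BANDWIDTH_GB_S = 384.0  # RTX A5000 Max-Q memory bandwidth, from the spec list

# Codestral 22B occupies 14.1 GB at Q4 according to the table.
codestral = estimate_tok_per_s(BANDWIDTH_GB_S, 14.1)
print(f"Codestral 22B: ~{codestral:.1f} tok/s")  # prints ~13.6
```

With the assumed efficiency of 0.5 the estimate lands near the 14 tok/s listed for Codestral 22B; a slightly different efficiency would shift every estimate proportionally.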
§ 01 TOP MODELS FOR RTX A5000 MAX-Q
202 FIT · SHOWING 20

| MODEL | SIZE | VRAM Q4 | TOK/S | AVG |
|---|---|---|---|---|
| Codestral 22B | 22.2B | 14.1 GB | 14 | 50.1 |
| Devstral Small 22B | 22.2B | 14.1 GB | 14 | 35.5 |
| Mistral Small 22B | 22.2B | 14.1 GB | 14 | 35.2 |
| SOLAR-Pro 22B | 22.1B | 14.0 GB | 14 | 44.2 |
| ERNIE 4.5 21B A3B | 21B | 13.3 GB | 102 | — |
| GPT-OSS 20B | 21B | 13.3 GB | 85 | 52.9 |
| InternLM2 20B | 19.8B | 12.6 GB | 16 | 45.1 |
| InternLM2.5 20B | 19.8B | 12.6 GB | 16 | 50.9 |
| Ling-lite 16.8B | 16.8B | 10.8 GB | 128 | — |
| DeepSeek V2 Lite 16B | 16B | 10.3 GB | 128 | 38.0 |
| DeepSeek-Coder-V2-Lite 15.7B | 15.7B | 10.1 GB | 128 | 43.0 |
| DeepSeek-VL2 Small 16B | 15.7B | 10.1 GB | 128 | 43.1 |
| StarCoder 15B | 15.5B | 10.0 GB | 20 | 21.0 |
| StarCoder2 15B | 15B | 9.7 GB | 20 | 26.5 |
| DeepSeek R1 Distill Qwen 14B | 14.8B | 9.5 GB | 21 | 43.9 |
| DeepCoder 14B | 14.8B | 9.5 GB | 21 | 38.7 |
| Qwen2.5-Coder-14B | 14.8B | 9.5 GB | 21 | 41.3 |
| Qwen2.5-14B | 14.8B | 9.5 GB | 21 | 41.3 |
| Qwen3 14B | 14.8B | 9.5 GB | 21 | 45.7 |
| Ministral 3 14B | 14B | 9.0 GB | 22 | 25.9 |
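
As a cross-check on the table, the speed rule can be inverted to see what efficiency each row implies: efficiency ≈ tok/s × VRAM / bandwidth. The sketch below uses a handful of rows copied from the table; the helper name `implied_efficiency` is hypothetical. Rows that come out far above 1.0 cannot be explained by streaming the full model per token. A plausible reading, which is an assumption and not something the page states, is that those models are mixture-of-experts and their speed is computed from active parameters only.

```python
BANDWIDTH_GB_S = 384.0  # from the spec list

# (model, VRAM at Q4 in GB, tok/s), copied from the table above
ROWS = [
    ("Codestral 22B", 14.1, 14),
    ("InternLM2 20B", 12.6, 16),
    ("Qwen3 14B", 9.5, 21),
    ("ERNIE 4.5 21B A3B", 13.3, 102),
    ("DeepSeek V2 Lite 16B", 10.3, 128),
]


def implied_efficiency(vram_gb: float, tok_s: float,
                       bandwidth_gb_s: float = BANDWIDTH_GB_S) -> float:
    """Invert speed = bandwidth / size * efficiency to solve for efficiency."""
    return tok_s * vram_gb / bandwidth_gb_s


for name, vram_gb, tok_s in ROWS:
    eff = implied_efficiency(vram_gb, tok_s)
    # An efficiency above 1.0 means the full-model formula does not apply.
    note = "  <- above 1.0, full-model formula does not apply" if eff > 1.0 else ""
    print(f"{name:22s} {eff:5.2f}{note}")
```

The first three rows come out at roughly 0.51 to 0.53, which is consistent with a single efficiency factor of about one half; the last two come out above 3.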