▸ DEVICE UNDER TEST
NVIDIA Quadro CX — 2 GB VRAM.
▸ QUADRO CX SPEC
- BRAND: NVIDIA
- VRAM: 2 GB GDDR3
- BANDWIDTH: 76.8 GB/s
- FP16 COMPUTE: 0.5 TFLOPS
- FP32 COMPUTE: 0.5 TFLOPS
- CUDA CORES: 192
- TDP: 150 W
- ARCHITECTURE: Tesla 2.0
▸ AI CAPABILITY
26 / 331 models @ Q4
With 2 GB VRAM and 76.8 GB/s bandwidth, this GPU handles models up to 1.3B parameters.
Speed ≈ (bandwidth / model_size) × efficiency. By this estimate, a 7B model at Q4 would run at ~9 tok/s, but a model that size does not fit in 2 GB of VRAM.
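The speed estimate above can be sketched in a few lines of Python. The page does not state the values it uses for `model_size` or `efficiency`. The constants below (about 1 GB read per billion parameters, efficiency 0.8) are assumptions chosen because they reproduce the TOK/S figures listed on this page. Other pairs with the same product would match equally well, and the function name is hypothetical.

```python
BANDWIDTH_GB_S = 76.8  # memory bandwidth from the spec list

# Assumptions, not stated in the source: picked so the estimate
# matches the TOK/S values published on this page.
ASSUMED_GB_PER_BILLION_PARAMS = 1.0
ASSUMED_EFFICIENCY = 0.8


def estimate_tok_per_s(
    params_billion,
    bandwidth_gb_s=BANDWIDTH_GB_S,
    gb_per_billion=ASSUMED_GB_PER_BILLION_PARAMS,
    efficiency=ASSUMED_EFFICIENCY,
):
    """Speed ~ (bandwidth / model_size) * efficiency, in tokens per second."""
    model_size_gb = params_billion * gb_per_billion
    return bandwidth_gb_s / model_size_gb * efficiency


if __name__ == "__main__":
    for name, params in [
        ("DeepSeek Coder 1.3B", 1.3),
        ("Qwen3 0.6B", 0.6),
        ("generic 7B model", 7.0),
    ]:
        print(f"{name}: ~{estimate_tok_per_s(params):.0f} tok/s")
```

Under these assumptions the sketch gives about 47 tok/s at 1.3B, 102 tok/s at 0.6B, and 9 tok/s at 7B, in line with the figures quoted on this page. It models throughput only and says nothing about whether a model fits in VRAM.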
§ 01 TOP MODELS FOR QUADRO CX
26 FIT · SHOWING 20

| MODEL | SIZE | VRAM Q4 | TOK/S | AVG |
|---|---|---|---|---|
| DeepSeek Coder 1.3B | 1.3B | 1.3 GB | 47 | 16.8 |
| EXAONE-4.0-1.2B | 1.3B | 1.3 GB | 47 | 18.9 |
| OPT 1.3B | 1.3B | 1.3 GB | 47 | 5.3 |
| Phi-1 1.3B | 1.3B | 1.3 GB | 47 | 7.2 |
| Phi-1.5 1.3B | 1.3B | 1.3 GB | 47 | 7.2 |
| LFM2.5-1.2B-Thinking | 1.2B | 1.2 GB | 51 | 19.6 |
| Llama-3.2-1B | 1.2B | 1.2 GB | 51 | 10.1 |
| TinyLlama 1.1B | 1.1B | 1.2 GB | 56 | 13.6 |
| Falcon3-1B | 1B | 1.1 GB | 61 | 42.0 |
| InternLM2 1B | 1B | 1.1 GB | 61 | — |
| Qwen3.5-0.8B | 0.9B | 1.0 GB | 68 | 20.5 |
| Qwen 3.5 0.8B | 0.8B | 1.0 GB | 77 | 23.4 |
| GPT-2 Large 774M | 0.774B | 1.0 GB | 79 | 5.6 |
| Qwen3 0.6B | 0.6B | 0.9 GB | 102 | 19.1 |
| BGE-M3 | 0.568B | 0.8 GB | 108 | 63.0 |
| Falcon-H1 0.5B | 0.5B | 0.8 GB | 123 | 41.7 |
| Qwen 1.5 0.5B | 0.5B | 0.8 GB | 123 | 9.9 |
| Qwen 2.5 0.5B | 0.5B | 0.8 GB | 123 | 19.4 |
| SmolLM2 360M | 0.36B | 0.7 GB | 171 | 8.2 |
| GPT-2 Medium 345M | 0.345B | 0.7 GB | 178 | 5.9 |