NVIDIA-Nemotron-Nano-9B-v2 — 8.9B Parameter Dense LLM
Model Specifications
- Parameters: 8.9B
- Architecture: Dense Transformer
- Context Length: 128K tokens
- Capabilities: chat
- Release Date: 2025-08-12
- Provider: NVIDIA
- Family: nemotron
VRAM Requirements
| Quantization | BPW | VRAM | Quality |
|---|---|---|---|
| Q4_K_M | 4.89 | 5.9 GB | 94% |
| Q5_K_S | 5.57 | 6.7 GB | 96% |
| Q5_K_M | 5.7 | 6.8 GB | 96% |
| Q6_K | 6.56 | 7.8 GB | 97% |
| Q8_0 | 8.5 | 9.9 GB | 100% |
| FP16 | 16 | 18.3 GB | 100% |
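The VRAM column can be roughly reproduced from the BPW column with simple arithmetic. The sketch below assumes that the listed VRAM is the weight size (parameters × bits per weight ÷ 8) plus a fixed allowance of about 0.5 GB for runtime buffers. That allowance is an assumption chosen to fit the table, not a figure stated by the source, and the estimate ignores the KV cache, which grows with context length.

```python
# Rough VRAM estimate: weight bytes plus a fixed overhead.
# PARAMS_BILLIONS comes from the spec list; OVERHEAD_GB is an assumed
# allowance (not from the source) that happens to fit the table.
PARAMS_BILLIONS = 8.9
OVERHEAD_GB = 0.5

def estimate_vram_gb(bpw: float,
                     params_billions: float = PARAMS_BILLIONS,
                     overhead_gb: float = OVERHEAD_GB) -> float:
    """Return an approximate VRAM need in GB for a given bits-per-weight."""
    # billions of params * bits / 8 = billions of bytes = GB (decimal)
    weights_gb = params_billions * bpw / 8
    return weights_gb + overhead_gb

# (BPW, listed VRAM in GB) copied from the table above.
TABLE = {
    "Q4_K_M": (4.89, 5.9),
    "Q5_K_S": (5.57, 6.7),
    "Q5_K_M": (5.7, 6.8),
    "Q6_K": (6.56, 7.8),
    "Q8_0": (8.5, 9.9),
    "FP16": (16, 18.3),
}

for name, (bpw, listed_gb) in TABLE.items():
    est = estimate_vram_gb(bpw)
    print(f"{name:7s} estimated {est:5.2f} GB, listed {listed_gb} GB")
```

Under these assumptions every estimate lands within about 0.1 GB of the listed value, so the same function gives a quick ballpark for quantizations the table does not list.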
Benchmark Scores
- IFEval: 90.3
- MATH-500: 97.8
- GPQA Diamond: 64.5
- LiveCodeBench: 71.1
- AIME: 72.1
- HLE: 6.5
How to Run NVIDIA-Nemotron-Nano-9B-v2
Run NVIDIA-Nemotron-Nano-9B-v2 locally with Ollama (needs 5.9 GB VRAM at Q4_K_M):
ollama run nemotron:8b
Compatible GPUs (30)
GPUs that can run NVIDIA-Nemotron-Nano-9B-v2 at Q4_K_M quantization:
- NVIDIA RTX 3050 6GB (6 GB, 168 GB/s)
- Intel Arc A380 (6 GB, 186 GB/s)
- NVIDIA RTX 2060 6GB (6 GB, 336 GB/s)
- NVIDIA GTX 1660 SUPER (6 GB, 336 GB/s)
- NVIDIA GTX 1660 Ti (6 GB, 288 GB/s)
- NVIDIA GTX 1060 6GB (6 GB, 192 GB/s)
- NVIDIA Tesla C2070 (6 GB, 143 GB/s)
- NVIDIA Tesla C2075 (6 GB, 150 GB/s)
- NVIDIA Tesla C2090 (6 GB, 177 GB/s)
- NVIDIA Tesla M2070 (6 GB, 150 GB/s)
- NVIDIA Tesla M2070-Q (6 GB, 150 GB/s)
- NVIDIA Tesla M2075 (6 GB, 150 GB/s)
- NVIDIA Tesla M2090 (6 GB, 177 GB/s)
- NVIDIA Tesla X2070 (6 GB, 177 GB/s)
- NVIDIA Tesla X2090 (6 GB, 177 GB/s)
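The two figures listed per GPU (memory size and memory bandwidth) can be used for a quick compatibility check and a rough speed ceiling. The sketch below assumes a common rule of thumb: during single-stream decoding each generated token reads roughly the whole model from memory once, so tokens per second cannot exceed bandwidth divided by model size. The `Gpu` type and function names are hypothetical, and real throughput will typically be lower, especially on older cards.

```python
from typing import NamedTuple

class Gpu(NamedTuple):
    name: str
    vram_gb: float
    bandwidth_gb_s: float

# A few entries copied from the list above.
GPUS = [
    Gpu("NVIDIA RTX 3050 6GB", 6, 168),
    Gpu("Intel Arc A380", 6, 186),
    Gpu("NVIDIA RTX 2060 6GB", 6, 336),
]

MODEL_GB = 5.9  # listed VRAM requirement at Q4_K_M

def fits(gpu: Gpu, model_gb: float = MODEL_GB) -> bool:
    """True if the GPU has at least as much memory as the model needs."""
    return gpu.vram_gb >= model_gb

def decode_ceiling_tokens_per_s(gpu: Gpu, model_gb: float = MODEL_GB) -> float:
    """Bandwidth-bound upper limit on decode speed (assumed rule of thumb)."""
    return gpu.bandwidth_gb_s / model_gb

for gpu in GPUS:
    if fits(gpu):
        ceiling = decode_ceiling_tokens_per_s(gpu)
        print(f"{gpu.name}: fits, ceiling about {ceiling:.0f} tokens/s")
    else:
        print(f"{gpu.name}: does not fit")
```

With only 0.1 GB of headroom on a 6 GB card, long contexts may still run out of memory even when `fits` returns true.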