▸ SPEC SHEET
TinyLlama 1.1B — 1.1B Dense.
▸ SPECIFICATIONS
- PARAMETERS: 1.1B
- ARCHITECTURE: Dense Transformer
- CONTEXT LENGTH: 2K tokens
- CAPABILITIES: chat
- RELEASE DATE: 2024-01-08
- PROVIDER: TinyLlama
- FAMILY: llama
▸ VRAM REQUIREMENTS
| QUANT | BPW | VRAM | QUALITY |
|---|---|---|---|
| Q4_K_M | 4.89 | 1.2 GB | 94% |
| Q5_K_S | 5.57 | 1.3 GB | 96% |
| Q5_K_M | 5.7 | 1.3 GB | 96% |
| Q6_K | 6.56 | 1.4 GB | 97% |
| Q8_0 | 8.5 | 1.7 GB | 100% |
| FP16 | 16 | 2.7 GB | 100% |
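The VRAM column above can be roughly reproduced from the BPW column as weight size plus a fixed overhead. The sketch below is illustrative only: the 0.5 GB overhead is an assumption inferred from the table rows, not a figure stated by the source, and `estimate_vram_gb` is a hypothetical helper name. Real usage will also vary with context length and runtime.

```python
# Rough VRAM estimate: quantized weight size plus a fixed runtime overhead.
PARAMS = 1.1e9      # parameter count from the spec sheet
OVERHEAD_GB = 0.5   # ASSUMPTION: inferred from the table, not stated by the source

# Bits per weight (BPW) for each quantization, copied from the table above.
BPW = {
    "Q4_K_M": 4.89,
    "Q5_K_S": 5.57,
    "Q5_K_M": 5.7,
    "Q6_K": 6.56,
    "Q8_0": 8.5,
    "FP16": 16,
}


def estimate_vram_gb(bpw, params=PARAMS, overhead_gb=OVERHEAD_GB):
    """Return an estimated VRAM footprint in GB (1 GB = 1e9 bytes)."""
    weight_gb = params * bpw / 8 / 1e9  # bits -> bytes -> GB
    return round(weight_gb + overhead_gb, 1)


if __name__ == "__main__":
    for name, bpw in BPW.items():
        print(f"{name}: {estimate_vram_gb(bpw)} GB")
```

Under these assumptions the estimate matches each row of the table to one decimal place, which suggests the listed figures include roughly half a gigabyte for the KV cache and runtime buffers on top of the weights.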
§ 01 BENCHMARK SCORES
| BENCHMARK | SCORE |
|---|---|
| MMLU-PRO | 7.6 |
| MATH | 7.0 |
| IFEval | 57.0 |
| BBH | 8.7 |
| GPQA | 3.4 |
| MUSR | 3.0 |
| BigCodeBench | 8.2 |
§ 02 RUN COMMAND
Run TinyLlama 1.1B locally with Ollama — needs 1.2 GB VRAM at Q4_K_M:
$ ollama run tinyllama:1.1b
§ 03 COMPATIBLE GPUs
30 @ Q4_K_M
| GPU | VRAM | BANDWIDTH |
|---|---|---|
| NVIDIA GeForce GTX 470 | 1 GB | 133.9 GB/s |
| NVIDIA GeForce GTX 570 | 1 GB | 152 GB/s |
| NVIDIA GeForce GTX 570 Rev. 2 | 1 GB | 152 GB/s |
| NVIDIA GeForce GTX 460 v2 ES | 1 GB | 128.3 GB/s |
| NVIDIA GeForce GTX 560 OEM | 1 GB | 128.3 GB/s |
| NVIDIA GeForce GTX 560 Ti 448 | 1 GB | 152 GB/s |
| NVIDIA Quadro FX 5600 | 2 GB | 76.8 GB/s |
| NVIDIA Quadro FX 5600 Mac Edition | 2 GB | 76.8 GB/s |
| NVIDIA Tesla C870 | 2 GB | 76.8 GB/s |
| NVIDIA Tesla D870 | 2 GB | 76.8 GB/s |
| NVIDIA Tesla S870 | 2 GB | 76.8 GB/s |
| NVIDIA Quadro CX | 2 GB | 76.8 GB/s |
| NVIDIA Quadro FX 4800 | 2 GB | 76.8 GB/s |
| NVIDIA GeForce GT 230 OEM | 2 GB | 24 GB/s |
| NVIDIA GeForce GT 440 OEM | 2 GB | 43.2 GB/s |