▸ SPEC SHEET
granite-4.0-h-tiny 6.9B: a 6.9B-parameter Mixture of Experts (MoE) model with 1.5B active parameters.
▸ SPECIFICATIONS
- PARAMETERS: 6.9B (1.5B active)
- ARCHITECTURE: Mixture of Experts
- CONTEXT LENGTH: 128K tokens
- CAPABILITIES: chat
- RELEASE DATE: 2025-10-02
- PROVIDER: IBM
- FAMILY: granite
▸ VRAM REQUIREMENTS
| QUANT | BPW | VRAM | QUALITY |
|---|---|---|---|
| Q4_K_M | 4.89 | 4.7 GB | 94% |
| Q5_K_S | 5.57 | 5.3 GB | 96% |
| Q5_K_M | 5.7 | 5.4 GB | 96% |
| Q6_K | 6.56 | 6.1 GB | 97% |
| Q8_0 | 8.5 | 7.8 GB | 100% |
| FP16 | 16 | 14.3 GB | 100% |
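The VRAM column appears to follow from the parameter count and the bits per weight (BPW). As a rough sketch, not the method this page necessarily uses: weight size in GB is about `params_in_billions * BPW / 8`, plus some runtime overhead. The 0.5 GB overhead below is an assumption chosen because it lands within about 0.1 GB of every row in the table; the function name `estimate_vram_gb` is hypothetical.

```python
# Rough VRAM estimate from parameter count and bits per weight (BPW).
# ASSUMPTION: a fixed 0.5 GB overhead; this is a guess that happens to
# track the table above, not a documented formula.

def estimate_vram_gb(params_billion: float, bpw: float, overhead_gb: float = 0.5) -> float:
    """Return an approximate VRAM requirement in GB."""
    # 1e9 parameters * bpw bits / 8 bits per byte = bpw / 8 GB per billion parameters
    weights_gb = params_billion * bpw / 8
    return weights_gb + overhead_gb

# BPW values copied from the table above.
QUANTS = {
    "Q4_K_M": 4.89,
    "Q5_K_S": 5.57,
    "Q5_K_M": 5.7,
    "Q6_K": 6.56,
    "Q8_0": 8.5,
    "FP16": 16,
}

for name, bpw in QUANTS.items():
    print(f"{name}: {estimate_vram_gb(6.9, bpw):.1f} GB")
```

The estimate covers weights only. Long contexts add KV-cache memory on top, so treat these figures as a lower bound when using a large share of the 128K window.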
§ 01 BENCHMARK SCORES
HumanEval 83.0
MMLU-PRO 27.9
MATH 23.8
IFEval 81.4
BBH 66.3
GPQA 32.6
MUSR 16.8
MBPP 80.0
AlpacaEval 30.6
§ 02 RUN COMMAND
Run granite-4.0-h-tiny 6.9B locally with Ollama (needs 4.7 GB VRAM at Q4_K_M):
$ ollama run granite:6b
§ 03 COMPATIBLE GPUs
30 @ Q4_K_M
NVIDIA Tesla K20c
5 GB · 208 GB/s
NVIDIA Tesla K20m
5 GB · 208 GB/s
NVIDIA Tesla K20s
5 GB · 208 GB/s
NVIDIA GeForce GTX 1060 5 GB
5 GB · 160 GB/s
NVIDIA P102-100
5 GB · 440 GB/s
NVIDIA Quadro P2000
5 GB · 140.2 GB/s
NVIDIA Quadro P2200
5 GB · 200.2 GB/s
NVIDIA RTX 3050 6GB
6 GB · 168 GB/s
Intel Arc A380
6 GB · 186 GB/s
NVIDIA RTX 2060 6GB
6 GB · 336 GB/s
NVIDIA GTX 1660 SUPER
6 GB · 336 GB/s
NVIDIA GTX 1660 Ti
6 GB · 288 GB/s
NVIDIA GTX 1060 6GB
6 GB · 192 GB/s
NVIDIA Tesla C2070
6 GB · 143 GB/s
NVIDIA Tesla C2075
6 GB · 150 GB/s
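To check a card that is not listed, one option is to compare its VRAM against the VRAM REQUIREMENTS table. The sketch below assumes a simple "required VRAM <= card VRAM" rule, which may not match how this page decides compatibility; the helper name `quants_that_fit` is hypothetical, and the figures are copied from the table.

```python
# Which quantizations fit on a GPU with a given amount of VRAM?
# ASSUMPTION: a quant "fits" if its listed requirement is <= the card's VRAM.
# This ignores memory used by the display, other processes, and long contexts.

# VRAM requirements in GB, copied from the VRAM REQUIREMENTS table.
VRAM_GB = {
    "Q4_K_M": 4.7,
    "Q5_K_S": 5.3,
    "Q5_K_M": 5.4,
    "Q6_K": 6.1,
    "Q8_0": 7.8,
    "FP16": 14.3,
}

def quants_that_fit(gpu_vram_gb: float) -> list[str]:
    """Return the quant names whose listed requirement fits in gpu_vram_gb."""
    return [name for name, need in VRAM_GB.items() if need <= gpu_vram_gb]

print(quants_that_fit(5))  # a 5 GB card such as the Tesla K20c
print(quants_that_fit(6))  # a 6 GB card such as the RTX 2060 6GB
```

Under this rule the 5 GB cards are limited to Q4_K_M, while the 6 GB cards also have room for the Q5 variants but fall just short of Q6_K.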