BigScience/Dense

bloom 176.2B

chat

176.2B

Parameters

4K

Context length

1

Benchmarks

17

Quantizations

8K

HF downloads

Architecture

Dense

Released

2022-07-06

Layers

70

KV Heads

112

Head Dim

128

Family

bloom

Quantization Options

Quant	Bits	VRAM	Quality
IQ2_XXS	2.38	52.9 GB	low
IQ2_M	2.93	65.0 GB	low
Q2_K	3.16	70.1 GB	low
IQ3_XXS	3.25	72.1 GB	low
IQ3_XS	3.5	77.6 GB	low
Q3_K_S	3.64	80.7 GB	low
IQ3_M	3.76	83.3 GB	low
Q3_K_M	4	88.6 GB	low
Q3_K_L	4.3	95.2 GB	moderate
IQ4_XS	4.46	98.7 GB	moderate
Q4_K_S	4.67	103.3 GB	moderate
Q4_K_M	4.89	108.2 GB	good
Q5_K_S	5.57	123.2 GB	good
Q5_K_M	5.7	126.0 GB	good
Q6_K	6.56	145.0 GB	excellent
Q8_0	8.5	187.7 GB	lossless
FP16	16	352.9 GB	lossless
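The VRAM column tracks the bits-per-weight column closely. As a rough sketch, assuming VRAM ≈ parameters × bits / 8 plus a flat overhead of about 0.5 GB, the listed figures can be reproduced to within roughly 0.1 GB. The overhead value is inferred from the table rather than stated by the source, and `estimate_vram_gb` is a hypothetical helper name.

```python
# Rough VRAM estimate from bits per weight (BPW).
# Assumption: a flat 0.5 GB overhead, inferred from the table above and not
# documented by the source. estimate_vram_gb is a hypothetical helper.
PARAMS_BILLION = 176.2
OVERHEAD_GB = 0.5

def estimate_vram_gb(bits_per_weight, params_billion=PARAMS_BILLION):
    """Return an approximate VRAM footprint in GB (1 GB = 1e9 bytes)."""
    weights_gb = params_billion * bits_per_weight / 8  # billions of bytes
    return weights_gb + OVERHEAD_GB

# (quant, BPW, listed VRAM in GB) rows copied from the table
TABLE = [
    ("IQ2_XXS", 2.38, 52.9),
    ("Q4_K_M", 4.89, 108.2),
    ("Q8_0", 8.5, 187.7),
    ("FP16", 16, 352.9),
]

for name, bpw, listed in TABLE:
    print(f"{name}: estimated {estimate_vram_gb(bpw):.1f} GB, listed {listed} GB")
```

The estimate covers weights only; context-dependent memory such as the KV cache is not modelled here.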


Benchmarks (1)

HumanEval: 15.0

Run this model

Easiest way to get started

curl -fsSL https://ollama.com/install.sh | sh

ollama run bloom:176b-q4_k_m

Downloads and runs automatically. Add --verbose for speed stats.


GPUs that can run this model

At Q4_K_M quantization. Sorted by minimum VRAM.

AMD Instinct MI300A

120 GB VRAM • 5300 GB/s

Apple M4 Max (128GB)

128 GB VRAM • 546 GB/s

AMD Instinct MI250X

128 GB VRAM • 3277 GB/s

Apple M1 Ultra (128GB)

128 GB VRAM • 800 GB/s

Apple M2 Ultra (128GB)

128 GB VRAM • 800 GB/s

AMD Radeon Instinct MI250

128 GB VRAM • 3280 GB/s

AMD Radeon Instinct MI250X

128 GB VRAM • 3280 GB/s

AMD Radeon Instinct MI300

128 GB VRAM • 6550 GB/s

Intel Data Center GPU Max 1550

128 GB VRAM • 3280 GB/s

Intel Data Center GPU Max Subsystem

128 GB VRAM • 3210 GB/s

NVIDIA GB10

128 GB VRAM • 273 GB/s

NVIDIA Jetson T5000

128 GB VRAM • 273 GB/s

Apple M5 Max (128GB)

128 GB VRAM • 614 GB/s

NVIDIA H200 SXM 141GB

140 GB VRAM • 4800 GB/s

NVIDIA H200 NVL

141 GB VRAM • 4890 GB/s

NVIDIA H200 SXM 141 GB

141 GB VRAM • 4890 GB/s

144 GB VRAM • 4100 GB/s

AMD Instinct MI300X

192 GB VRAM • 5300 GB/s

Apple M2 Ultra (192GB)

192 GB VRAM • 800 GB/s

Apple M3 Ultra (192GB)

192 GB VRAM • 800 GB/s

Apple M4 Ultra (192GB)

192 GB VRAM • 1092 GB/s

AMD Radeon Instinct MI300A

192 GB VRAM • 10300 GB/s

AMD Radeon Instinct MI300X

192 GB VRAM • 10300 GB/s

AMD Radeon Instinct MI308X

192 GB VRAM • 10300 GB/s

Apple M5 Ultra (192GB)

192 GB VRAM • 1228 GB/s

AMD Radeon Instinct MI325X

288 GB VRAM • 10300 GB/s

AMD Radeon Instinct MI350X

288 GB VRAM • 8190 GB/s

AMD Radeon Instinct MI355X

288 GB VRAM • 8190 GB/s

Apple M4 Ultra (384GB)

384 GB VRAM • 1092 GB/s

Apple M5 Ultra (384GB)

384 GB VRAM • 1228 GB/s


bloom 176.2B — 176.2B Parameter Dense LLM

Model Specifications

Parameters: 176.2B
Architecture: Dense Transformer
Context Length: 4K tokens
Capabilities: chat
Release Date: 2022-07-06
Provider: BigScience
Family: bloom

VRAM Requirements

Quantization	BPW	VRAM	Quality
IQ2_XXS	2.38	52.9 GB	65%
IQ2_M	2.93	65.0 GB	75%
Q2_K	3.16	70.1 GB	78%
IQ3_XXS	3.25	72.1 GB	82%
IQ3_XS	3.5	77.6 GB	84%
Q3_K_S	3.64	80.7 GB	85%
IQ3_M	3.76	83.3 GB	86%
Q3_K_M	4	88.6 GB	88%
Q3_K_L	4.3	95.2 GB	90%
IQ4_XS	4.46	98.7 GB	92%
Q4_K_S	4.67	103.3 GB	93%
Q4_K_M	4.89	108.2 GB	94%
Q5_K_S	5.57	123.2 GB	96%
Q5_K_M	5.7	126.0 GB	96%
Q6_K	6.56	145.0 GB	97%
Q8_0	8.5	187.7 GB	100%
FP16	16	352.9 GB	100%

Benchmark Scores

HumanEval: 15.0

How to Run bloom 176.2B

Run bloom 176.2B locally with Ollama (needs 108.2 GB VRAM at Q4_K_M):

ollama run bloom:176b
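Once the model is running, it can also be queried programmatically. The following is a minimal sketch, assuming a local Ollama server on its default port (11434) and the model tag `bloom:176b` copied from the command above; the tag is not verified against the Ollama library, and the helper names are hypothetical.

```python
import json
import urllib.request

# Assumptions: a local Ollama server on its default port, and the model tag
# "bloom:176b" taken from the command above (not verified against the library).
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_generate_request(prompt, model="bloom:176b", url=OLLAMA_URL):
    """Build a non-streaming POST request for Ollama's /api/generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def generate(prompt, **kwargs):
    """Send the request and return the generated text from the JSON reply."""
    with urllib.request.urlopen(build_generate_request(prompt, **kwargs)) as resp:
        return json.loads(resp.read())["response"]
```

With the server running, `generate("Translate to French: good morning")` would return the reply text as a string.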

Compatible GPUs (30)

GPUs that can run bloom 176.2B at Q4_K_M quantization:

AMD Instinct MI300A (120GB, 5300 GB/s)
Apple M4 Max (128GB) (128GB, 546 GB/s)
AMD Instinct MI250X (128GB, 3277 GB/s)
Apple M1 Ultra (128GB) (128GB, 800 GB/s)
Apple M2 Ultra (128GB) (128GB, 800 GB/s)
AMD Radeon Instinct MI250 (128GB, 3280 GB/s)
AMD Radeon Instinct MI250X (128GB, 3280 GB/s)
AMD Radeon Instinct MI300 (128GB, 6550 GB/s)
Intel Data Center GPU Max 1550 (128GB, 3280 GB/s)
Intel Data Center GPU Max Subsystem (128GB, 3210 GB/s)
NVIDIA GB10 (128GB, 273 GB/s)
NVIDIA Jetson T5000 (128GB, 273 GB/s)
Apple M5 Max (128GB) (128GB, 614 GB/s)
NVIDIA H200 SXM 141GB (140GB, 4800 GB/s)
NVIDIA H200 NVL (141GB, 4890 GB/s)
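The compatibility check behind a list like this can be sketched as a simple comparison of VRAM figures. The example below assumes that "fits" means the listed model VRAM is no larger than the device VRAM, ignoring KV cache and runtime overhead; `fitting_quants` is a hypothetical helper, and the numbers are copied from the tables on this page.

```python
# Which quantizations fit on a given device, using VRAM figures from this page.
# Assumption: "fits" means listed model VRAM <= device VRAM, with no allowance
# for KV cache or runtime overhead. fitting_quants is a hypothetical helper.
QUANT_VRAM_GB = {
    "Q4_K_M": 108.2,
    "Q5_K_M": 126.0,
    "Q6_K": 145.0,
    "Q8_0": 187.7,
    "FP16": 352.9,
}

GPUS_GB = {
    "AMD Instinct MI300A": 120,
    "Apple M4 Max (128GB)": 128,
    "NVIDIA H200 NVL": 141,
}

def fitting_quants(gpu_vram_gb, quants=QUANT_VRAM_GB):
    """Return the quant names whose listed VRAM fits, largest first."""
    fits = [(need, name) for name, need in quants.items() if need <= gpu_vram_gb]
    return [name for need, name in sorted(fits, reverse=True)]

for gpu, vram in GPUS_GB.items():
    print(gpu, "->", fitting_quants(vram))
```

In practice some headroom above the listed figure is advisable, since the comparison leaves no room for context memory.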