
bloom 176.2B

Capabilities: chat
Parameters: 176.2B
Context length: 4K
Benchmarks: 1
Quantizations: 17
HF downloads: 8K
Architecture: Dense
Released: 2022-07-06
Layers: 70
KV Heads: 112
Head Dim: 128
Family: bloom
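
The layer count, KV head count, and head dimension listed above are enough for a rough estimate of how much memory the key/value cache needs on top of the weights. The sketch below uses the standard per-token formula (one key tensor and one value tensor per layer). It assumes an FP16 cache (2 bytes per value) and reads "4K" as 4096 tokens; both are assumptions, not figures stated on this page.

```python
def kv_cache_bytes(layers: int, kv_heads: int, head_dim: int,
                   tokens: int, bytes_per_value: int = 2) -> int:
    """Bytes needed to cache keys and values for `tokens` positions."""
    # Factor 2: every layer stores one key tensor and one value tensor.
    return 2 * layers * kv_heads * head_dim * tokens * bytes_per_value


# Values from the spec card: 70 layers, 112 KV heads, head dim 128.
per_token = kv_cache_bytes(70, 112, 128, tokens=1)
full_context = kv_cache_bytes(70, 112, 128, tokens=4096)  # assumed meaning of "4K"

print(f"{per_token / 2**20:.2f} MiB per token")      # 3.83 MiB per token
print(f"{full_context / 1e9:.2f} GB at 4096 tokens")  # 16.44 GB at 4096 tokens
```

Under these assumptions a full context adds roughly 16 GB to the weight footprint, which matters when a quantization fits a GPU with only a few GB to spare.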

Quantization Options

Quant     Bits   VRAM       Quality
IQ2_XXS   2.38   52.9 GB    low
IQ2_M     2.93   65.0 GB    low
Q2_K      3.16   70.1 GB    low
IQ3_XXS   3.25   72.1 GB    low
IQ3_XS    3.5    77.6 GB    low
Q3_K_S    3.64   80.7 GB    low
IQ3_M     3.76   83.3 GB    low
Q3_K_M    4      88.6 GB    low
Q3_K_L    4.3    95.2 GB    moderate
IQ4_XS    4.46   98.7 GB    moderate
Q4_K_S    4.67   103.3 GB   moderate
Q4_K_M    4.89   108.2 GB   good
Q5_K_S    5.57   123.2 GB   good
Q5_K_M    5.7    126.0 GB   good
Q6_K      6.56   145.0 GB   excellent
Q8_0      8.5    187.7 GB   lossless
FP16      16     352.9 GB   lossless
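
The VRAM column tracks the bits-per-weight column closely. As a rough sketch, weights take parameters × bits / 8 bytes, and adding a fixed 0.5 GB matches the listed figures to within about 0.1 GB. The 0.5 GB overhead is an assumption inferred from the table, not a documented formula, and the estimate excludes the KV cache.

```python
PARAMS_BILLIONS = 176.2
OVERHEAD_GB = 0.5  # assumption: inferred by fitting the table, not documented


def estimate_vram_gb(bits_per_weight: float,
                     params_billions: float = PARAMS_BILLIONS,
                     overhead_gb: float = OVERHEAD_GB) -> float:
    """Weights-only VRAM estimate in GB (1 GB = 1e9 bytes)."""
    # billions of parameters * bits / 8 gives gigabytes directly.
    return params_billions * bits_per_weight / 8 + overhead_gb


# (quant, bits per weight, VRAM listed on the page)
LISTED = [
    ("IQ2_XXS", 2.38, 52.9), ("IQ2_M", 2.93, 65.0), ("Q2_K", 3.16, 70.1),
    ("IQ3_XXS", 3.25, 72.1), ("IQ3_XS", 3.5, 77.6), ("Q3_K_S", 3.64, 80.7),
    ("IQ3_M", 3.76, 83.3), ("Q3_K_M", 4.0, 88.6), ("Q3_K_L", 4.3, 95.2),
    ("IQ4_XS", 4.46, 98.7), ("Q4_K_S", 4.67, 103.3), ("Q4_K_M", 4.89, 108.2),
    ("Q5_K_S", 5.57, 123.2), ("Q5_K_M", 5.7, 126.0), ("Q6_K", 6.56, 145.0),
    ("Q8_0", 8.5, 187.7), ("FP16", 16.0, 352.9),
]

for name, bits, listed in LISTED:
    print(f"{name:8s} estimated {estimate_vram_gb(bits):6.1f} GB, listed {listed:6.1f} GB")
```

Small differences (for example at Q4_K_S) are consistent with the bits-per-weight values being rounded to two decimals on the page.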


Benchmarks (1)

HumanEval: 15.0

Run this model

Easiest way to get started:
curl -fsSL https://ollama.com/install.sh | sh
ollama run bloom:176b-q4_k_m

Downloads and runs automatically. Add --verbose for speed stats.


GPUs that can run this model

At Q4_K_M quantization. Sorted by minimum VRAM.

GPU                                   VRAM     Bandwidth    Vendor   Price
AMD Instinct MI300A                   120 GB   5300 GB/s    AMD      $12000
Apple M4 Max (128GB)                  128 GB   546 GB/s     Apple    $3999
AMD Instinct MI250X                   128 GB   3277 GB/s    AMD      $10000
Apple M1 Ultra (128GB)                128 GB   800 GB/s     Apple    $4999
Apple M2 Ultra (128GB)                128 GB   800 GB/s     Apple    $3999
AMD Radeon Instinct MI250             128 GB   3280 GB/s    AMD      $12000
AMD Radeon Instinct MI250X            128 GB   3280 GB/s    AMD      $15000
AMD Radeon Instinct MI300             128 GB   6550 GB/s    AMD      $12000
Intel Data Center GPU Max 1550        128 GB   3280 GB/s    Intel    not listed
Intel Data Center GPU Max Subsystem   128 GB   3210 GB/s    Intel    not listed
NVIDIA GB10                           128 GB   273 GB/s     NVIDIA   not listed
NVIDIA Jetson T5000                   128 GB   273 GB/s     NVIDIA   not listed
Apple M5 Max (128GB)                  128 GB   614 GB/s     Apple    not listed
NVIDIA H200 SXM 141GB                 140 GB   4800 GB/s    NVIDIA   $30000
NVIDIA H200 NVL                       141 GB   4890 GB/s    NVIDIA   $35000
NVIDIA H200 SXM 141 GB                141 GB   4890 GB/s    NVIDIA   $30000
NVIDIA B300                           144 GB   4100 GB/s    NVIDIA   $35000
AMD Instinct MI300X                   192 GB   5300 GB/s    AMD      $15000
Apple M2 Ultra (192GB)                192 GB   800 GB/s     Apple    $5499
Apple M3 Ultra (192GB)                192 GB   800 GB/s     Apple    $6999
Apple M4 Ultra (192GB)                192 GB   1092 GB/s    Apple    $7499
AMD Radeon Instinct MI300A            192 GB   10300 GB/s   AMD      $12000
AMD Radeon Instinct MI300X            192 GB   10300 GB/s   AMD      $15000
AMD Radeon Instinct MI308X            192 GB   10300 GB/s   AMD      $12000
Apple M5 Ultra (192GB)                192 GB   1228 GB/s    Apple    not listed
AMD Radeon Instinct MI325X            288 GB   10300 GB/s   AMD      $20000
AMD Radeon Instinct MI350X            288 GB   8190 GB/s    AMD      $25000
AMD Radeon Instinct MI355X            288 GB   8190 GB/s    AMD      $30000
Apple M4 Ultra (384GB)                384 GB   1092 GB/s    Apple    $9999
Apple M5 Ultra (384GB)                384 GB   1228 GB/s    Apple    not listed
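
The VRAM and bandwidth figures in this list support a quick back-of-the-envelope check. A dense model reads every weight once per generated token, so memory bandwidth divided by model size gives a rough ceiling on decode speed. The sketch below applies that rule of thumb to a few entries copied from the list. It is not the site's speed estimator, it ignores the KV cache, compute limits, and runtime overhead, and real throughput will likely be lower.

```python
# (name, VRAM in GB, memory bandwidth in GB/s): a small sample from the list above
GPUS = [
    ("Apple M4 Max (128GB)", 128, 546),
    ("NVIDIA H200 NVL", 141, 4890),
    ("AMD Instinct MI300X", 192, 5300),
    ("Apple M4 Ultra (384GB)", 384, 1092),
]


def decode_ceiling_tok_s(bandwidth_gb_s: float, model_gb: float) -> float:
    """Upper bound on tokens/s if decoding is purely memory-bandwidth bound."""
    return bandwidth_gb_s / model_gb


def gpus_that_fit(model_gb: float):
    """Return (name, decode ceiling) for every sample GPU whose VRAM holds the model."""
    return [
        (name, decode_ceiling_tok_s(bandwidth, model_gb))
        for name, vram, bandwidth in GPUS
        if vram >= model_gb
    ]


for label, size_gb in [("Q4_K_M", 108.2), ("Q8_0", 187.7)]:
    print(label)
    for name, ceiling in gpus_that_fit(size_gb):
        print(f"  {name}: at most ~{ceiling:.1f} tok/s")
```

At Q4_K_M all four sample GPUs fit; at Q8_0 only the 192 GB and 384 GB entries do.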


bloom 176.2B: 176.2B Parameter Dense LLM

Model Specifications

Parameters: 176.2B
Architecture: Dense Transformer
Context Length: 4K tokens
Capabilities: chat
Release Date: 2022-07-06
Provider: BigScience
Family: bloom

VRAM Requirements

Quantization   BPW    VRAM       Quality
IQ2_XXS        2.38   52.9 GB    65%
IQ2_M          2.93   65.0 GB    75%
Q2_K           3.16   70.1 GB    78%
IQ3_XXS        3.25   72.1 GB    82%
IQ3_XS         3.5    77.6 GB    84%
Q3_K_S         3.64   80.7 GB    85%
IQ3_M          3.76   83.3 GB    86%
Q3_K_M         4      88.6 GB    88%
Q3_K_L         4.3    95.2 GB    90%
IQ4_XS         4.46   98.7 GB    92%
Q4_K_S         4.67   103.3 GB   93%
Q4_K_M         4.89   108.2 GB   94%
Q5_K_S         5.57   123.2 GB   96%
Q5_K_M         5.7    126.0 GB   96%
Q6_K           6.56   145.0 GB   97%
Q8_0           8.5    187.7 GB   100%
FP16           16     352.9 GB   100%
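
Because the quality score rises with VRAM in this table, picking a quantization for a given memory budget reduces to taking the largest entry that still fits. The sketch below does that lookup. The `headroom_gb` parameter is a hypothetical knob for reserving space for the KV cache and runtime buffers, which the table does not include.

```python
# (quantization, VRAM in GB) from the VRAM Requirements table
QUANTS = [
    ("IQ2_XXS", 52.9), ("IQ2_M", 65.0), ("Q2_K", 70.1), ("IQ3_XXS", 72.1),
    ("IQ3_XS", 77.6), ("Q3_K_S", 80.7), ("IQ3_M", 83.3), ("Q3_K_M", 88.6),
    ("Q3_K_L", 95.2), ("IQ4_XS", 98.7), ("Q4_K_S", 103.3), ("Q4_K_M", 108.2),
    ("Q5_K_S", 123.2), ("Q5_K_M", 126.0), ("Q6_K", 145.0), ("Q8_0", 187.7),
    ("FP16", 352.9),
]


def best_quant_for(vram_gb: float, headroom_gb: float = 0.0):
    """Largest listed quantization that fits in vram_gb, or None if none fits.

    headroom_gb is a hypothetical reserve for KV cache and buffers.
    """
    fitting = [(name, need) for name, need in QUANTS if need + headroom_gb <= vram_gb]
    if not fitting:
        return None
    # The table's quality column rises with size, so the largest fit is the best fit.
    return max(fitting, key=lambda q: q[1])[0]


print(best_quant_for(128))                    # Q5_K_M
print(best_quant_for(128, headroom_gb=16.5))  # Q4_K_M
print(best_quant_for(48))                     # None
```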

Benchmark Scores

HumanEval: 15.0

How to Run bloom 176.2B

Run bloom 176.2B locally with Ollama (needs 108.2 GB VRAM at Q4_K_M):

ollama run bloom:176b
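
Once the model is running, it can also be queried programmatically through Ollama's local REST API. The sketch below is a minimal, non-streaming client using only the Python standard library. It assumes Ollama is listening on its default port 11434 and that the `bloom:176b` tag shown above is available locally; the helper names `build_request` and `generate` are illustrative, not part of Ollama.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint


def build_request(prompt: str, model: str = "bloom:176b") -> urllib.request.Request:
    """Build a non-streaming POST request for Ollama's /api/generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str, model: str = "bloom:176b") -> str:
    """Send the prompt and return the generated text (requires a running server)."""
    with urllib.request.urlopen(build_request(prompt, model)) as resp:
        return json.loads(resp.read())["response"]


# Example, with the Ollama server running:
#   print(generate("Write a haiku about spring."))
```

Setting `"stream": False` returns a single JSON object instead of a stream of partial responses, which keeps the client short at the cost of waiting for the full completion.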

Compatible GPUs (30)

GPUs that can run bloom 176.2B at Q4_K_M quantization: