
Allen Institute OLMo-2-0325-32B

Capabilities: chat
Parameters: 32.2B
Context length: 4K
Benchmarks: 15
Quantizations: 14
HF downloads: 7K
Architecture: Dense
Released: 2025-06-15
Layers: 64
KV Heads: 8
Head Dim: 128
Family: olmo
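The attention geometry above (64 layers, 8 KV heads, head dim 128) fixes the KV-cache footprint at the 4K context. A sketch of the standard formula, 2 × layers × KV heads × head dim × context × bytes per element, assuming an FP16 cache (the cache precision is not stated on this page):

```python
def kv_cache_bytes(layers: int, kv_heads: int, head_dim: int,
                   context: int, bytes_per_elem: int = 2) -> int:
    """KV-cache size: keys and values for every layer at the full context.

    bytes_per_elem=2 assumes an FP16/BF16 cache (a common default,
    not confirmed by this page).
    """
    # factor of 2 = one tensor for keys, one for values
    return 2 * layers * kv_heads * head_dim * context * bytes_per_elem

# OLMo-2-0325-32B: 64 layers, 8 KV heads, head dim 128, 4K context
gib = kv_cache_bytes(64, 8, 128, 4096) / 2**30
print(f"{gib:.1f} GiB")
```

With only 8 KV heads (grouped-query attention), the cache works out to about 1 GiB at the full 4K context, a small fraction of the weight memory.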

Quantization Options

Quant    | Bits | VRAM    | Quality
IQ3_XXS  | 3.25 | 13.6 GB | low
IQ3_XS   | 3.5  | 14.6 GB | low
Q3_K_S   | 3.64 | 15.1 GB | low
IQ3_M    | 3.76 | 15.6 GB | low
Q3_K_M   | 4    | 16.6 GB | low
Q3_K_L   | 4.3  | 17.8 GB | moderate
IQ4_XS   | 4.46 | 18.4 GB | moderate
Q4_K_S   | 4.67 | 19.3 GB | moderate
Q4_K_M   | 4.89 | 20.2 GB | good
Q5_K_S   | 5.57 | 22.9 GB | good
Q5_K_M   | 5.7  | 23.4 GB | good
Q6_K     | 6.56 | 26.9 GB | excellent
Q8_0     | 8.5  | 34.7 GB | lossless
FP16     | 16   | 64.9 GB | lossless
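The VRAM figures above scale linearly with bits per weight. A minimal sketch of the apparent rule (the table is consistent with weight memory at the given bit width plus roughly 0.5 GB of fixed overhead; that constant is inferred from the numbers, not stated on this page):

```python
def estimate_vram_gb(params_b: float, bits_per_weight: float,
                     overhead_gb: float = 0.5) -> float:
    """Rough VRAM estimate: quantized weights plus a flat overhead.

    params_b: parameter count in billions (32.2 for OLMo-2-0325-32B).
    bits_per_weight: effective BPW of the quantization (4.89 for Q4_K_M).
    overhead_gb: assumed fixed overhead, inferred from the table above.
    """
    weight_gb = params_b * bits_per_weight / 8  # billions of params x bytes each
    return weight_gb + overhead_gb

# Q4_K_M: 32.2B params at 4.89 BPW -> ~20.2 GB, matching the table
print(round(estimate_vram_gb(32.2, 4.89), 1))
```

The same formula reproduces the FP16 row (32.2 × 16 / 8 + 0.5 ≈ 64.9 GB), which is a useful sanity check when sizing a GPU for other quantizations.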


Benchmarks (15)

Arena Elo: 1222
MATH: 93.4
IFEval: 88.8
HumanEval: 86.7
BBH: 84.0
AIME: 77.3
AA Math: 77.3
LiveCodeBench: 69.5
MBPP: 65.1
AlpacaEval: 59.8
GPQA Diamond: 53.9
GPQA: 48.6
AA Intelligence: 12.2
AA Coding: 5.6
HLE: 4.9

Run this model

The easiest way to get started (see the Ollama docs):

curl -fsSL https://ollama.com/install.sh | sh
ollama run olmo:32b-q4_k_m

Downloads and runs automatically. Add --verbose for speed stats.
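Once the model is pulled, it can also be driven programmatically through Ollama's local REST API. A minimal sketch, assuming the default endpoint at http://localhost:11434 and the olmo:32b-q4_k_m tag used above:

```python
import json
import urllib.request

def build_generate_request(prompt: str,
                           model: str = "olmo:32b-q4_k_m") -> urllib.request.Request:
    """Build a POST request for Ollama's /api/generate endpoint."""
    payload = {
        "model": model,
        "prompt": prompt,
        "stream": False,  # one JSON object instead of a token stream
    }
    return urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )

req = build_generate_request("Why is the sky blue?")
# Sending it requires a running Ollama server:
#   resp = json.load(urllib.request.urlopen(req))
#   print(resp["response"])
print(json.loads(req.data)["model"])
```

Setting "stream": False is convenient for scripts; omit it to receive the default newline-delimited token stream instead.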


GPUs that can run this model

At Q4_K_M quantization. Sorted by minimum VRAM.


OLMo-2-0325-32B: 32.2B-Parameter Dense LLM

Model Specifications

Parameters: 32.2B
Architecture: Dense Transformer
Context Length: 4K tokens
Capabilities: chat
Release Date: 2025-06-15
Provider: Allen Institute
Family: olmo

VRAM Requirements

Quantization | BPW  | VRAM    | Quality
IQ3_XXS      | 3.25 | 13.6 GB | 82%
IQ3_XS       | 3.5  | 14.6 GB | 84%
Q3_K_S       | 3.64 | 15.1 GB | 85%
IQ3_M        | 3.76 | 15.6 GB | 86%
Q3_K_M       | 4    | 16.6 GB | 88%
Q3_K_L       | 4.3  | 17.8 GB | 90%
IQ4_XS       | 4.46 | 18.4 GB | 92%
Q4_K_S       | 4.67 | 19.3 GB | 93%
Q4_K_M       | 4.89 | 20.2 GB | 94%
Q5_K_S       | 5.57 | 22.9 GB | 96%
Q5_K_M       | 5.7  | 23.4 GB | 96%
Q6_K         | 6.56 | 26.9 GB | 97%
Q8_0         | 8.5  | 34.7 GB | 100%
FP16         | 16   | 64.9 GB | 100%


How to Run OLMo-2-0325-32B

Run OLMo-2-0325-32B locally with Ollama (needs 20.2 GB VRAM at Q4_K_M):

ollama run olmo:32b

Compatible GPUs (30)

GPUs that can run OLMo-2-0325-32B at Q4_K_M quantization: