
EXAONE Deep 32B

Name: EXAONE Deep 32B
Author: exaone

We introduce EXAONE Deep, a series of models ranging from 2.4B to 32B parameters, developed and released by LG AI Research, that exhibits superior capabilities in various reasoning tasks, including math and coding benchmarks.

Tags: reasoning, math, coding

Parameters: 32B
Context length: 32K


Architecture: Dense
Released: 2025-03-16
Layers: (value not shown)
KV Heads: (value not shown)
Head Dim: 102
Family: exaone

Quantization Options

Quant	Bits	VRAM	Quality
IQ3_XXS	3.25	13.5 GB	low
IQ3_XS	3.5	14.5 GB	low
Q3_K_S	3.64	15.0 GB	low
IQ3_M	3.76	15.5 GB	low
Q3_K_M	4	16.5 GB	low
Q3_K_L	4.3	17.7 GB	moderate
IQ4_XS	4.46	18.3 GB	moderate
Q4_K_S	4.67	19.2 GB	moderate
Q4_K_M	4.89	20.0 GB	good
Q5_K_S	5.57	22.8 GB	good
Q5_K_M	5.7	23.3 GB	good
Q6_K	6.56	26.7 GB	excellent
Q8_0	8.5	34.5 GB	lossless
FP16	16	64.5 GB	lossless
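The VRAM column appears to track weight size closely: parameter count times bits per weight, divided by 8, plus roughly 0.5 GB. The sketch below reproduces that arithmetic. The function name and the 0.5 GB overhead are assumptions inferred from the table, not something the page states, and actual usage also grows with context length because of the KV cache.

```python
def estimate_vram_gb(params_billion: float, bits_per_weight: float,
                     overhead_gb: float = 0.5) -> float:
    """Rough VRAM estimate in GB for a quantized dense model.

    overhead_gb is an assumed fixed allowance that happens to fit the
    table above; it is not a documented constant.
    """
    # 1e9 parameters * bits per weight / 8 bits per byte = gigabytes of weights
    weights_gb = params_billion * bits_per_weight / 8
    return weights_gb + overhead_gb


# Bits-per-weight values copied from the table above
for name, bits in [("Q4_K_M", 4.89), ("Q8_0", 8.5), ("FP16", 16)]:
    print(f"{name}: ~{estimate_vram_gb(32, bits):.1f} GB")
```

The estimates land within about 0.1 GB of the listed values for these three rows, which suggests the table was generated from a similar formula.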




Benchmarks (10)

Benchmark	Score
MATH-500	95.7
IFEval	83.9
AIME	72.1
GPQA Diamond	66.1
LiveCodeBench	59.5
MATH	51.3
MMLU-PRO	40.4
BBH	39.8
MUSR	5.2
GPQA	5.0

Run this model

Easiest way to get started (beginners)

$ curl -fsSL https://ollama.com/install.sh | sh

$ ollama run exaone-deep:32b-q4_K_M

Downloads and runs automatically. Add --verbose for speed stats.
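Once the model is running under Ollama, it can also be queried from a script through Ollama's local HTTP API. The following is a minimal sketch, assuming the server listens on its default port 11434 and that the model tag shown above has already been pulled; the helper names `build_request` and `generate` are illustrative, not part of any library.

```python
import json
import urllib.request

# Ollama's default local endpoint (assumes an unmodified install)
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt: str,
                  model: str = "exaone-deep:32b-q4_K_M") -> urllib.request.Request:
    """Build a non-streaming generate request for the local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str) -> str:
    """Send the prompt and return the generated text (needs a running server)."""
    with urllib.request.urlopen(build_request(prompt)) as resp:
        return json.loads(resp.read())["response"]
```

With the server running, `print(generate("What is 17 * 24?"))` prints the model's reply. Setting `"stream": False` keeps the example simple by returning one JSON object instead of a stream of chunks.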

SETUP GUIDE

Auto-setup with fitmyllm CLI

Detects your GPU, recommends the best model, downloads it, and starts chatting with zero config. Benchmarks your speed and contributes anonymous data to improve predictions.

Run pip install fitmyllm, then run fitmyllm.

Features: auto-detect GPU, live tok/s in chat, speed benchmarks, 9 inference engines.


GPUs that can run this model

At Q4_K_M quantization. Sorted by minimum VRAM.

NVIDIA RTX 4090: 24 GB VRAM, 1008 GB/s memory bandwidth, $1599
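To see which quantizations fit a given card, the VRAM figures from the Quantization Options table can be compared against the GPU's memory. The sketch below does that for a 24 GB card and adds a rough ceiling on decode speed. That ceiling rests on a common rule of thumb (each generated token reads all weights once, so tokens per second cannot exceed bandwidth divided by model size). The rule is an assumption here, not the page's own speed model, and the function names are illustrative.

```python
# (quant, VRAM in GB) pairs copied from the Quantization Options table
QUANTS = [
    ("IQ3_XXS", 13.5), ("IQ3_XS", 14.5), ("Q3_K_S", 15.0), ("IQ3_M", 15.5),
    ("Q3_K_M", 16.5), ("Q3_K_L", 17.7), ("IQ4_XS", 18.3), ("Q4_K_S", 19.2),
    ("Q4_K_M", 20.0), ("Q5_K_S", 22.8), ("Q5_K_M", 23.3), ("Q6_K", 26.7),
    ("Q8_0", 34.5), ("FP16", 64.5),
]


def fitting_quants(gpu_vram_gb: float, headroom_gb: float = 0.0) -> list[str]:
    """Return quant names whose listed VRAM (plus optional headroom) fits."""
    return [name for name, need in QUANTS if need + headroom_gb <= gpu_vram_gb]


def decode_upper_bound_tok_s(bandwidth_gb_s: float, model_gb: float) -> float:
    """Bandwidth-bound ceiling on tokens/s (assumed rule of thumb, not measured)."""
    return bandwidth_gb_s / model_gb


print(fitting_quants(24.0))                     # quants that fit in 24 GB
print(decode_upper_bound_tok_s(1008, 20.0))     # ceiling for Q4_K_M on this card
```

Passing a nonzero `headroom_gb` leaves room for the KV cache and other runtime allocations, which matters for quantizations that sit close to the card's limit.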