
Zhipu AI GLM-5 753.9B

Capabilities: chat
Parameters: 753.9B
Context length: 198K
Benchmarks: 4
Quantizations: 17
HF downloads: 136K
Architecture: Dense
Released: 2026-02-11
Layers: 78
KV Heads: 64
Head Dim: 64
Family: glm

Quantization Options

Quant     Bits   VRAM        Quality
IQ2_XXS   2.38   224.8 GB    low
IQ2_M     2.93   276.6 GB    low
Q2_K      3.16   298.3 GB    low
IQ3_XXS   3.25   306.8 GB    low
IQ3_XS    3.5    330.3 GB    low
Q3_K_S    3.64   343.5 GB    low
IQ3_M     3.76   354.8 GB    low
Q3_K_M    4      377.4 GB    low
Q3_K_L    4.3    405.7 GB    moderate
IQ4_XS    4.46   420.8 GB    moderate
Q4_K_S    4.67   440.6 GB    moderate
Q4_K_M    4.89   461.3 GB    good
Q5_K_S    5.57   525.4 GB    good
Q5_K_M    5.7    537.6 GB    good
Q6_K      6.56   618.7 GB    excellent
Q8_0      8.5    801.5 GB    lossless
FP16      16     1508.3 GB   lossless
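The VRAM figures above track weight memory almost exactly: parameters times bits-per-weight divided by 8. A minimal sketch of that estimate (assumption: the table's numbers are weight-dominated and exclude KV cache and runtime overhead):

```python
# Rough weight-memory estimate for a dense model.
# Assumption: VRAM in the table is dominated by quantized weights;
# KV cache and runtime overhead are ignored here.
def estimate_vram_gb(params_b: float, bpw: float) -> float:
    """params_b: parameter count in billions; bpw: bits per weight."""
    bytes_total = params_b * 1e9 * bpw / 8
    return bytes_total / 1e9  # decimal gigabytes

# Q4_K_M at 4.89 bpw on 753.9B parameters:
q4_estimate = estimate_vram_gb(753.9, 4.89)  # ~460.8 GB, vs 461.3 GB listed
```

The small gap between the estimate and the listed figure is consistent with per-tensor metadata and mixed-precision layers in the quantized file.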


Benchmarks (4)

AIME            92.7
GPQA Diamond    86.0
SWE-bench       77.8
HLE             30.5

Run this model

Easiest way to get started:

$ curl -fsSL https://ollama.com/install.sh | sh
$ ollama run glm:753b-q4_k_m

Downloads and runs automatically. Add --verbose for speed stats.


GLM-5 753.9B: 753.9B-Parameter Dense LLM

Model Specifications

Parameters: 753.9B
Architecture: Dense Transformer
Context Length: 198K tokens
Capabilities: chat
Release Date: 2026-02-11
Provider: Zhipu AI
Family: glm

VRAM Requirements

Quantization   BPW    VRAM        Quality
IQ2_XXS        2.38   224.8 GB    65%
IQ2_M          2.93   276.6 GB    75%
Q2_K           3.16   298.3 GB    78%
IQ3_XXS        3.25   306.8 GB    82%
IQ3_XS         3.5    330.3 GB    84%
Q3_K_S         3.64   343.5 GB    85%
IQ3_M          3.76   354.8 GB    86%
Q3_K_M         4      377.4 GB    88%
Q3_K_L         4.3    405.7 GB    90%
IQ4_XS         4.46   420.8 GB    92%
Q4_K_S         4.67   440.6 GB    93%
Q4_K_M         4.89   461.3 GB    94%
Q5_K_S         5.57   525.4 GB    96%
Q5_K_M         5.7    537.6 GB    96%
Q6_K           6.56   618.7 GB    97%
Q8_0           8.5    801.5 GB    100%
FP16           16     1508.3 GB   100%
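Choosing a quantization from this table is a simple trade-off: take the highest quality that fits your VRAM budget. A hypothetical helper sketching that selection (the tuples are copied from a subset of the table above):

```python
# Hypothetical quant picker: highest retained quality that fits the budget.
# (name, vram_gb, quality_pct) rows are a subset of the table above.
QUANTS = [
    ("IQ2_XXS", 224.8, 65),
    ("Q2_K", 298.3, 78),
    ("Q3_K_M", 377.4, 88),
    ("Q4_K_M", 461.3, 94),
    ("Q5_K_M", 537.6, 96),
    ("Q6_K", 618.7, 97),
    ("Q8_0", 801.5, 100),
]

def best_quant(vram_budget_gb: float):
    """Return the best-quality quant fitting the budget, or None."""
    fits = [q for q in QUANTS if q[1] <= vram_budget_gb]
    return max(fits, key=lambda q: q[2]) if fits else None

choice = best_quant(640)  # e.g. 8x80GB node -> ("Q6_K", 618.7, 97)
```

Note the budget should also leave headroom for the KV cache, which at 198K context is substantial and not included in the table's figures.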


How to Run GLM-5 753.9B

Run GLM-5 753.9B locally with Ollama (needs 461.3 GB VRAM at Q4_K_M):

ollama run glm:753b
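Once the model is running, Ollama serves a local HTTP API on port 11434. A minimal sketch of a non-streaming completion request (the model tag `glm:753b` is taken from the command above; actually sending the request requires a running Ollama server, so only the payload is built here):

```python
import json
import urllib.request

# Default local endpoint for Ollama's generate API.
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_generate_request(model: str, prompt: str) -> dict:
    # "stream": False asks the server for a single JSON object
    # instead of a stream of partial responses.
    return {"model": model, "prompt": prompt, "stream": False}

def generate(payload: dict) -> str:
    # Requires a running Ollama server; not executed in this sketch.
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["response"]

payload = build_generate_request("glm:753b", "Why is the sky blue?")
# generate(payload) would return the model's completion text.
```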