
Zhipu AI GLM-5.1 754B

Join our WeChat or Discord community. Check out the GLM-5.1 blog and GLM-5 Technical report. Use GLM-5.1 API services on Z.ai API Platform. GLM-5.1 will be available on chat.z.ai in the coming days.

chat, coding, reasoning, multilingual, math, agentic, tool_use
Parameters: 754B (37B active)
Context length: 198K
Benchmarks: 2
Quantizations: 17
Architecture: MoE
Released: 2026-04-07
Layers: 78
KV Heads: 64
Head Dim: 64
Family: glm
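
The layer, KV-head, and head-dimension figures above allow a rough KV-cache estimate. The sketch below assumes a conventional attention cache that stores one key and one value vector per KV head per layer at 16-bit precision. This page does not confirm that GLM-5.1 caches keys and values this way, so treat the result as an upper-bound illustration. The function name and the reading of "198K" as 198,000 tokens are assumptions too.

```python
def kv_cache_bytes(tokens, layers=78, kv_heads=64, head_dim=64, bytes_per_elem=2):
    """Estimate KV-cache size for a conventional (uncompressed) attention cache.

    Defaults come from the spec values listed on this page; bytes_per_elem=2
    assumes FP16/BF16 cache entries.
    """
    # Factor of 2: one key vector and one value vector per head, per layer, per token.
    return 2 * layers * kv_heads * head_dim * bytes_per_elem * tokens


per_token = kv_cache_bytes(1)
full_context = kv_cache_bytes(198_000)  # assumption: "198K" means 198,000 tokens

print(f"per token:    {per_token / 1e6:.2f} MB")
print(f"full context: {full_context / 1e9:.1f} GB")
```

Under these assumptions the cache comes on top of the weight memory listed in the quantization table, which is why long contexts can need noticeably more VRAM than the table alone suggests.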

Quantization Options

Quant      Bits    VRAM         Quality
IQ2_XXS    2.38    224.8 GB     low
IQ2_M      2.93    276.6 GB     low
Q2_K       3.16    298.3 GB     low
IQ3_XXS    3.25    306.8 GB     low
IQ3_XS     3.5     330.4 GB     low
Q3_K_S     3.64    343.6 GB     low
IQ3_M      3.76    354.9 GB     low
Q3_K_M     4       377.5 GB     low
Q3_K_L     4.3     405.8 GB     moderate
IQ4_XS     4.46    420.8 GB     moderate
Q4_K_S     4.67    440.6 GB     moderate
Q4_K_M     4.89    461.4 GB     good
Q5_K_S     5.57    525.5 GB     good
Q5_K_M     5.7     537.7 GB     good
Q6_K       6.56    618.8 GB     excellent
Q8_0       8.5     801.6 GB     lossless
FP16       16      1508.5 GB    lossless
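
The VRAM figures in this table track a simple rule of thumb: total parameters times bits per weight, divided by 8 bits per byte. The sketch below applies that rule. It is an inference from the listed numbers, not a formula stated by the page, and the function name is hypothetical. The listed values sit roughly half a gigabyte above the raw result, which may reflect a small fixed overhead.

```python
def weight_gb(params_billion, bits_per_weight):
    """Approximate weight memory in GB: parameters x bits per weight / 8.

    Ignores KV cache, activations, and runtime overhead.
    """
    return params_billion * bits_per_weight / 8.0


# Compare the estimate with a few rows of the table (754B total parameters).
for name, bits, listed_gb in [("IQ2_XXS", 2.38, 224.8),
                              ("Q4_K_M", 4.89, 461.4),
                              ("FP16", 16, 1508.5)]:
    est = weight_gb(754, bits)
    print(f"{name:8s} estimated {est:7.1f} GB, listed {listed_gb:7.1f} GB")
```

All 754B parameters must be resident even though only 37B are active per token, so the total count, not the active count, drives the memory estimate.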


Benchmarks (2)

GPQA Diamond: 83.9
HLE: 25.6

Run this model

Easiest way to get started (beginners):
curl -fsSL https://ollama.com/install.sh | sh
ollama run glm:754b-q4_K_M

The tag may need adjustment; check ollama.com/library/glm for available tags.
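
Once a model is running under Ollama, it can also be queried over Ollama's local HTTP API. The sketch below builds a non-streaming request to the /api/generate endpoint on Ollama's default port (11434). The model tag is copied from the command above and, as noted, may not exist in the library. The helper names build_request and generate are illustrative only.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint
MODEL_TAG = "glm:754b-q4_K_M"  # tag as shown on this page; may need adjustment


def build_request(prompt, model=MODEL_TAG, url=OLLAMA_URL):
    """Build a POST request for a single, non-streaming completion."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt, model=MODEL_TAG):
    """Send the prompt to a running Ollama server and return the generated text."""
    with urllib.request.urlopen(build_request(prompt, model)) as resp:
        return json.loads(resp.read())["response"]
```

With a server running and the model pulled, calling generate("Say hello in one sentence.") would return the completion as a string.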


Auto-setup with fitmyllm CLI

Detects your GPU, recommends the best model, downloads it, and starts chatting with zero configuration. It also benchmarks your speed and contributes anonymous data to improve predictions.

Install with pip install fitmyllm, then run fitmyllm.
Features: auto-detect GPU, live tok/s in chat, speed benchmarks, 9 inference engines.

SPEC SHEET

GLM-5.1 754B: 754B MoE.

SPECIFICATIONS
PARAMETERS: 754B (37B active)
ARCHITECTURE: Mixture of Experts
CONTEXT LENGTH: 198K tokens
CAPABILITIES: chat, coding, reasoning, multilingual, math, agentic, tool_use
RELEASE DATE: 2026-04-07
PROVIDER: Zhipu AI
FAMILY: glm
VRAM REQUIREMENTS
QUANT      BPW     VRAM         QUALITY
IQ2_XXS    2.38    224.8 GB     65%
IQ2_M      2.93    276.6 GB     75%
Q2_K       3.16    298.3 GB     78%
IQ3_XXS    3.25    306.8 GB     82%
IQ3_XS     3.5     330.4 GB     84%
Q3_K_S     3.64    343.6 GB     85%
IQ3_M      3.76    354.9 GB     86%
Q3_K_M     4       377.5 GB     88%
Q3_K_L     4.3     405.8 GB     90%
IQ4_XS     4.46    420.8 GB     92%
Q4_K_S     4.67    440.6 GB     93%
Q4_K_M     4.89    461.4 GB     94%
Q5_K_S     5.57    525.5 GB     96%
Q5_K_M     5.7     537.7 GB     96%
Q6_K       6.56    618.8 GB     97%
Q8_0       8.5     801.6 GB     100%
FP16       16      1508.5 GB    100%
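
One way to use this table is to pick the highest-precision quantization whose listed VRAM fits a given memory budget. The sketch below does that with the rows above. The function name and the example budgets are hypothetical, and it assumes the listed VRAM leaves no room for KV cache or other runtime overhead, so a real deployment would want headroom.

```python
# (quant, bits per weight, listed VRAM in GB, quality %) copied from the table above.
QUANTS = [
    ("IQ2_XXS", 2.38, 224.8, 65), ("IQ2_M", 2.93, 276.6, 75),
    ("Q2_K", 3.16, 298.3, 78), ("IQ3_XXS", 3.25, 306.8, 82),
    ("IQ3_XS", 3.5, 330.4, 84), ("Q3_K_S", 3.64, 343.6, 85),
    ("IQ3_M", 3.76, 354.9, 86), ("Q3_K_M", 4.0, 377.5, 88),
    ("Q3_K_L", 4.3, 405.8, 90), ("IQ4_XS", 4.46, 420.8, 92),
    ("Q4_K_S", 4.67, 440.6, 93), ("Q4_K_M", 4.89, 461.4, 94),
    ("Q5_K_S", 5.57, 525.5, 96), ("Q5_K_M", 5.7, 537.7, 96),
    ("Q6_K", 6.56, 618.8, 97), ("Q8_0", 8.5, 801.6, 100),
    ("FP16", 16.0, 1508.5, 100),
]


def largest_fit(budget_gb):
    """Return the row with the most bits per weight that fits the budget, or None."""
    fitting = [row for row in QUANTS if row[2] <= budget_gb]
    return max(fitting, key=lambda row: row[1]) if fitting else None


# Example budgets: 8 x 80 GB and 4 x 80 GB of total GPU memory (illustrative only).
for budget in (640, 320):
    print(budget, "GB ->", largest_fit(budget))
```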
§ 01 BENCHMARK SCORES
GPQA Diamond: 83.9
HLE: 25.6