
MiniMax-M2.5 228.7B

Capabilities: chat
Parameters: 228.7B (21B active)
Context length: 192K
Benchmarks: 7
Quantizations: 17
HF downloads: 493K
Architecture: MoE
Released: 2026-03-10
Layers: 62
KV Heads: 8
Head Dim: 128
Family: minimax
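The attention geometry above (62 layers, 8 KV heads, head dim 128) also sets the KV-cache footprint at long context. The sketch below is a back-of-the-envelope estimate, assuming an FP16 (2-byte) cache with no cache quantization; real runtimes add their own overhead.

# Rough KV-cache estimate from the spec values above (assumptions noted in the lead-in).
layers = 62
kv_heads = 8
head_dim = 128
context_tokens = 192 * 1024   # 192K context window
bytes_per_elem = 2            # FP16 cache

# K and V each store kv_heads * head_dim values per layer per token.
kv_bytes = 2 * layers * kv_heads * head_dim * context_tokens * bytes_per_elem
print(f"KV cache at full 192K context: ~{kv_bytes / 1e9:.0f} GB")  # ~50 GB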

Quantization Options

Quant      Bits   VRAM       Quality
IQ2_XXS    2.38   68.5 GB    low
IQ2_M      2.93   84.3 GB    low
Q2_K       3.16   90.8 GB    low
IQ3_XXS    3.25   93.4 GB    low
IQ3_XS     3.5    100.5 GB   low
Q3_K_S     3.64   104.5 GB   low
IQ3_M      3.76   108.0 GB   low
Q3_K_M     4      114.8 GB   low
Q3_K_L     4.3    123.4 GB   moderate
IQ4_XS     4.46   128.0 GB   moderate
Q4_K_S     4.67   134.0 GB   moderate
Q4_K_M     4.89   140.3 GB   good
Q5_K_S     5.57   159.7 GB   good
Q5_K_M     5.7    163.4 GB   good
Q6_K       6.56   188.0 GB   excellent
Q8_0       8.5    243.5 GB   lossless
FP16       16     457.9 GB   lossless
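The VRAM figures above track a simple bits-per-weight calculation. A minimal sketch, assuming the listed sizes are roughly total parameters times bits per weight plus a small fixed overhead (it ignores KV cache and runtime buffers):

params = 228.7e9   # total parameters; all experts are stored even though only 21B are active
bpw = 4.89         # Q4_K_M bits per weight from the table
weight_gb = params * bpw / 8 / 1e9
print(f"Q4_K_M weights: ~{weight_gb:.1f} GB")  # ~139.8 GB, close to the 140.3 GB listed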


Benchmarks (7)

Arena Elo: 1495
MATH: 86.3
GPQA: 85.2
GPQA Diamond: 84.8
AA Intelligence: 41.9
AA Coding: 37.4
HLE: 19.1

Run this model

Easiest way to get started (docs →):
curl -fsSL https://ollama.com/install.sh | sh
ollama run minimax:228b-q4_k_m

Downloads and runs automatically. Add --verbose for speed stats.
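Once the model is pulled, Ollama also serves a local REST API on port 11434, so you can script against it instead of using the interactive prompt. A minimal sketch using Python's requests library; the model tag mirrors the command above, so adjust it to whatever tag you actually pulled:

import requests

# Single non-streaming completion against the locally running Ollama server.
resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "minimax:228b-q4_k_m",  # tag from the ollama run command above
        "prompt": "Summarize mixture-of-experts inference in two sentences.",
        "stream": False,
    },
    timeout=600,
)
print(resp.json()["response"])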


GPUs that can run this model

At Q4_K_M quantization. Sorted by minimum VRAM.


MiniMax-M2.5 228.7B: a 228.7B-parameter Mixture of Experts LLM

Model Specifications

Parameters: 228.7B (21B active)
Architecture: Mixture of Experts
Context Length: 192K tokens
Capabilities: chat
Release Date: 2026-03-10
Provider: MiniMax
Family: minimax

VRAM Requirements

Quantization   BPW    VRAM       Quality
IQ2_XXS        2.38   68.5 GB    65%
IQ2_M          2.93   84.3 GB    75%
Q2_K           3.16   90.8 GB    78%
IQ3_XXS        3.25   93.4 GB    82%
IQ3_XS         3.5    100.5 GB   84%
Q3_K_S         3.64   104.5 GB   85%
IQ3_M          3.76   108.0 GB   86%
Q3_K_M         4      114.8 GB   88%
Q3_K_L         4.3    123.4 GB   90%
IQ4_XS         4.46   128.0 GB   92%
Q4_K_S         4.67   134.0 GB   93%
Q4_K_M         4.89   140.3 GB   94%
Q5_K_S         5.57   159.7 GB   96%
Q5_K_M         5.7    163.4 GB   96%
Q6_K           6.56   188.0 GB   97%
Q8_0           8.5    243.5 GB   100%
FP16           16     457.9 GB   100%
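One practical use of this table is picking the highest-quality quantization that fits a given VRAM budget. The quant_table list and pick_quant helper below are illustrative only; the figures are the weight sizes from the table, so leave headroom for KV cache and runtime buffers:

# (quant name, VRAM in GB) from the table above, best quality first.
quant_table = [
    ("FP16", 457.9), ("Q8_0", 243.5), ("Q6_K", 188.0),
    ("Q5_K_M", 163.4), ("Q5_K_S", 159.7), ("Q4_K_M", 140.3),
    ("Q4_K_S", 134.0), ("IQ4_XS", 128.0), ("Q3_K_L", 123.4),
    ("Q3_K_M", 114.8), ("IQ3_M", 108.0), ("Q3_K_S", 104.5),
    ("IQ3_XS", 100.5), ("IQ3_XXS", 93.4), ("Q2_K", 90.8),
    ("IQ2_M", 84.3), ("IQ2_XXS", 68.5),
]

def pick_quant(vram_budget_gb):
    """Return the best quant whose weights fit within the budget, else None."""
    for name, vram_gb in quant_table:
        if vram_gb <= vram_budget_gb:
            return name
    return None

print(pick_quant(160))  # e.g. two 80 GB GPUs -> Q5_K_S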

Benchmark Scores

MATH: 86.3
GPQA: 85.2
Arena Elo: 1495.0
GPQA Diamond: 84.8
HLE: 19.1
AA Intelligence: 41.9
AA Coding: 37.4

How to Run MiniMax-M2.5 228.7B

Run MiniMax-M2.5 228.7B locally with Ollama (needs 140.3 GB VRAM at Q4_K_M):

ollama run minimax:228b

Compatible GPUs (16)

GPUs that can run MiniMax-M2.5 228.7B at Q4_K_M quantization: