DeepSeek-V3 684.5B

Capabilities: chat
Parameters: 684.5B (37B active)
Context length: 160K
Benchmarks: 2
Quantizations: 17
HF downloads: 928K
Architecture: MoE
Released: 2025-01-20
Layers: 61
KV Heads: 128
Head Dim: 56
Family: deepseek

Quantization Options

Quant     Bits   VRAM        Quality
IQ2_XXS   2.38   204.1 GB    low
IQ2_M     2.93   251.2 GB    low
Q2_K      3.16   270.9 GB    low
IQ3_XXS   3.25   278.6 GB    low
IQ3_XS    3.5    300.0 GB    low
Q3_K_S    3.64   311.9 GB    low
IQ3_M     3.76   322.2 GB    low
Q3_K_M    4.0    342.7 GB    low
Q3_K_L    4.3    368.4 GB    moderate
IQ4_XS    4.46   382.1 GB    moderate
Q4_K_S    4.67   400.1 GB    moderate
Q4_K_M    4.89   418.9 GB    good
Q5_K_S    5.57   477.1 GB    good
Q5_K_M    5.7    488.2 GB    good
Q6_K      6.56   561.8 GB    excellent
Q8_0      8.5    727.8 GB    lossless
FP16      16     1369.5 GB   lossless
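The VRAM figures above track a simple rule of thumb: weight memory is roughly parameter count times bits per weight divided by 8. A minimal sketch (decimal GB; the small gap versus the table comes from non-quantized tensors and loader overhead, which this estimate ignores):

```python
# Rough VRAM estimate for a quantized model: weights only,
# ignoring KV cache and runtime buffers (real usage is higher).
def estimate_vram_gb(params_billion: float, bits_per_weight: float) -> float:
    bytes_total = params_billion * 1e9 * bits_per_weight / 8
    return bytes_total / 1e9  # decimal GB, as in the table above

# DeepSeek-V3 at Q4_K_M (4.89 bpw): close to the table's 418.9 GB
print(round(estimate_vram_gb(684.5, 4.89), 1))  # → 418.4
# FP16 baseline: close to the table's 1369.5 GB
print(round(estimate_vram_gb(684.5, 16), 1))    # → 1369.0
```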


Benchmarks (2)

Arena Elo: 1373
BigCodeBench: 50.0

Run this model

Easiest way to get started:
curl -fsSL https://ollama.com/install.sh | sh
ollama run deepseek-v3:684.5b-instruct-q4_k_m

Downloads and runs automatically. Add --verbose for speed stats.
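Beyond the CLI, a running Ollama instance also serves a local REST API (by default on port 11434) that scripts can call. A minimal sketch in Python, assuming the default host and the q4_k_m tag pulled above; `build_generate_request` is a hypothetical helper for this example, not part of any library:

```python
# Query a locally running Ollama server via its /api/generate endpoint.
import json
import urllib.request

def build_generate_request(model: str, prompt: str) -> dict:
    # stream=False returns a single JSON object instead of a token stream
    return {"model": model, "prompt": prompt, "stream": False}

def generate(payload: dict, host: str = "http://localhost:11434") -> str:
    req = urllib.request.Request(
        f"{host}/api/generate",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

payload = build_generate_request(
    "deepseek-v3:684.5b-instruct-q4_k_m",
    "Explain mixture-of-experts in one sentence.",
)
# generate(payload)  # requires the Ollama server to be running
```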


DeepSeek-V3 684.5B: a 684.5B-parameter Mixture of Experts LLM

Model Specifications

Parameters: 684.5B (37B active)
Architecture: Mixture of Experts
Context Length: 160K tokens
Capabilities: chat
Release Date: 2025-01-20
Provider: DeepSeek
Family: deepseek

VRAM Requirements

Quantization   BPW    VRAM        Quality
IQ2_XXS        2.38   204.1 GB    65%
IQ2_M          2.93   251.2 GB    75%
Q2_K           3.16   270.9 GB    78%
IQ3_XXS        3.25   278.6 GB    82%
IQ3_XS         3.5    300.0 GB    84%
Q3_K_S         3.64   311.9 GB    85%
IQ3_M          3.76   322.2 GB    86%
Q3_K_M         4.0    342.7 GB    88%
Q3_K_L         4.3    368.4 GB    90%
IQ4_XS         4.46   382.1 GB    92%
Q4_K_S         4.67   400.1 GB    93%
Q4_K_M         4.89   418.9 GB    94%
Q5_K_S         5.57   477.1 GB    96%
Q5_K_M         5.7    488.2 GB    96%
Q6_K           6.56   561.8 GB    97%
Q8_0           8.5    727.8 GB    100%
FP16           16     1369.5 GB   100%

Benchmark Scores

BigCodeBench: 50.0
Arena Elo: 1373.0

How to Run DeepSeek-V3 684.5B

Run DeepSeek-V3 684.5B locally with Ollama (needs 418.9 GB VRAM at Q4_K_M):

ollama run deepseek-v3