Google/Dense

GoogleGemma 3 4B

Gemma 3 4B — Google's latest with vision and reasoning. Punches above its weight.

chatreasoningvisionThinking
4.3B
Parameters
128K
Context length
21
Benchmarks
6
Quantizations
1.0M
HF downloads
Architecture
Dense
Released
2025-03-12
Layers
34
KV Heads
4
Head Dim
256
Family
gemma

Quantization Options

QuantBitsVRAMQuality
Q4_K_M4.893.1 GBgood
Q5_K_S5.573.5 GBgood
Q5_K_M5.73.6 GBgood
Q6_K6.564.0 GBexcellent
Q8_08.55.1 GBlossless
FP16169.1 GBlossless

Select your GPU above to see speed estimates and compatibility for each quantization.

READY TO RUN THIS?RENT BY THE HOUR

RENT A GPU AND RUN GEMMA 3 4B NOW

Spin up an A100 / H100 / 4090 in ~60s. Pay by the second. Cancel anytime.

Community Ratings

Loading ratings...

Benchmarks (21)

Arena Elo1291
MATH-50076.6
IFEval69.5
MMMU48.8
MATH42.5
MMLU-PRO36.2
BBH34.2
GPQA Diamond29.1
IFBench28.3
AIME12.7
AA Math12.7
MUSR11.4
LiveCodeBench11.2
GPQA10.1
SciCode7.3
AA Intelligence6.3
AA Long Context5.7
HLE5.2
τ²-Bench5.0
AA Coding2.9
Terminal-Bench0.8

Run this model

Easiest way to get started·Beginners
DOCS ↗
curl -fsSL https://ollama.com/install.sh | sh
$ollama run gemma3:4b-q4_K_M

Downloads and runs automatically. Add --verbose for speed stats.

▸ SETUP GUIDE
>_

Auto-setup with fitmyllm CLI

Detects your GPU, recommends the best model, downloads it, and starts chatting — zero config. Benchmarks your speed and contributes anonymous data to improve predictions.

pip install fitmyllmthen run fitmyllmLearn more
Auto-detect GPULive tok/s in chatSpeed benchmarks9 inference engines

GPUs that can run this model

At Q4_K_M quantization. Sorted by minimum VRAM.

NVIDIA Tesla C1080
4 GB VRAM • 102 GB/s
NVIDIA
NVIDIA Tesla K10
4 GB VRAM • 160 GB/s
NVIDIA
NVIDIA Tesla M4
4 GB VRAM • 88 GB/s
NVIDIA
AMD Radeon Instinct MI8
4 GB VRAM • 512 GB/s
AMD
NVIDIA RTX A1000 Embedded
4 GB VRAM • 224 GB/s
NVIDIA
NVIDIA RTX A2000 Embedded
4 GB VRAM • 192 GB/s
NVIDIA
Intel Arc A310
4 GB VRAM • 124 GB/s
INTEL
$79
Intel Arc A350
4 GB VRAM • 124 GB/s
INTEL
$99

Find the best GPU for Gemma 3 4B

Build Hardware for Gemma 3 4B

Gemma 3 4B — Google's latest with vision and reasoning. Punches above its weight.

▸ SPEC SHEET

Gemma 3 4B4.3B Dense.

▸ SPECIFICATIONS
PARAMETERS
4.3B
ARCHITECTURE
Dense Transformer
CONTEXT LENGTH
128K tokens
CAPABILITIES
chat, reasoning, vision
RELEASE DATE
2025-03-12
PROVIDER
Google
FAMILY
gemma
▸ VRAM REQUIREMENTS
QUANTBPWVRAMQUALITY
Q4_K_M4.893.1 GB94%
Q5_K_S5.573.5 GB96%
Q5_K_M5.73.6 GB96%
Q6_K6.564.0 GB97%
Q8_08.55.1 GB100%
FP16169.1 GB100%
§ 01BENCHMARK SCORES
MMLU-PRO36.2
MATH42.5
IFEval69.5
BBH34.2
MMMU48.8
GPQA10.1
MUSR11.4
Arena Elo1291.0
GPQA Diamond29.1
LiveCodeBench11.2
AIME12.7
MATH-50076.6
HLE5.2
AA Intelligence6.3
AA Coding2.9
AA Math12.7
aa_ifbench28.3
aa_terminal_bench0.8
aa_tau25.0
aa_scicode7.3
aa_lcr5.7
§ 02RUN COMMAND

Run Gemma 3 4B locally with Ollama — needs 3.1 GB VRAM at Q4_K_M:

$ollama run gemma3:4b