
HuggingFace SmolVLM 500M

SmolVLM-500M is a tiny multimodal model and a member of the SmolVLM family. It accepts arbitrary sequences of image and text inputs and produces text outputs. It is designed for efficiency.

Capabilities: chat, vision
Parameters: 0.5B
Context length: 8K
Benchmarks: 0
Quantizations: 6
Architecture: Dense
Released: 2024-11-26
Layers: 32
KV Heads: 5
Head Dim: 64
Family: smollm
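
The layer count, KV head count, and head dimension listed above are enough to estimate how much memory the KV cache needs at full context. The following is a minimal sketch of that arithmetic, not a measurement. It assumes that "8K" means 8192 tokens and that cached keys and values are stored in FP16 (2 bytes each); the function name `kv_cache_bytes` is made up for this example.

```python
def kv_cache_bytes(layers: int, kv_heads: int, head_dim: int,
                   context_tokens: int, bytes_per_value: int = 2) -> int:
    """Bytes needed to cache keys and values for every layer at full context."""
    # The factor 2 accounts for the separate key tensor and value tensor.
    return 2 * layers * kv_heads * head_dim * context_tokens * bytes_per_value


# Figures from this page; 8192 tokens and FP16 storage are assumptions.
size = kv_cache_bytes(layers=32, kv_heads=5, head_dim=64, context_tokens=8192)
print(f"{size / 2**20:.0f} MiB")  # prints "320 MiB"
```

Shorter prompts use proportionally less, since the cache grows linearly with the number of tokens held in context.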

Quantization Options

Quant    Bits per weight   VRAM     Quality
Q4_K_M   4.89              0.8 GB   good
Q5_K_S   5.57              0.8 GB   good
Q5_K_M   5.7               0.8 GB   good
Q6_K     6.56              0.9 GB   excellent
Q8_0     8.5               1.0 GB   lossless
FP16     16                1.5 GB   lossless
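
For orientation, the memory taken by the weights alone follows from the parameter count and the bits per weight. The sketch below uses the rounded 0.5B parameter figure from this page and decimal gigabytes; both are assumptions, and `weight_gb` is an illustrative helper. The VRAM column above is consistently higher than these weight-only numbers, presumably because it also budgets for the KV cache and runtime buffers, although the page does not say how it was derived.

```python
PARAMS = 0.5e9  # rounded parameter count from this page

# Bits per weight for each quantization, as listed in the table.
BITS_PER_WEIGHT = {
    "Q4_K_M": 4.89,
    "Q5_K_S": 5.57,
    "Q5_K_M": 5.7,
    "Q6_K": 6.56,
    "Q8_0": 8.5,
    "FP16": 16,
}


def weight_gb(params: float, bits_per_weight: float) -> float:
    """Weight-only footprint in decimal gigabytes (1 GB = 1e9 bytes)."""
    return params * bits_per_weight / 8 / 1e9


for quant, bits in BITS_PER_WEIGHT.items():
    print(f"{quant:7s} {weight_gb(PARAMS, bits):.2f} GB")
```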

Run this model

Easiest way to get started (beginners):

$ curl -fsSL https://ollama.com/install.sh | sh
$ ollama run smollm:0.5b-q4_K_M

The tag may need adjustment: check ollama.com/library/smollm for the available tags, and confirm that the one you pull supports image input.
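
Once a model is pulled, Ollama also serves a local REST API, which is convenient for sending an image along with a prompt. The sketch below builds a request for the `/api/generate` endpoint using only the standard library. It assumes a server on Ollama's default port 11434, and it reuses the tag from the command above, which may not exist or may not accept images. The helper names `build_payload` and `generate` are made up for this example.

```python
import base64
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # default local endpoint (assumed)


def build_payload(model, prompt, image_bytes=None):
    """Build the JSON body for a single non-streaming generation request."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    if image_bytes is not None:
        # Images are passed as a list of base64-encoded strings.
        payload["images"] = [base64.b64encode(image_bytes).decode("ascii")]
    return payload


def generate(payload, url=OLLAMA_URL):
    """POST the payload to a running Ollama server and return the generated text."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]


# The tag below is copied from this page and may need adjustment.
payload = build_payload("smollm:0.5b-q4_K_M", "Describe this image.")
print(json.dumps(payload, indent=2))
```

With a server running, `generate(payload)` returns the model's reply as a string. To include a picture, read the file in binary mode and pass its bytes as `image_bytes`.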

▸ SETUP GUIDE

Auto-setup with fitmyllm CLI

Detects your GPU, recommends the best model, downloads it, and starts chatting with zero configuration. It also benchmarks your speed and contributes anonymous data to improve predictions.

$ pip install fitmyllm
$ fitmyllm

Features: auto-detect GPU, live tok/s in chat, speed benchmarks, 9 inference engines.

▸ SPEC SHEET

SmolVLM 500M: 0.5B Dense.

▸ SPECIFICATIONS
PARAMETERS: 0.5B
ARCHITECTURE: Dense Transformer
CONTEXT LENGTH: 8K tokens
CAPABILITIES: chat, vision
RELEASE DATE: 2024-11-26
PROVIDER: HuggingFace
FAMILY: smollm
▸ VRAM REQUIREMENTS
QUANT    BPW    VRAM     QUALITY
Q4_K_M   4.89   0.8 GB   94%
Q5_K_S   5.57   0.8 GB   96%
Q5_K_M   5.7    0.8 GB   96%
Q6_K     6.56   0.9 GB   97%
Q8_0     8.5    1.0 GB   100%
FP16     16     1.5 GB   100%
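
One way to use this table is to pick the highest-quality quantization that fits a given VRAM budget. The sketch below encodes the rows above and breaks quality ties in favor of the smaller footprint. The function `best_quant` is a made-up helper, and the selection rule is an illustrative choice rather than anything this page prescribes.

```python
# (quant, VRAM in GB, quality in percent), copied from the table above.
QUANTS = [
    ("Q4_K_M", 0.8, 94),
    ("Q5_K_S", 0.8, 96),
    ("Q5_K_M", 0.8, 96),
    ("Q6_K", 0.9, 97),
    ("Q8_0", 1.0, 100),
    ("FP16", 1.5, 100),
]


def best_quant(vram_gb):
    """Return the highest-quality quant that fits in vram_gb, or None if none fit."""
    fitting = [q for q in QUANTS if q[1] <= vram_gb]
    if not fitting:
        return None
    # Prefer higher quality; among equal quality, prefer the lower VRAM figure.
    return max(fitting, key=lambda q: (q[2], -q[1]))[0]


print(best_quant(2.0))  # prints "Q8_0"
print(best_quant(0.9))  # prints "Q6_K"
print(best_quant(0.5))  # prints "None"
```

In practice it is worth leaving some headroom below the card's nominal VRAM, since other processes and the display may already be using part of it.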