InternVL3 78B

Name: InternVL3 78B
Author: Shanghai AI Lab

We introduce InternVL3, an advanced multimodal large language model (MLLM) series that demonstrates superior overall performance.

Tags: chat, vision, reasoning

Parameters: 78B

Context length: 32K

Architecture: Dense

Released: 2025-04-15

Layers: (not listed)

KV Heads: (not listed)

Head Dim: 128

Family: other

Quantization Options

Quant	Bits per weight	VRAM	Quality
IQ2_M	2.93	29.1 GB	low
Q2_K	3.16	31.3 GB	low
IQ3_XXS	3.25	32.2 GB	low
IQ3_XS	3.5	34.6 GB	low
Q3_K_S	3.64	36.0 GB	low
IQ3_M	3.76	37.1 GB	low
Q3_K_M	4	39.5 GB	low
Q3_K_L	4.3	42.4 GB	moderate
IQ4_XS	4.46	44.0 GB	moderate
Q4_K_S	4.67	46.0 GB	moderate
Q4_K_M	4.89	48.2 GB	good
Q5_K_S	5.57	54.8 GB	good
Q5_K_M	5.7	56.1 GB	good
Q6_K	6.56	64.4 GB	excellent
Q8_0	8.5	83.4 GB	lossless
FP16	16	156.5 GB	lossless
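
The VRAM column can be sanity-checked against the parameter count. The sketch below assumes the weights-only footprint is parameters × bits per weight / 8; against the rows above, the listed figures come out roughly 0.5 GB higher, which presumably covers runtime overhead, although the page does not say so. The helper names `weight_gb` and `largest_fit` are hypothetical, and the fit check uses the listed VRAM only, so it does not account for any extra memory a long context may need.

```python
PARAMS_BILLION = 78  # parameter count from this page

# (quant, bits per weight, listed VRAM in GB), copied from the table above
TABLE = [
    ("IQ2_M", 2.93, 29.1), ("Q2_K", 3.16, 31.3), ("IQ3_XXS", 3.25, 32.2),
    ("IQ3_XS", 3.5, 34.6), ("Q3_K_S", 3.64, 36.0), ("IQ3_M", 3.76, 37.1),
    ("Q3_K_M", 4, 39.5), ("Q3_K_L", 4.3, 42.4), ("IQ4_XS", 4.46, 44.0),
    ("Q4_K_S", 4.67, 46.0), ("Q4_K_M", 4.89, 48.2), ("Q5_K_S", 5.57, 54.8),
    ("Q5_K_M", 5.7, 56.1), ("Q6_K", 6.56, 64.4), ("Q8_0", 8.5, 83.4),
    ("FP16", 16, 156.5),
]


def weight_gb(bits_per_weight, params_billion=PARAMS_BILLION):
    """Weights-only footprint in GB (10^9 bytes): params * bits / 8."""
    return params_billion * bits_per_weight / 8


def largest_fit(vram_gb):
    """Return the highest-bit quant whose listed VRAM fits, or None."""
    fitting = [row for row in TABLE if row[2] <= vram_gb]
    return max(fitting, key=lambda row: row[1])[0] if fitting else None


if __name__ == "__main__":
    for name, bits, listed in TABLE:
        print(f"{name:8s} weights {weight_gb(bits):6.1f} GB, listed {listed:6.1f} GB")
    print("Best fit for 48 GB:", largest_fit(48))
```

With 48 GB of VRAM, for instance, Q4_K_M (48.2 GB) narrowly misses and the check falls back to Q4_K_S.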

Benchmarks

MMBench: 89.0

MMMU: 72.2

Run this model

Easiest way to get started (beginners): install Ollama, then run the model.

curl -fsSL https://ollama.com/install.sh | sh

ollama run other:78b-q4_K_M

The tag may need adjustment; check ollama.com/library/other for available tags.
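
Once a model is available locally, it can also be queried from a script. This is a minimal sketch, assuming a local Ollama server on its default port (11434) and its `/api/generate` endpoint. The tag `other:78b-q4_K_M` is copied from the command above and may need adjustment, and the helper names `build_request` and `generate` are hypothetical.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # default local endpoint (assumed)
MODEL_TAG = "other:78b-q4_K_M"  # placeholder tag from this page; verify before use


def build_request(prompt, model=MODEL_TAG, url=OLLAMA_URL):
    """Build a non-streaming generate request for a local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt, model=MODEL_TAG):
    """Send the prompt and return the generated text (needs a running server)."""
    with urllib.request.urlopen(build_request(prompt, model)) as resp:
        return json.loads(resp.read())["response"]
```

Calling `generate("Describe this model in one sentence.")` returns the model's reply as a string, provided the server is running and the tag resolves.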

Setup guide

Auto-setup with fitmyllm CLI

The fitmyllm CLI detects your GPU, recommends the best model, downloads it, and starts a chat with no configuration. It also benchmarks your speed and contributes anonymous data to improve predictions.

pip install fitmyllm

fitmyllm

Features: GPU auto-detection, live tok/s in chat, speed benchmarks, 9 inference engines.