Cogito 70B

Capabilities: chat, reasoning, coding, tool_use
Parameters: 70B
Context length: 125K
Benchmarks: 0
Quantizations: 16
Architecture: Dense
Released: 2025-04-01
Layers: 80
KV Heads: 8
Head Dim: 128
Family: cogito
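
The layer count, KV head count, and head dimension above determine how much memory the key/value cache adds on top of the weights. A minimal sketch of the usual estimate follows, assuming an FP16 cache (2 bytes per element) and no cache quantization or allocator overhead. The function name is illustrative, and real runtimes may allocate differently.

```python
def kv_cache_bytes(tokens, layers=80, kv_heads=8, head_dim=128, bytes_per_elem=2):
    """Estimate KV-cache size: one key and one value vector per layer, per KV head, per token."""
    per_token = 2 * layers * kv_heads * head_dim * bytes_per_elem  # 2 = key + value
    return tokens * per_token


# Print the estimate for a few context sizes (125000 stands in for the listed 125K).
for ctx in (4096, 32768, 125000):
    gib = kv_cache_bytes(ctx) / 2**30
    print(f"{ctx:>7} tokens: {gib:.2f} GiB")
```

Under these assumptions each token costs 320 KiB, so a 4,096-token context adds about 1.25 GiB on top of the weights.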

Quantization Options

Quant     Bits   VRAM       Quality
IQ2_M     2.93   26.1 GB    low
Q2_K      3.16   28.1 GB    low
IQ3_XXS   3.25   28.9 GB    low
IQ3_XS    3.5    31.1 GB    low
Q3_K_S    3.64   32.3 GB    low
IQ3_M     3.76   33.4 GB    low
Q3_K_M    4      35.5 GB    low
Q3_K_L    4.3    38.1 GB    moderate
IQ4_XS    4.46   39.5 GB    moderate
Q4_K_S    4.67   41.4 GB    moderate
Q4_K_M    4.89   43.3 GB    good
Q5_K_S    5.57   49.2 GB    good
Q5_K_M    5.7    50.4 GB    good
Q6_K      6.56   57.9 GB    excellent
Q8_0      8.5    74.9 GB    lossless
FP16      16     140.5 GB   lossless
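
The VRAM figures track the bits-per-weight figures closely. As a rough sketch, weight memory is parameters × bits / 8, and the values in this table are consistent (to within about 0.1 GB) with that plus roughly 0.5 GB of overhead. The 0.5 GB constant is inferred from the table rather than taken from a documented formula, and the estimate excludes KV cache for long contexts.

```python
def estimate_vram_gb(bits_per_weight, params_billion=70.0, overhead_gb=0.5):
    """Weights in GB (1 GB = 1e9 bytes) plus a fixed overhead (assumed, fitted to the table)."""
    weights_gb = params_billion * bits_per_weight / 8
    return weights_gb + overhead_gb


# Compare a few rows against the table's VRAM column.
for name, bits in [("Q4_K_M", 4.89), ("Q8_0", 8.5), ("FP16", 16)]:
    print(f"{name}: {estimate_vram_gb(bits):.1f} GB")
```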

Run this model

Easiest way to get started
$ curl -fsSL https://ollama.com/install.sh | sh
$ ollama run cogito:70b

Downloads and runs automatically. Add --verbose for speed stats.

GPUs that can run this model

At Q4_K_M quantization. Sorted by minimum VRAM.

GPU                              VRAM    Bandwidth   Vendor   Price
Apple M3 Max (48GB)              48 GB   400 GB/s    APPLE    $2899
Apple M4 Pro (48GB)              48 GB   273 GB/s    APPLE    $1799
Apple M4 Max (48GB)              48 GB   546 GB/s    APPLE    $2499
NVIDIA L40S 48GB                 48 GB   864 GB/s    NVIDIA   $7500
NVIDIA L40 48GB                  48 GB   864 GB/s    NVIDIA   $5500
NVIDIA RTX 6000 Ada 48GB         48 GB   960 GB/s    NVIDIA   $6800
NVIDIA A40 48GB                  48 GB   696 GB/s    NVIDIA   $4650
NVIDIA RTX A6000 48GB            48 GB   768 GB/s    NVIDIA   $4650
NVIDIA Quadro RTX 8000           48 GB   672 GB/s    NVIDIA   -
NVIDIA Quadro RTX 8000 Passive   48 GB   624 GB/s    NVIDIA   -
NVIDIA A40 PCIe                  48 GB   696 GB/s    NVIDIA   -
NVIDIA RTX 6000 Ada Generation   48 GB   960 GB/s    NVIDIA   $6800
NVIDIA L20                       48 GB   864 GB/s    NVIDIA   -
AMD Radeon PRO W7800 48 GB       48 GB   864 GB/s    AMD      $3499
AMD Radeon PRO W7900             48 GB   864 GB/s    AMD      $3999
Intel Data Center GPU Max 1100   48 GB   1230 GB/s   INTEL    -
NVIDIA RTX 5880 Ada Generation   48 GB   864 GB/s    NVIDIA   $5500
NVIDIA RTX PRO 5000 Blackwell    48 GB   1340 GB/s   NVIDIA   $4999
AMD Radeon PRO W7900D            48 GB   864 GB/s    AMD      $3999
NVIDIA GRID A100B                48 GB   1870 GB/s   NVIDIA   -
NVIDIA RTX A6000                 48 GB   768 GB/s    NVIDIA   $4650
NVIDIA L40                       48 GB   864 GB/s    NVIDIA   $7000
NVIDIA L40S                      48 GB   864 GB/s    NVIDIA   $8000
Apple M5 Pro (48GB)              48 GB   200 GB/s    APPLE    -
Apple M5 Max (48GB)              48 GB   614 GB/s    APPLE    -
Apple M1 Ultra (64GB)            64 GB   800 GB/s    APPLE    $2499
Apple M2 Ultra (64GB)            64 GB   800 GB/s    APPLE    $2999
Apple M4 Max (64GB)              64 GB   546 GB/s    APPLE    $2899
Apple M2 Max (64GB)              64 GB   400 GB/s    APPLE    $2299
Apple M3 Max (64GB)              64 GB   300 GB/s    APPLE    $2799
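
Memory bandwidth is listed next to each GPU because single-stream token generation is usually bandwidth-bound: each generated token requires reading roughly all the weights once. A back-of-the-envelope sketch of the resulting upper bound follows, assuming the 43.3 GB Q4_K_M footprint and ignoring KV-cache reads, compute limits, and software overhead, so real throughput will be lower. The function name is illustrative.

```python
def max_decode_tokens_per_s(bandwidth_gb_s, model_gb=43.3):
    """Upper bound on tokens/s when every token streams the full weights from memory."""
    return bandwidth_gb_s / model_gb


# Bandwidth figures copied from the GPU list above.
for name, bw in [
    ("Apple M4 Pro (48GB)", 273),
    ("NVIDIA RTX A6000", 768),
    ("NVIDIA RTX 6000 Ada Generation", 960),
]:
    print(f"{name}: at most ~{max_decode_tokens_per_s(bw):.1f} tokens/s")
```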

Cogito 70B: 70B Parameter Dense LLM

Model Specifications

Parameters: 70B
Architecture: Dense Transformer
Context Length: 125K tokens
Capabilities: chat, reasoning, coding, tool_use
Release Date: 2025-04-01
Family: cogito

VRAM Requirements

Quantization   BPW    VRAM       Quality
IQ2_M          2.93   26.1 GB    75%
Q2_K           3.16   28.1 GB    78%
IQ3_XXS        3.25   28.9 GB    82%
IQ3_XS         3.5    31.1 GB    84%
Q3_K_S         3.64   32.3 GB    85%
IQ3_M          3.76   33.4 GB    86%
Q3_K_M         4      35.5 GB    88%
Q3_K_L         4.3    38.1 GB    90%
IQ4_XS         4.46   39.5 GB    92%
Q4_K_S         4.67   41.4 GB    93%
Q4_K_M         4.89   43.3 GB    94%
Q5_K_S         5.57   49.2 GB    96%
Q5_K_M         5.7    50.4 GB    96%
Q6_K           6.56   57.9 GB    97%
Q8_0           8.5    74.9 GB    100%
FP16           16     140.5 GB   100%
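
Choosing a quantization from these figures amounts to taking the highest-quality entry whose VRAM requirement fits the card. A small sketch of that lookup follows, with the data copied from the table above; the function name and the optional headroom parameter (space reserved for KV cache and runtime buffers) are illustrative assumptions.

```python
# (name, bits per weight, VRAM in GB, quality %) copied from the table above
QUANTS = [
    ("IQ2_M", 2.93, 26.1, 75), ("Q2_K", 3.16, 28.1, 78),
    ("IQ3_XXS", 3.25, 28.9, 82), ("IQ3_XS", 3.5, 31.1, 84),
    ("Q3_K_S", 3.64, 32.3, 85), ("IQ3_M", 3.76, 33.4, 86),
    ("Q3_K_M", 4.0, 35.5, 88), ("Q3_K_L", 4.3, 38.1, 90),
    ("IQ4_XS", 4.46, 39.5, 92), ("Q4_K_S", 4.67, 41.4, 93),
    ("Q4_K_M", 4.89, 43.3, 94), ("Q5_K_S", 5.57, 49.2, 96),
    ("Q5_K_M", 5.7, 50.4, 96), ("Q6_K", 6.56, 57.9, 97),
    ("Q8_0", 8.5, 74.9, 100), ("FP16", 16.0, 140.5, 100),
]


def best_quant_for(vram_gb, headroom_gb=0.0):
    """Return the highest-quality row that fits (smallest VRAM on ties), or None."""
    fitting = [q for q in QUANTS if q[2] + headroom_gb <= vram_gb]
    if not fitting:
        return None
    return max(fitting, key=lambda q: (q[3], -q[2]))


for vram in (24, 48, 64):
    choice = best_quant_for(vram)
    print(vram, "GB ->", choice[0] if choice else "no quantization fits")
```

With no headroom reserved, a 48 GB card lands on Q4_K_M and a 64 GB card on Q6_K, while a 24 GB card fits none of the listed options.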

How to Run Cogito 70B

Run Cogito 70B locally with Ollama (needs 43.3 GB VRAM at Q4_K_M):

ollama run cogito:70b
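
Once the model is pulled, Ollama also serves a local HTTP API. A minimal sketch of a non-streaming request to its /api/generate endpoint follows, assuming the server is running at the default address http://localhost:11434. The helper names and the example prompt are illustrative.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # default local address (assumed)


def build_request(prompt, model="cogito:70b"):
    """Build a non-streaming generate request for a local Ollama server."""
    body = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode("utf-8")
    return urllib.request.Request(
        OLLAMA_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt, model="cogito:70b"):
    """Send the request and return the generated text (requires a running server)."""
    with urllib.request.urlopen(build_request(prompt, model)) as resp:
        return json.loads(resp.read())["response"]


# Example (needs a running Ollama server with the model already pulled):
# print(generate("Explain KV caching in one sentence."))
```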

Compatible GPUs (30)

GPUs that can run Cogito 70B at Q4_K_M quantization: