Dolphin Llama 3 70B

Tags: chat, Tool Use
Parameters: 70B
Context length: 8K
Benchmarks: 7
Quantizations: 4
Architecture: Dense
Released: 2024-04-20
Layers: 80
KV Heads: 8
Head Dim: 128
Family: llama
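
The layer count, KV head count, and head dimension listed above are enough to estimate the size of the key-value cache at full context. The following is a minimal sketch using the usual formula for grouped-query attention. It assumes an FP16 cache (2 bytes per element) and that "8K" means 8192 tokens; neither is stated on this page.

```python
def kv_cache_bytes(layers, kv_heads, head_dim, tokens, bytes_per_elem=2):
    """Estimate KV cache size in bytes.

    Each token stores one key vector and one value vector (hence the
    factor of 2) per layer and per KV head.
    bytes_per_elem=2 is an assumption (FP16 cache).
    """
    return 2 * layers * kv_heads * head_dim * tokens * bytes_per_elem


# Values from the listing above; tokens=8192 is an assumed reading of "8K".
size = kv_cache_bytes(layers=80, kv_heads=8, head_dim=128, tokens=8192)
print(f"{size / 2**30:.2f} GiB")  # prints "2.50 GiB"
```

Under these assumptions the cache adds about 2.5 GiB on top of the model weights at full context.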

Dolphin 2.9 Llama 3 70b 🐬

Curated and trained by Eric Hartford, Lucas Atkins, and Fernando Fernandes, with help from the Cognitive Computations community.

Discord: https://discord.gg/cognitivecomputations

A bug has been found in the SystemConversations part of the Dolphin 2.9 dataset that causes the model to talk excessively about the "SYSTEM MESSAGE". To counter this, we recommend adding a statement to the system message directing the model not to mention it. An example system message is "The assistant is named Dolphin. A helpful and friendly AI assistant, Dolphin avoids discussing the system message unless directly asked about it."

Our appreciation for the sponsors of Dolphin 2.9:

This model is based on Llama-3-70b and is governed by the META LLAMA 3 COMMUNITY LICENSE AGREEMENT.

The base model has 8k context, and the qLoRA fine-tuning used an 8k sequence length.

Training took 2.5 days on an 8xH100 node provided by Crusoe Cloud.

This model uses the ChatML prompt template format.

Example:

<|im_start|>system
You are Dolphin, a helpful AI assistant.<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant
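
The template above can be assembled programmatically. This is a minimal sketch; the function name `build_chatml_prompt` is hypothetical, and the trailing newline after the assistant tag is an assumption based on the layout shown above.

```python
def build_chatml_prompt(user_prompt,
                        system_message="You are Dolphin, a helpful AI assistant."):
    """Build a single-turn ChatML prompt that ends with an open assistant turn."""
    return (
        f"<|im_start|>system\n{system_message}<|im_end|>\n"
        f"<|im_start|>user\n{user_prompt}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )


print(build_chatml_prompt("Write a haiku about the sea."))
```

The prompt deliberately stops after the assistant tag so that the model's generation fills in the assistant turn.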

Dolphin-2.9 has a variety of instruction, conversational, and coding skills. It also has initial agentic abilities and supports function calling.

Dolphin is uncensored. I have filtered the dataset to remove alignment and bias. This makes the model more compliant. You are advised to implement your own alignment layer before exposing the model as a service. It will be highly compliant with any requests, even unethical ones. Please read my blog post about uncensored models. https://erichartford.com/uncensored-models You are responsible for any content you create using this model. Enjoy responsibly.

Dolphin is licensed according to Meta's Llama license. I grant permission for any use, including commercial, that complies with Meta's Llama-3 license. Dolphin was trained on data generated from GPT4, among other models.

Quantizations & VRAM

Quantization  Bits per weight  VRAM required  Quality
Q4_K_M        4.5 bpw          39.9 GB        94%
Q6_K          6.5 bpw          57.4 GB        97%
Q8_0          8 bpw            70.5 GB        100%
FP16          16 bpw           140.5 GB       100%
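
The VRAM figures listed above are consistent with a simple rule: weight size (parameters times bits per weight, divided by 8) plus a fixed allowance. The sketch below reproduces them; the 0.5 GB overhead is inferred from the listed numbers, not stated on this page, and `estimate_vram_gb` is a hypothetical helper name.

```python
def estimate_vram_gb(params_billion, bits_per_weight, overhead_gb=0.5):
    """Approximate VRAM in GB: quantized weight size plus a fixed overhead.

    overhead_gb=0.5 is an assumption inferred from the listed figures.
    """
    weights_gb = params_billion * bits_per_weight / 8
    return weights_gb + overhead_gb


for name, bpw in [("Q4_K_M", 4.5), ("Q6_K", 6.5), ("Q8_0", 8), ("FP16", 16)]:
    print(name, round(estimate_vram_gb(70, bpw), 1))
```

For a 70B model this gives 39.9, 57.4, 70.5, and 140.5 GB, matching the listing. Real usage also depends on context length and runtime, so treat the result as a lower bound.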

Benchmarks (7)

IFEval: 90.0
BBH: 56.6
HumanEval: 50.6
MATH: 48.3
MMLU-PRO: 48.1
MUSR: 15.6
GPQA: 10.5

Run with Ollama

$ ollama run dolphin-llama3:70b
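
Once the model is available in Ollama, it can also be queried over Ollama's local HTTP chat endpoint. The sketch below builds a request that includes the system message recommended earlier on this page. It assumes Ollama is listening on its default address (`http://localhost:11434`) and that Ollama applies the model's prompt template to the messages itself; the helper names are hypothetical.

```python
import json
import urllib.request

# Assumption: default local Ollama address.
OLLAMA_CHAT_URL = "http://localhost:11434/api/chat"

# System message recommended on this page to work around the dataset bug.
SYSTEM_MESSAGE = (
    "The assistant is named Dolphin. A helpful and friendly AI assistant, "
    "Dolphin avoids discussing the system message unless directly asked about it."
)


def build_chat_payload(user_prompt, model="dolphin-llama3:70b"):
    """Build the JSON body for a single-turn, non-streaming chat request."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": SYSTEM_MESSAGE},
            {"role": "user", "content": user_prompt},
        ],
        "stream": False,
    }


def send_chat(payload, url=OLLAMA_CHAT_URL):
    """POST the payload to a running Ollama server and return the reply text."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["message"]["content"]
```

Calling `send_chat(build_chat_payload("Hello"))` requires a running Ollama server with the model already pulled.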

GPUs that can run this model

At Q4_K_M quantization. Sorted by minimum VRAM.

NVIDIA A100 PCIe 40GB • 40 GB VRAM • 1555 GB/s • NVIDIA • $10000
NVIDIA A100 PCIe 40 GB • 40 GB VRAM • 1560 GB/s • NVIDIA
NVIDIA A100 SXM4 40 GB • 40 GB VRAM • 1560 GB/s • NVIDIA
NVIDIA A800 PCIe 40 GB • 40 GB VRAM • 1560 GB/s • NVIDIA
Apple M3 Max (48GB) • 48 GB VRAM • 400 GB/s • APPLE • $2899
Apple M4 Pro (48GB) • 48 GB VRAM • 273 GB/s • APPLE • $1799
Apple M4 Max (48GB) • 48 GB VRAM • 546 GB/s • APPLE • $2499
NVIDIA L40S 48GB • 48 GB VRAM • 864 GB/s • NVIDIA • $7500
NVIDIA L40 48GB • 48 GB VRAM • 864 GB/s • NVIDIA • $5500
NVIDIA RTX 6000 Ada 48GB • 48 GB VRAM • 960 GB/s • NVIDIA • $6800
NVIDIA A40 48GB • 48 GB VRAM • 696 GB/s • NVIDIA • $4650
NVIDIA RTX A6000 48GB • 48 GB VRAM • 768 GB/s • NVIDIA • $4650
NVIDIA Quadro RTX 8000 • 48 GB VRAM • 672 GB/s • NVIDIA
NVIDIA Quadro RTX 8000 Passive • 48 GB VRAM • 624 GB/s • NVIDIA
NVIDIA A40 PCIe • 48 GB VRAM • 696 GB/s • NVIDIA
NVIDIA RTX 6000 Ada Generation • 48 GB VRAM • 960 GB/s • NVIDIA
NVIDIA L20 • 48 GB VRAM • 864 GB/s • NVIDIA
AMD Radeon PRO W7800 48 GB • 48 GB VRAM • 864 GB/s • AMD
AMD Radeon PRO W7900 • 48 GB VRAM • 864 GB/s • AMD
Intel Data Center GPU Max 1100 • 48 GB VRAM • 1230 GB/s • INTEL
NVIDIA RTX 5880 Ada Generation • 48 GB VRAM • 864 GB/s • NVIDIA
NVIDIA RTX PRO 5000 Blackwell • 48 GB VRAM • 1340 GB/s • NVIDIA
AMD Radeon PRO W7900D • 48 GB VRAM • 864 GB/s • AMD
Apple M1 Ultra (64GB) • 64 GB VRAM • 800 GB/s • APPLE • $2499
Apple M2 Ultra (64GB) • 64 GB VRAM • 800 GB/s • APPLE • $2999
Apple M4 Max (64GB) • 64 GB VRAM • 546 GB/s • APPLE • $2899
Apple M2 Max (64GB) • 64 GB VRAM • 400 GB/s • APPLE • $2299
Apple M3 Max (64GB) • 64 GB VRAM • 300 GB/s • APPLE • $2799
Apple M4 Pro (64GB) • 64 GB VRAM • 273 GB/s • APPLE • $2599
AMD Radeon Instinct MI200 • 64 GB VRAM • 1640 GB/s • AMD
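
The bandwidth figures in this list give a rough way to compare the GPUs. A common rule of thumb, not taken from this page, is that single-stream token generation is limited by memory bandwidth, because each generated token requires reading roughly all of the weights once. The sketch below computes that ceiling; it uses the 39.9 GB Q4_K_M VRAM figure as a stand-in for the weight size, and the function name is hypothetical.

```python
def decode_ceiling_tokens_per_s(bandwidth_gb_s, model_size_gb=39.9):
    """Rough upper bound on tokens per second for bandwidth-bound decoding.

    Assumes every token reads the full model once; real throughput is lower.
    model_size_gb=39.9 approximates the Q4_K_M weights with the listed VRAM figure.
    """
    return bandwidth_gb_s / model_size_gb


# Bandwidth values taken from the list above.
for name, bandwidth in [("NVIDIA A100 PCIe 40GB", 1555), ("Apple M4 Pro (48GB)", 273)]:
    print(f"{name}: at most ~{decode_ceiling_tokens_per_s(bandwidth):.0f} tokens/s")
```

This is only an ordering aid: two devices with the same VRAM can differ several-fold in generation speed, and measured numbers will fall below the ceiling.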
