NVIDIA/Dense

Nemotron-4 340B

chat, reasoning

Parameters: 340B
Context length: 4K
Benchmarks: 5
Quantizations: 4
HF downloads: 10K
Architecture: Dense
Released: 2024-06-14
Layers: 96
KV Heads: 8
Head Dim: 128
Family: nemotron
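
The layer count, KV head count, and head dimension listed here are enough to estimate the key/value cache size. The sketch below is a rough estimate only: it assumes an FP16 cache (2 bytes per value), a single sequence, and the usual grouped-query layout in which each layer stores one key and one value vector per KV head. The function name `kv_cache_bytes` is illustrative and does not come from any library.

```python
def kv_cache_bytes(layers: int, kv_heads: int, head_dim: int,
                   tokens: int, bytes_per_value: int = 2) -> int:
    """Bytes needed to cache keys and values for `tokens` positions."""
    # The factor 2 covers the separate key and value tensors in each layer.
    return 2 * layers * kv_heads * head_dim * bytes_per_value * tokens


# Figures from the listing: 96 layers, 8 KV heads, head dim 128, 4,096-token context.
per_token = kv_cache_bytes(96, 8, 128, tokens=1)
full_context = kv_cache_bytes(96, 8, 128, tokens=4096)

print(f"{per_token / 1024:.0f} KiB per token")            # 384 KiB per token
print(f"{full_context / 2**30:.2f} GiB at 4,096 tokens")  # 1.50 GiB at 4,096 tokens
```

Under these assumptions the cache is small next to the weights, which is why the VRAM figures for this model are dominated by the quantization level.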

Nemotron-4-340B-Instruct

Model Overview

Nemotron-4-340B-Instruct is a large language model (LLM) that can be used as part of a synthetic data generation pipeline to create training data that helps researchers and developers build their own LLMs. It is a fine-tuned version of the Nemotron-4-340B-Base model, optimized for English-based single and multi-turn chat use-cases. It supports a context length of 4,096 tokens.

Try this model on build.nvidia.com now.

The base model was pre-trained on a corpus of 9 trillion tokens consisting of a diverse assortment of English-based texts, 50+ natural languages, and 40+ coding languages. Subsequently, the Nemotron-4-340B-Instruct model went through additional alignment steps including:

  • Supervised Fine-tuning (SFT)
  • Direct Preference Optimization (DPO)
  • Reward-aware Preference Optimization (RPO)

Throughout the alignment process, we relied on only approximately 20K human-annotated examples, while our data generation pipeline synthesized over 98% of the data used for supervised fine-tuning and preference fine-tuning (DPO & RPO). We provide comprehensive details about our synthetic data generation pipeline in the technical report.
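
For readers unfamiliar with preference fine-tuning, the sketch below shows the standard DPO objective for a single chosen/rejected response pair, as generally described in the DPO literature. It is not NVIDIA's training code, the `beta` value is a hypothetical default, and RPO is not shown. The inputs are assumed to be summed log-probabilities of each response under the policy being trained and under a frozen reference model.

```python
import math


def dpo_loss(policy_chosen_logp: float, policy_rejected_logp: float,
             ref_chosen_logp: float, ref_rejected_logp: float,
             beta: float = 0.1) -> float:
    """Standard DPO loss for one preference pair (illustrative sketch)."""
    # How much more the policy favors each response than the reference does.
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_margin - rejected_margin)
    # -log(sigmoid(x)) == log(1 + exp(-x)), computed in a numerically stable form.
    return math.log1p(math.exp(-abs(logits))) + max(-logits, 0.0)


# No preference learned yet: policy equals reference, so the loss is log(2).
print(dpo_loss(-10.0, -10.0, -10.0, -10.0))
# Policy already favors the chosen response: the loss drops below log(2).
print(dpo_loss(-5.0, -9.0, -7.0, -7.0))
```

The loss falls as the policy raises the chosen response's likelihood relative to the rejected one, measured against the reference model.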

The result is a model that is aligned with human chat preferences, shows improvements in mathematical reasoning, coding, and instruction following, and can generate high-quality synthetic data for a variety of use cases.

Under the NVIDIA Open Model License, NVIDIA confirms:

  • Models are commercially usable.
  • You are free to create and distribute Derivative Models.
  • NVIDIA does not claim ownership to any outputs generated using the Models or Derivative Models.

Quantizations & VRAM

  • Q4_K_M (4.5 bpw): 194.2 GB VRAM required, 94% quality
  • Q6_K (6.5 bpw): 279.2 GB VRAM required, 97% quality
  • Q8_0 (8 bpw): 342.9 GB VRAM required, 100% quality
  • FP16 (16 bpw): 682.9 GB VRAM required, 100% quality
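
The VRAM figures above can be sanity-checked against a simple weight-size estimate: parameters times bits per weight, divided by 8. The sketch below assumes exactly 340e9 parameters and decimal gigabytes (1 GB = 1e9 bytes). The page does not state how its numbers were derived, so the leftover difference is only a hint at what else the listing includes.

```python
PARAMS = 340e9  # assumed exact; the listing only says "340B"

# (quantization, bits per weight, VRAM figure listed above in GB)
LISTED = [
    ("Q4_K_M", 4.5, 194.2),
    ("Q6_K", 6.5, 279.2),
    ("Q8_0", 8.0, 342.9),
    ("FP16", 16.0, 682.9),
]


def weight_gb(params: float, bpw: float) -> float:
    """Size of the weights alone, in decimal gigabytes."""
    return params * bpw / 8 / 1e9


for name, bpw, listed in LISTED:
    weights = weight_gb(PARAMS, bpw)
    print(f"{name}: weights {weights:.2f} GB, listed {listed} GB, "
          f"difference {listed - weights:.2f} GB")
```

With these assumptions the difference comes out at roughly 2.9 GB for every row, which is consistent with a fixed allowance for cache and runtime overhead added on top of the weights.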

Benchmarks (5)

  • Arena Elo: 1482
  • IFEval: 85.1
  • HumanEval: 73.2
  • MMLU-PRO: 62.0
  • MATH: 41.1

GPUs that can run this model

At Q4_K_M quantization. Sorted by minimum VRAM.
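
As a rough guide to multi-GPU setups, the minimum number of identical cards is the Q4_K_M requirement divided by per-card VRAM, rounded up. The capacities below are hypothetical examples rather than entries from this page, and the estimate ignores per-GPU runtime overhead and any imbalance when the model is split, so real deployments may need more headroom.

```python
import math

Q4_K_M_VRAM_GB = 194.2  # Q4_K_M figure from the quantization list above


def min_gpus(required_gb: float, vram_per_gpu_gb: float) -> int:
    """Smallest number of identical GPUs whose combined VRAM covers the requirement."""
    return math.ceil(required_gb / vram_per_gpu_gb)


for vram in (24, 48, 80):  # hypothetical per-GPU capacities in GB
    print(f"{vram} GB cards: at least {min_gpus(Q4_K_M_VRAM_GB, vram)}")
```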
