Nemotron-4 340B
Model Card
Nemotron-4-340B-Instruct
Model Overview
Nemotron-4-340B-Instruct is a large language model (LLM) that can be used as part of a synthetic data generation pipeline to create training data that helps researchers and developers build their own LLMs. It is a fine-tuned version of the Nemotron-4-340B-Base model, optimized for English single-turn and multi-turn chat use cases. It supports a context length of 4,096 tokens.
Try this model on build.nvidia.com now.
The base model was pre-trained on a corpus of 9 trillion tokens consisting of a diverse assortment of English-based texts, 50+ natural languages, and 40+ coding languages. Subsequently, the Nemotron-4-340B-Instruct model went through additional alignment steps, including:
- Supervised Fine-tuning (SFT)
- Direct Preference Optimization (DPO)
- Reward-aware Preference Optimization (RPO), an additional in-house alignment technique
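To make the preference-optimization step more concrete, here is a minimal sketch of the standard DPO objective for a single preference pair, as it is commonly formulated in the literature. It is not NVIDIA's training code: the function names, the example log-probabilities, and `beta=0.1` are illustrative assumptions, and RPO is not shown because its details are given only in the technical report.

```python
import math


def softplus(z: float) -> float:
    """Numerically stable log(1 + exp(z))."""
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def dpo_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 0.1,  # assumed value, not taken from the model card
) -> float:
    """Standard DPO loss for one (chosen, rejected) response pair.

    Each argument is the summed log-probability of a full response under
    either the policy being trained or the frozen reference model.
    """
    # How much more the policy likes each response than the reference does.
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp
    # Scaled preference margin; larger means the policy favors the chosen response.
    margin = beta * (chosen_logratio - rejected_logratio)
    # -log(sigmoid(margin)) == softplus(-margin)
    return softplus(-margin)


if __name__ == "__main__":
    # Policy identical to the reference: the margin is zero.
    print(dpo_loss(-1.5, -1.5, -1.5, -1.5))
    # Policy has shifted probability toward the chosen response.
    print(dpo_loss(-1.0, -2.0, -1.5, -1.5))
```

When the policy matches the reference, the margin is zero and the loss equals log 2. The loss falls as the policy assigns relatively more probability to the chosen response than the reference does.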
Throughout the alignment process, we relied on only approximately 20K human-annotated data points, while our data generation pipeline synthesized over 98% of the data used for supervised fine-tuning and preference fine-tuning (DPO & RPO). We provide comprehensive details about our synthetic data generation pipeline in the technical report.
The result is a model that is aligned with human chat preferences, shows improvements in mathematical reasoning, coding, and instruction-following, and is capable of generating high-quality synthetic data for a variety of use cases.
Under the NVIDIA Open Model License, NVIDIA confirms:
- Models are commercially usable.
- You are free to create and distribute Derivative Models.
- NVIDIA does not claim ownership to any outputs generated using the Models or Derivative Models.