
Ministral 3 14B

Tags: chat, coding, reasoning, vision, thinking, tool use

Parameters: 14B
Context length: 256K
Benchmarks: 13
Quantizations: 4
Architecture: Dense
Released: 2025-12-02
Layers: 40
KV Heads: 8
Head Dim: 128
Family: mistral

Ministral 3 14B Instruct 2512

The largest model in the Ministral 3 family, Ministral 3 14B offers frontier capabilities and performance comparable to its larger Mistral Small 3.2 24B counterpart. It is a powerful and efficient language model with vision capabilities.

This model is the instruct post-trained version in FP8, fine-tuned for instruction following, making it ideal for chat and instruction-based use cases.

The Ministral 3 family is designed for edge deployment and is capable of running on a wide range of hardware. Ministral 3 14B can even be deployed locally, fitting in 24 GB of VRAM in FP8, and less if further quantized.
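
As a rough illustration of such a local deployment, the sketch below runs the instruct checkpoint through vLLM's offline Python API. The Hugging Face repository id and vLLM support for this checkpoint are assumptions here, not confirmed by this page; adjust the identifier and engine arguments to match your environment.

```python
# Minimal local-inference sketch (assumes vLLM supports this checkpoint and
# that the repository id below is correct for the FP8 instruct release).
from vllm import LLM, SamplingParams

llm = LLM(model="mistralai/Ministral-3-14B-Instruct-2512")  # hypothetical repo id
params = SamplingParams(temperature=0.1, max_tokens=256)

messages = [
    {"role": "system", "content": "You are a concise technical assistant."},
    {"role": "user", "content": "Explain what FP8 quantization changes for deployment."},
]

# LLM.chat applies the model's chat template before generating.
outputs = llm.chat(messages, sampling_params=params)
print(outputs[0].outputs[0].text)
```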

Learn more in our blog post and paper.

Key Features

Ministral 3 14B consists of two main architectural components:

  • 13.5B Language Model
  • 0.4B Vision Encoder

The Ministral 3 14B Instruct model offers the following capabilities:

  • Vision: Enables the model to analyze images and provide insights based on visual content, in addition to text.
  • Multilingual: Supports dozens of languages, including English, French, Spanish, German, Italian, Portuguese, Dutch, Chinese, Japanese, Korean, and Arabic.
  • System Prompt: Maintains strong adherence and support for system prompts.
  • Agentic: Offers best-in-class agentic capabilities with native function calling and JSON output (see the sketch after this list).
  • Edge-Optimized: Delivers best-in-class performance at a small scale, deployable anywhere.
  • Apache 2.0 License: Open-source license allowing usage and modification for both commercial and non-commercial purposes.
  • Large Context Window: Supports a 256k context window.
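
To make the agentic bullet concrete, here is a hedged sketch of a native function-calling request sent to an OpenAI-compatible endpoint such as the ones exposed by vLLM or Ollama. The local URL, served model name, and example tool schema are illustrative assumptions, not part of this page.

```python
# Sketch: native function calling via an OpenAI-compatible local server.
# The endpoint URL, model name, and the example tool are assumptions.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="unused")

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical tool, for illustration only
        "description": "Return the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

response = client.chat.completions.create(
    model="ministral-3-14b-instruct",  # assumed served model name
    messages=[{"role": "user", "content": "What's the weather in Paris?"}],
    tools=tools,
    temperature=0.1,
)

# If the model decides to call the tool, the structured call appears here.
print(response.choices[0].message.tool_calls)
```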

Use Cases

Private AI deployments where advanced capabilities meet practical hardware constraints:

  • Private/custom chat and AI assistant deployments in constrained environments
  • Advanced local agentic use cases
  • Fine-tuning and specialization
  • And more...

Bringing advanced AI capabilities to most environments.

Recommended Settings

We recommend deploying with the following best practices:

  • System Prompt: Define a clear environment and use case, including guidance on how to effectively leverage tools in agentic systems.
  • Sampling Parameters: Use a temperature below 0.1 for daily-driver and production environments; higher temperatures may be explored for creative use cases, and developers are encouraged to experiment with alternative settings.
  • Tools: Keep the set of tools well-defined and limit their number to the minimum required for the use case; avoid overloading the model with an excessive number of tools.
  • Vision: When deploying with vision capabilities, we recommend maintaining an aspect ratio close to 1:1 (width-to-height) for images. Avoid overly thin or wide images; crop them as needed to ensure optimal performance (a cropping sketch follows this list).
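
For the vision recommendation, a simple center crop toward a square aspect ratio might look like the following. Pillow is assumed as the image library, and the 1.5 ratio threshold is an arbitrary illustrative choice rather than an official value.

```python
# Sketch: center-crop an image toward a ~1:1 aspect ratio before sending it
# to the model. The 1.5 threshold is an arbitrary illustrative choice.
from PIL import Image

def crop_towards_square(path: str, max_ratio: float = 1.5) -> Image.Image:
    img = Image.open(path)
    w, h = img.size
    long_side, short_side = max(w, h), min(w, h)
    if long_side / short_side <= max_ratio:
        return img  # already close enough to 1:1
    # Trim the long side down to max_ratio * the short side, keeping the center.
    target_long = int(short_side * max_ratio)
    if w >= h:
        left = (w - target_long) // 2
        return img.crop((left, 0, left + target_long, h))
    top = (h - target_long) // 2
    return img.crop((0, top, w, top + target_long))

cropped = crop_towards_square("example.jpg")
cropped.save("example_cropped.jpg")
```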

Ministral 3 Family

| Model Name | Type | Precision | Link |
|---|---|---|---|
| Ministral 3 3B Base 2512 | Base pre-trained | BF16 | Hugging Face |
| Ministral 3 3B Instruct 2512 | Instruct post-trained | FP8 | Hugging Face |
| Ministral 3 3B Reasoning 2512 | Reasoning capable | BF16 | Hugging Face |
| Ministral 3 8B Base 2512 | Base pre-trained | BF16 | Hugging Face |
| Ministral 3 8B Instruct 2512 | Instruct post-trained | FP8 | Hugging Face |
| Ministral 3 8B Reasoning 2512 | Reasoning capable | BF16 | Hugging Face |
| Ministral 3 14B Base 2512 | Base pre-trained | BF16 | Hugging Face |
| Ministral 3 14B Instruct 2512 | Instruct post-trained | FP8 | Hugging Face |
| Ministral 3 14B Reasoning 2512 | Reasoning capable | BF16 | Hugging Face |

Other formats available here.

Benchmark Results

We compare Ministral 3 to similarly sized models.

Reasoning

| Model | AIME25 | AIME24 | GPQA Diamond | LiveCodeBench |
|---|---|---|---|---|
| Ministral 3 14B | <u>0.850</u> | <u>0.898</u> | <u>0.712</u> | <u>0.646</u> |
| Qwen3-14B (Thinking) | 0.737 | 0.837 | 0.663 | 0.593 |
| Ministral 3 8B | 0.787 | <u>0.860</u> | 0.668 | <u>0.616</u> |
| Qwen3-VL-8B-Thinking | <u>0.798</u> | <u>0.860</u> | <u>0.671</u> | 0.580 |
| Ministral 3 3B | <u>0.721</u> | <u>0.775</u> | 0.534 | <u>0.548</u> |
| Qwen3-VL-4B-Thinking | 0.697 | 0.729 | <u>0.601</u> | 0.513 |

Instruct

| Model | Arena Hard | WildBench | MATH Maj@1 | MM MTBench |
|---|---|---|---|---|
| Ministral 3 14B | <u>0.551</u> | <u>68.5</u> | <u>0.904</u> | <u>8.49</u> |
| Qwen3 14B (Non-Thinking) | 0.427 | 65.1 | 0.870 | NOT MULTIMODAL |
| Gemma3-12B-Instruct | 0.436 | 63.2 | 0.854 | 6.70 |
| Ministral 3 8B | 0.509 | <u>66.8</u> | 0.876 | <u>8.08</u> |
| Qwen3-VL-8B-Instruct | <u>0.528</u> | 66.3 | <u>0.946</u> | 8.00 |
| Ministral 3 3B | 0.305 | <u>56.8</u> | 0.830 | 7.83 |
| Qwen3-VL-4B-Instruct | <u>0.438</u> | <u>56.8</u> | <u>0.900</u> | <u>8.01</u> |
| Qwen3-VL-2B-Instruct | 0.163 | 42.2 | 0.786 | 6.36 |
| Gemma3-4B-Instruct | 0.318 | 49.1 | 0.759 | 5.23 |

Base

...

Quantizations & VRAM

| Quantization | Bits per weight | VRAM required | Quality |
|---|---|---|---|
| Q4_K_M | 4.5 bpw | 8.4 GB | 94% |
| Q6_K | 6.5 bpw | 11.9 GB | 97% |
| Q8_0 | 8 bpw | 14.5 GB | 100% |
| FP16 | 16 bpw | 28.5 GB | 100% |
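
The VRAM figures above roughly follow from the parameter count, the bits per weight, and the KV-cache geometry listed at the top of the page (40 layers, 8 KV heads, head dim 128). The sketch below reproduces that back-of-envelope arithmetic; the small fixed overhead term is an assumption, and real memory use depends on the runtime and context length.

```python
# Back-of-envelope VRAM estimate from parameters x bits-per-weight, plus a
# per-token KV-cache cost from the listed geometry. The overhead constant is
# an assumption, not a measurement.
PARAMS = 13.9e9          # ~13.5B language model + ~0.4B vision encoder

def weight_gb(bits_per_weight: float, overhead_gb: float = 0.6) -> float:
    return PARAMS * bits_per_weight / 8 / 1e9 + overhead_gb

def kv_cache_gb(tokens: int, layers: int = 40, kv_heads: int = 8,
                head_dim: int = 128, bytes_per_elem: int = 2) -> float:
    # Keys and values each store layers * kv_heads * head_dim elements per token.
    return 2 * layers * kv_heads * head_dim * bytes_per_elem * tokens / 1e9

for name, bpw in [("Q4_K_M", 4.5), ("Q6_K", 6.5), ("Q8_0", 8), ("FP16", 16)]:
    print(f"{name}: ~{weight_gb(bpw):.1f} GB weights")

print(f"KV cache at 32K tokens (FP16): ~{kv_cache_gb(32_768):.1f} GB")
```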

Benchmarks (13)

| Benchmark | Score |
|---|---|
| IFEval | 70.3 |
| GPQA Diamond | 57.2 |
| BBH | 36.1 |
| LiveCodeBench | 35.1 |
| AIME | 30.0 |
| AA Math | 30.0 |
| MMLU-PRO | 28.7 |
| MUSR | 18.4 |
| AA Intelligence | 16.0 |
| AA Coding | 10.9 |
| MATH | 8.5 |
| GPQA | 7.3 |
| HLE | 4.6 |

Run with Ollama

$ ollama run ministral-3
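
Once the model is pulled, it can also be called programmatically. The sketch below uses the ollama Python client, assuming the tag shown above resolves to this checkpoint on your machine.

```python
# Sketch: chat with the locally pulled model through the ollama Python client.
# Assumes `ollama run ministral-3` has already fetched this tag.
import ollama

response = ollama.chat(
    model="ministral-3",
    messages=[{"role": "user", "content": "Give me a one-line summary of FP8."}],
    options={"temperature": 0.1},  # matches the recommended low temperature
)
print(response["message"]["content"])
```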
