Ministral 3 14B
Model Card
View on HuggingFaceMinistral 3 14B Instruct 2512
The largest model in the Ministral 3 family, Ministral 3 14B offers frontier capabilities and performance comparable to its larger Mistral Small 3.2 24B counterpart. A powerful and efficient language model with vision capabilities.
This model is the instruct post-trained version in FP8, fine-tuned for instruction tasks, making it ideal for chat and instruction based use cases.
The Ministral 3 family is designed for edge deployment, capable of running on a wide range of hardware. Ministral 3 14B can even be deployed locally, capable of fitting in 24GB of VRAM in FP8, and less if further quantized.
Learn more in our blog post and paper.
Key Features
Ministral 3 14B consists of two main architectural components:
- 13.5B Language Model
- 0.4B Vision Encoder
The Ministral 3 14B Instruct model offers the following capabilities:
- Vision: Enables the model to analyze images and provide insights based on visual content, in addition to text.
- Multilingual: Supports dozens of languages, including English, French, Spanish, German, Italian, Portuguese, Dutch, Chinese, Japanese, Korean, Arabic.
- System Prompt: Maintains strong adherence and support for system prompts.
- Agentic: Offers best-in-class agentic capabilities with native function calling and JSON outputting.
- Edge-Optimized: Delivers best-in-class performance at a small scale, deployable anywhere.
- Apache 2.0 License: Open-source license allowing usage and modification for both commercial and non-commercial purposes.
- Large Context Window: Supports a 256k context window.
Use Cases
Private AI deployments where advanced capabilities meet practical hardware constraints:
- Private/custom chat and AI assistant deployments in constrained environments
- Advanced local agentic use cases
- Fine-tuning and specialization
- And more...
Bringing advanced AI capabilities to most environments.
Recommended Settings
We recommend deploying with the following best practices:
- System Prompt: Define a clear environment and use case, including guidance on how to effectively leverage tools in agentic systems.
- Sampling Parameters: Use a temperature below 0.1 for daily-driver and production environments ; Higher temperatures may be explored for creative use cases - developers are encouraged to experiment with alternative settings.
- Tools: Keep the set of tools well-defined and limit their number to the minimum required for the use case - Avoiding overloading the model with an excessive number of tools.
- Vision: When deploying with vision capabilities, we recommend maintaining an aspect ratio close to 1:1 (width-to-height) for images. Avoiding the use of overly thin or wide images - crop them as needed to ensure optimal performance.
Ministral 3 Family
| Model Name | Type | Precision | Link |
|---|---|---|---|
| Ministral 3 3B Base 2512 | Base pre-trained | BF16 | Hugging Face |
| Ministral 3 3B Instruct 2512 | Instruct post-trained | FP8 | Hugging Face |
| Ministral 3 3B Reasoning 2512 | Reasoning capable | BF16 | Hugging Face |
| Ministral 3 8B Base 2512 | Base pre-trained | BF16 | Hugging Face |
| Ministral 3 8B Instruct 2512 | Instruct post-trained | FP8 | Hugging Face |
| Ministral 3 8B Reasoning 2512 | Reasoning capable | BF16 | Hugging Face |
| Ministral 3 14B Base 2512 | Base pre-trained | BF16 | Hugging Face |
| Ministral 3 14B Instruct 2512 | Instruct post-trained | FP8 | Hugging Face |
| Ministral 3 14B Reasoning 2512 | Reasoning capable | BF16 | Hugging Face |
Other formats available here.
Benchmark Results
We compare Ministral 3 to similar sized models.
Reasoning
| Model | AIME25 | AIME24 | GPQA Diamond | LiveCodeBench |
|---|---|---|---|---|
| Ministral 3 14B | <u>0.850</u> | <u>0.898</u> | <u>0.712</u> | <u>0.646</u> |
| Qwen3-14B (Thinking) | 0.737 | 0.837 | 0.663 | 0.593 |
| Ministral 3 8B | 0.787 | <u>0.860</u> | 0.668 | <u>0.616</u> |
| Qwen3-VL-8B-Thinking | <u>0.798</u> | <u>0.860</u> | <u>0.671</u> | 0.580 |
| Ministral 3 3B | <u>0.721</u> | <u>0.775</u> | 0.534 | <u>0.548</u> |
| Qwen3-VL-4B-Thinking | 0.697 | 0.729 | <u>0.601</u> | 0.513 |
Instruct
| Model | Arena Hard | WildBench | MATH Maj@1 | MM MTBench |
|---|---|---|---|---|
| Ministral 3 14B | <u>0.551</u> | <u>68.5</u> | <u>0.904</u> | <u>8.49</u> |
| Qwen3 14B (Non-Thinking) | 0.427 | 65.1 | 0.870 | NOT MULTIMODAL |
| Gemma3-12B-Instruct | 0.436 | 63.2 | 0.854 | 6.70 |
| Ministral 3 8B | 0.509 | <u>66.8</u> | 0.876 | <u>8.08</u> |
| Qwen3-VL-8B-Instruct | <u>0.528</u> | 66.3 | <u>0.946</u> | 8.00 |
| Ministral 3 3B | 0.305 | <u>56.8</u> | 0.830 | 7.83 |
| Qwen3-VL-4B-Instruct | <u>0.438</u> | <u>56.8</u> | <u>0.900</u> | <u>8.01</u> |
| Qwen3-VL-2B-Instruct | 0.163 | 42.2 | 0.786 | 6.36 |
| Gemma3-4B-Instruct | 0.318 | 49.1 | 0.759 | 5.23 |
Base
...
Quantizations & VRAM
Benchmarks (13)
Run with Ollama
ollama run ministral-3GPUs that can run this model
At Q4_K_M quantization. Sorted by minimum VRAM.
Find the best GPU for Ministral 3 14B
Build Hardware for Ministral 3 14B