WizardCoder Python 34B

  • Developer: Microsoft
  • Category: coding
  • Parameters: 34B
  • Context length: 16K
  • Benchmarks: 2
  • Quantizations: 4
  • HF downloads: 60K
  • Architecture: Dense
  • Released: 2023-08-26
  • Layers: 48
  • KV heads: 8
  • Head dim: 128
  • Family: wizardcoder

WizardCoder: Empowering Code Large Language Models with Evol-Instruct

<p style="font-size:28px;" align="center"> 🏠 <a href="https://wizardlm.github.io/" target="_blank">Home Page</a> </p> <p align="center"> <p align="center"> 🤗 <a href="https://huggingface.co/WizardLM" target="_blank">HF Repo</a> •🐱 <a href="https://github.com/nlpxucan/WizardLM" target="_blank">Github Repo</a> • 🐦 <a href="https://twitter.com/WizardLM_AI" target="_blank">Twitter</a> </p> <p align="center"> 📃 <a href="https://arxiv.org/abs/2304.12244" target="_blank">[WizardLM]</a> • 📃 <a href="https://arxiv.org/abs/2306.08568" target="_blank">[WizardCoder]</a> • 📃 <a href="https://arxiv.org/abs/2308.09583" target="_blank">[WizardMath]</a> <br> </p> <p align="center"> 👋 Join our <a href="https://discord.gg/VZjjHtWrKs" target="_blank">Discord</a> </p>

News

[2024/01/04] 🔥 We released WizardCoder-33B-V1.1, trained from deepseek-coder-33b-base, the SOTA open-source Code LLM on the EvalPlus Leaderboard. It achieves 79.9 pass@1 on HumanEval, 73.2 pass@1 on HumanEval-Plus, 78.9 pass@1 on MBPP, and 66.9 pass@1 on MBPP-Plus.

[2024/01/04] 🔥 WizardCoder-33B-V1.1 outperforms ChatGPT 3.5, Gemini Pro, and DeepSeek-Coder-33B-instruct on HumanEval and HumanEval-Plus pass@1.

[2024/01/04] 🔥 WizardCoder-33B-V1.1 is comparable with ChatGPT 3.5, and surpasses Gemini Pro on MBPP and MBPP-Plus pass@1.

| Model | Checkpoint | Paper | HumanEval | HumanEval+ | MBPP | MBPP+ | License |
|---|---|---|---|---|---|---|---|
| GPT-4-Turbo (Nov 2023) | - | - | 85.4 | 81.7 | 83.0 | 70.7 | - |
| GPT-4 (May 2023) | - | - | 88.4 | 76.8 | - | - | - |
| GPT-3.5-Turbo (Nov 2023) | - | - | 72.6 | 65.9 | 81.7 | 69.4 | - |
| Gemini Pro | - | - | 63.4 | 55.5 | 72.9 | 57.9 | - |
| DeepSeek-Coder-33B-instruct | - | - | 78.7 | 72.6 | 78.7 | 66.7 | - |
| WizardCoder-33B-V1.1 | 🤗 <a href="https://huggingface.co/WizardLM/WizardCoder-33B-V1.1" target="_blank">HF Link</a> | 📃 <a href="https://arxiv.org/abs/2306.08568" target="_blank">[WizardCoder]</a> | 79.9 | 73.2 | 78.9 | 66.9 | <a href="https://huggingface.co/WizardLM/WizardMath-7B-V1.1/resolve/main/LICENSE" target="_blank">MSFTResearch</a> |
| WizardCoder-Python-34B-V1.0 | 🤗 <a href="https://huggingface.co/WizardLM/WizardCoder-Python-34B-V1.0" target="_blank">HF Link</a> | 📃 <a href="https://arxiv.org/abs/2306.08568" target="_blank">[WizardCoder]</a> | 73.2 | 64.6 | 73.2 | 59.9 | <a href="https://ai.meta.com/resources/models-and-libraries/llama-downloads/" target="_blank">Llama2</a> |
| WizardCoder-15B-V1.0 | 🤗 <a href="https://huggingface.co/WizardLM/WizardCoder-15B-V1.0" target="_blank">HF Link</a> | 📃 <a href="https://arxiv.org/abs/2306.08568" target="_blank">[WizardCoder]</a> | 59.8 | 52.4 | - | - | <a href="https://huggingface.co/spaces/bigcode/bigcode-model-license-agreement" target="_blank">OpenRAIL-M</a> |
| WizardCoder-Python-13B-V1.0 | 🤗 <a href="https://huggingface.co/WizardLM/WizardCoder-Python-13B-V1.0" target="_blank">HF Link</a> | 📃 <a href="https://arxiv.org/abs/2306.08568" target="_blank">[WizardCoder]</a> | 64.0 | - | - | - | <a href="https://ai.meta.com/resources/models-and-libraries/llama-downloads/" target="_blank">Llama2</a> |
| WizardCoder-Python-7B-V1.0 | 🤗 <a href="https://huggingface.co/WizardLM/WizardCoder-Python-7B-V1.0" target="_blank">HF Link</a> | 📃 <a href="https://arxiv.org/abs/2306.08568" target="_blank">[WizardCoder]</a> | 55.5 | - | - | - | <a href="https://ai.meta.com/resources/models-and-libraries/llama-downloads/" target="_blank">Llama2</a> |
| WizardCoder-3B-V1.0 | 🤗 <a href="https://huggingface.co/WizardLM/WizardCoder-3B-V1.0" target="_blank">HF Link</a> | 📃 <a href="https://arxiv.org/abs/2306.08568" target="_blank">[WizardCoder]</a> | 34.8 | - | - | - | <a href="https://huggingface.co/spaces/bigcode/bigcode-model-license-agreement" target="_blank">OpenRAIL-M</a> |
| WizardCoder-1B-V1.0 | 🤗 <a href="https://huggingface.co/WizardLM/WizardCoder-1B-V1.0" target="_blank">HF Link</a> | 📃 <a href="https://arxiv.org/abs/2306.08568" target="_blank">[WizardCoder]</a> | 23.8 | - | - | - | <a href="https://huggingface.co/spaces/bigcode/bigcode-model-license-agreement" target="_blank">OpenRAIL-M</a> |
  • Our WizardMath-70B-V1.0 model slightly outperforms some closed-source LLMs on GSM8K, including ChatGPT 3.5, Claude Instant 1, and PaLM 2 540B.
  • Our WizardMath-70B-V1.0 model achieves 81.6 pass@1 on the GSM8k benchmark, 24.8 points higher than the SOTA open-source LLM, and 22.7 pass@1 on the MATH benchmark, 9.2 points higher than the SOTA open-source LLM.
| Model | Checkpoint | Paper | GSM8k | MATH | Online Demo | License |
|---|---|---|---|---|---|---|
| WizardMath-70B-V1.0 | 🤗 <a href="https://huggingface.co/WizardLM/WizardMath-70B-V1.0" target="_blank">HF Link</a> | 📃 <a href="https://arxiv.org/abs/2308.09583" target="_blank">[WizardMath]</a> | 81.6 | 22.7 | Demo | <a href="https://ai.meta.com/resources/models-and-libraries/llama-downloads/" target="_blank">Llama 2</a> |
| WizardMath-13B-V1.0 | 🤗 <a href="https://huggingface.co/WizardLM/WizardMath-13B-V1.0" target="_blank">HF Link</a> | 📃 <a href="https://arxiv.org/abs/2308.09583" target="_blank">[WizardMath]</a> | 63.9 | 14.0 | Demo | <a href="https://ai.meta.com/resources/models-and-libraries/llama-downloads/" target="_blank">Llama 2</a> |
| WizardMath-7B-V1.0 | 🤗 <a href="https://huggingface.co/WizardLM/WizardMath-7B-V1.0" target="_blank">HF Link</a> | 📃 <a href="https://arxiv.org/abs/2308.09583" target="_blank">[WizardMath]</a> | 54.9 | 10.7 | Demo | <a href="https://ai.meta.com/resources/models-and-libraries/llama-downloads/" target="_blank">Llama 2</a> |

| Model | Checkpoint | Paper | MT-Bench | AlpacaEval | GSM8k | HumanEval | License |
|---|---|---|---|---|---|---|---|
| WizardLM-70B-V1.0 | 🤗 <a href="https://huggingface.co/WizardLM/WizardLM-70B-V1.0" target="_blank">HF Link</a> | 📃 Coming Soon | 7.78 | 92.91% | 77.6% | 50.6 | <a href="https://ai.meta.com/resources/models-and-libraries/llama-downloads/" target="_blank">Llama 2 License</a> |
| WizardLM-13B-V1.2 | 🤗 <a href="https://huggingface.co/WizardLM/WizardLM-13B-V1.2" target="_blank">HF Link</a> | - | 7.06 | 89.17% | 55.3% | 36.6 | <a href="https://ai.meta.com/resources/models-and-libraries/llama-downloads/" target="_blank">Llama 2 License</a> |
| WizardLM-13B-V1.1 | 🤗 <a href="https://huggingface.co/WizardLM/WizardLM-13B-V1.1" target="_blank">HF Link</a> | - | 6.76 | 86.32% | - | 25.0 | Non-commercial |
| WizardLM-30B-V1.0 | 🤗 <a href="https://huggingface.co/WizardLM/WizardLM-30B-V1.0" target="_blank">HF Link</a> | - | 7.01 | - | - | 37.8 | Non-commercial |
| WizardLM-13B-V1.0 | 🤗 <a href="https://huggingface.co/WizardLM/WizardLM-13B-V1.0" target="_blank">HF Link</a> | - | 6.35 | 75.31% | - | 24.0 | Non-commercial |
| WizardLM-7B-V1.0 | 🤗 <a href="https://huggingface.co/WizardLM/WizardLM-7B-V1.0" target="_blank">HF Link</a> | 📃 <a href="https://arxiv.org/abs/2304.12244" target="_blank">[WizardLM]</a> | - | - | - | 19.1 | Non-commercial |

Comparing WizardCoder-Python-34B-V1.0 with Other LLMs.

🔥 The following figure shows that our WizardCoder-Python-34B-V1.0 attains the second position in this benchmark, surpassing GPT-4 (2023/03/15, 73.2 vs. 67.0), ChatGPT-3.5 (73.2 vs. 72.5), and Claude2 (73.2 vs. 71.2).

<p align="center" width="100%"> <a ><img src="https://raw.githubusercontent.com/nlpxucan/WizardLM/main/WizardCoder/imgs/compare_sota.png" alt="WizardCoder" style="width: 96%; min-width: 300px; display: block; margin: auto;"></a> </p>

Prompt Format

"Below is an instruction that describes a task. Write a response that appropriately completes the request.\n\n### Instruction:\n{instruction}\n\n### Response:"
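The template above is the Alpaca-style format the model was fine-tuned on, so deviating from it tends to hurt output quality. A minimal sketch of wrapping a user instruction in this template (the template string is taken verbatim from above; `build_prompt` is an illustrative helper name, not an official API):

```python
# Alpaca-style prompt template quoted in the model card above.
PROMPT_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Response:"
)

def build_prompt(instruction: str) -> str:
    """Wrap a raw instruction in the WizardCoder prompt format."""
    return PROMPT_TEMPLATE.format(instruction=instruction)

prompt = build_prompt("Write a Python function that reverses a string.")
print(prompt)
```

The resulting string is what you would pass to the tokenizer or to a serving endpoint; the model's completion follows the trailing `### Response:` marker.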

Inference Demo Script

We provide the inference demo code here.

Citation

Please cite this repo if you use its data, methods, or code.

@article{luo2023wizardcoder,
  title={WizardCoder: Empowering Code Large Language Models with Evol-Instruct},
  author={Luo, Ziyang and Xu, Can and Zhao, Pu and Sun, Qingfeng and Geng, Xiubo and Hu, Wenxiang and Tao, Chongyang and Ma, Jing and Lin, Qingwei and Jiang, Daxin},
  journal={arXiv preprint arXiv:2306.08568},
  year={2023}
}

Quantizations & VRAM

| Quantization | Bits per weight | VRAM required | Quality |
|---|---|---|---|
| Q4_K_M | 4.5 bpw | 19.9 GB | 94% |
| Q6_K | 6.5 bpw | 28.4 GB | 97% |
| Q8_0 | 8 bpw | 34.8 GB | 100% |
| FP16 | 16 bpw | 68.8 GB | 100% |

Benchmarks (2)

| Benchmark | pass@1 |
|---|---|
| HumanEval | 73.2 |
| MBPP | 73.1 |
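These scores are pass@1, i.e. the fraction of problems solved with a single sample. For k > 1 samples, the commonly used unbiased estimator from the HumanEval paper (Chen et al., 2021) can be sketched as follows (the helper name is ours):

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate: probability that at least one of k
    samples drawn (without replacement) from n generations, c of which
    are correct, passes.  pass@k = 1 - C(n-c, k) / C(n, k)."""
    if n - c < k:
        return 1.0  # fewer failures than draws: some draw must pass
    return 1.0 - comb(n - c, k) / comb(n, k)

# 3 correct out of 10 generations, drawing 1:
print(pass_at_k(10, 3, 1))  # -> 0.3
```

With a single generation per problem (n = k = 1), this reduces to the plain fraction of problems solved, which is what the table reports.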

GPUs that can run this model

At Q4_K_M quantization. Sorted by minimum VRAM.

| GPU | VRAM | Bandwidth | Vendor | Price |
|---|---|---|---|---|
| AMD RX 7900 XT | 20 GB | 800 GB/s | AMD | $849 |
| NVIDIA RTX 4000 Ada 20GB | 20 GB | 432 GB/s | NVIDIA | $1250 |
| NVIDIA A10M | 20 GB | 500 GB/s | NVIDIA | - |
| NVIDIA GeForce RTX 3080 Ti 20 GB | 20 GB | 760 GB/s | NVIDIA | $1199 |
| AMD Radeon RX 7900 XT | 20 GB | 800 GB/s | AMD | $899 |
| NVIDIA RTX 4000 Ada Generation | 20 GB | 360 GB/s | NVIDIA | - |
| NVIDIA RTX 4000 SFF Ada Generation | 20 GB | 280 GB/s | NVIDIA | - |
| NVIDIA RTX A4500 | 20 GB | 640 GB/s | NVIDIA | - |
| NVIDIA RTX 4090 | 24 GB | 1008 GB/s | NVIDIA | $1599 |
| NVIDIA RTX 3090 Ti | 24 GB | 1008 GB/s | NVIDIA | $999 |
| NVIDIA RTX 3090 | 24 GB | 936 GB/s | NVIDIA | $850 |
| AMD RX 7900 XTX | 24 GB | 960 GB/s | AMD | $999 |
| Apple M4 Pro (24GB) | 24 GB | 273 GB/s | APPLE | $1399 |
| NVIDIA L4 24GB | 24 GB | 300 GB/s | NVIDIA | $2500 |
| NVIDIA A10 24GB | 24 GB | 600 GB/s | NVIDIA | $3500 |
| Apple M2 (24GB) | 24 GB | 100 GB/s | APPLE | $999 |
| Apple M3 (24GB) | 24 GB | 100 GB/s | APPLE | $999 |
| Apple M4 (24GB) | 24 GB | 120 GB/s | APPLE | $699 |
| NVIDIA Tesla M40 24 GB | 24 GB | 288 GB/s | NVIDIA | - |
| NVIDIA Tesla P10 | 24 GB | 694 GB/s | NVIDIA | - |
| NVIDIA Tesla P40 | 24 GB | 347 GB/s | NVIDIA | - |
| NVIDIA Quadro RTX 6000 | 24 GB | 672 GB/s | NVIDIA | - |
| NVIDIA Quadro RTX 6000 Passive | 24 GB | 624 GB/s | NVIDIA | - |
| NVIDIA GeForce RTX 3090 | 24 GB | 936 GB/s | NVIDIA | $1499 |
| NVIDIA A10 PCIe | 24 GB | 600 GB/s | NVIDIA | - |
| NVIDIA A10G | 24 GB | 600 GB/s | NVIDIA | - |
| NVIDIA RTX A5000 | 24 GB | 768 GB/s | NVIDIA | - |
| NVIDIA GeForce RTX 3090 Ti | 24 GB | 1010 GB/s | NVIDIA | $1999 |
| NVIDIA GeForce RTX 4090 | 24 GB | 1010 GB/s | NVIDIA | $1599 |
| NVIDIA L40 CNX | 24 GB | 864 GB/s | NVIDIA | - |