
Qwen1.5-32B addresses the open-source community's need for a model that excels in performance while maintaining efficiency and affordability, striking a balance between resource-intensive giants like Qwen1.5-72B and lighter alternatives.
The open-source community has been on a quest to find a model that balances performance, efficiency, and memory footprint. While models like Qwen1.5-72B and DBRX have pushed the boundaries of what's possible, they often come with significant drawbacks such as high memory consumption, slow inference speed, and expensive finetuning costs.
In response to this challenge, the Qwen team is excited to introduce the latest additions to the Qwen1.5 language model series: Qwen1.5-32B and Qwen1.5-32B-Chat. These models aim to hit the "sweet spot" of around 30 billion parameters, offering strong performance while keeping resource requirements manageable.
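To put "resource requirements" in rough numbers, here is a back-of-the-envelope sketch of the memory needed just to hold the weights. It is not a figure from the Qwen team. It assumes nominal parameter counts read off the model names (32 billion and 72 billion) and standard bytes-per-parameter for each precision, and it ignores KV cache, activations, and framework overhead, so real usage will be higher.

```python
# Rough weight-memory estimate: parameter count x bytes per parameter.
# Parameter counts are nominal (read off the model names), and the figures
# ignore KV cache, activations, and framework overhead.

BYTES_PER_PARAM = {"fp16/bf16": 2.0, "int8": 1.0, "int4": 0.5}

# Assumed nominal sizes, not exact parameter counts.
MODELS = {"Qwen1.5-32B": 32e9, "Qwen1.5-72B": 72e9}


def weight_memory_gb(num_params: float, bytes_per_param: float) -> float:
    """Return approximate weight memory in GB (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


for name, params in MODELS.items():
    for precision, nbytes in BYTES_PER_PARAM.items():
        size = weight_memory_gb(params, nbytes)
        print(f"{name:12s} {precision:9s} ~{size:.0f} GB")
```

Under these assumptions, the 16-bit weights of a 32B model need roughly 64 GB against roughly 144 GB for a 72B model, which is the kind of gap that decides how many accelerators a deployment requires.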
Qwen1.5-32B is built on the same architecture as its predecessors but with a few key enhancements:
Qwen1.5-32B has been evaluated against other leading models of comparable size, with the larger Qwen1.5-72B included for reference. Here’s how it stacks up:

| Model | MMLU | C-Eval | GSM8K | MATH | HumanEval | MBPP | BBH | CMMLU |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Llama2-34B | 62.6 | - | 42.2 | 6.2 | 22.6 | 33.0 | 44.1 | - |
| Yi-34B | 76.3 | 81.4 | 67.2 | 14.4 | 23.2 | 41.0 | 54.3 | 83.7 |
| Mixtral-8x7B | 70.6 | - | 74.4 | 28.4 | 40.2 | 60.7 | - | - |
| Qwen1.5-72B | 77.5 | 84.1 | 79.5 | 34.1 | 41.5 | 53.4 | 65.5 | 83.5 |
| Qwen1.5-32B | 73.4 | 83.5 | 77.4 | 36.1 | 37.2 | 49.4 | 66.8 | 82.3 |

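
One way to read the table is to compute how far Qwen1.5-32B sits from Qwen1.5-72B on each benchmark. The short sketch below does that arithmetic with the scores copied from the table above. The helper name `score_gaps` is illustrative only and not part of any Qwen tooling.

```python
# Scores copied from the benchmark table above.
BENCHMARKS = ["MMLU", "C-Eval", "GSM8K", "MATH", "HumanEval", "MBPP", "BBH", "CMMLU"]
QWEN_72B = [77.5, 84.1, 79.5, 34.1, 41.5, 53.4, 65.5, 83.5]
QWEN_32B = [73.4, 83.5, 77.4, 36.1, 37.2, 49.4, 66.8, 82.3]


def score_gaps(small, large):
    """Return per-benchmark differences (small minus large), rounded to 1 decimal."""
    return [round(s - l, 1) for s, l in zip(small, large)]


gaps = dict(zip(BENCHMARKS, score_gaps(QWEN_32B, QWEN_72B)))
for name, gap in gaps.items():
    print(f"{name:10s} {gap:+.1f}")

# Benchmarks where the 32B model scores higher than the 72B model.
wins = [name for name, gap in gaps.items() if gap > 0]
print("32B ahead on:", ", ".join(wins))
```

On these numbers the 32B model trails the 72B model by at most 4.3 points (HumanEval) and is ahead on MATH and BBH.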
To enhance the conversational capabilities of Qwen1.5-32B, we have focused on post-training techniques:
Qwen1.5-32B is fully open-source, allowing researchers and developers to experiment with and contribute to its development. The model is available on multiple platforms:
Original Sources
↗ https://qwenlm.github.io/blog/qwen1.5-32b/?utm_source=tldrai
About the author
Kai built ML infrastructure at a Bay Area startup before developing an obsession with transformer architectures and inference optimisation that eventually pulled him out of product work entirely. A stint at a compute research lab sharpened his instinct for what actually matters in a model release versus what is marketing. He writes from the inside — from the perspective of someone who has debugged the systems he is describing at three in the morning. He is allergic to hype and instinctively drawn to the unglamorous plumbing questions that everyone else skips over.
8 April 2024