
Stability AI challenges the notion that bigger is better in language models with its efficient 1.6B parameter Stable LM 2, which outperforms larger counterparts and supports seven languages, democratizing access to powerful generative tools.
Stability AI, the company known for its groundbreaking Stable Diffusion text-to-image model, has released a new, more compact language model called Stable LM 2 1.6B. This smaller model aims to lower barriers for developers and enhance the generative AI ecosystem by supporting multilingual data in seven languages: English, Spanish, German, Italian, French, Portuguese, and Dutch.
The key technical advancement is a reduction in size that maintains, or even improves, performance. At just 1.6 billion parameters, Stable LM 2 1.6B is one of the smallest models from Stability AI. Despite its compact size, it outperforms other language models with under 2 billion parameters on various benchmarks.
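To put the 1.6 billion parameter figure in perspective, here is a rough back-of-the-envelope sketch of how much memory the weights alone would occupy at common numeric precisions. The parameter count comes from the release; the list of formats and the helper name `weight_memory_gb` are illustrative assumptions, and the estimate ignores activations, the KV cache, and framework overhead, so real usage will be higher.

```python
# Rough weight-memory estimate for a 1.6B-parameter model.
# Assumption: only the weights are counted; activations, KV cache,
# and runtime overhead are deliberately left out.

PARAMS = 1_600_000_000  # parameter count stated for Stable LM 2 1.6B

# Storage size per parameter for common numeric formats.
BYTES_PER_PARAM = {
    "float32": 4,
    "float16": 2,
    "int8": 1,
    "int4": 0.5,
}


def weight_memory_gb(num_params: int, bytes_per_param: float) -> float:
    """Return weight storage in gigabytes (1 GB = 10**9 bytes)."""
    return num_params * bytes_per_param / 1e9


if __name__ == "__main__":
    for fmt, size in BYTES_PER_PARAM.items():
        print(f"{fmt:>8}: {weight_memory_gb(PARAMS, size):.1f} GB")
```

At 16-bit precision the weights come to roughly 3.2 GB, which is why a model of this size is plausible on consumer hardware where multi-billion-parameter models are not.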
Carlos Riquelme, Head of the Language Team at Stability AI, explained that while larger models generally perform better when trained on similar data with similar recipes, recent advancements in algorithms and higher-quality training data can sometimes lead to smaller models outperforming their larger predecessors.

The new model builds on Stability AI's earlier Stable LM releases and incorporates several optimizations. The smaller size brings significant advantages, but it also comes with some trade-offs.
The release of Stable LM 2 1.6B marks another significant step in the evolution of language models. By combining compact size with strong performance, Stability AI aims to make advanced text generation more accessible to a broader audience of developers and researchers. The model's open-source nature further encourages collaboration and innovation in the AI community.
About the author
Kai built ML infrastructure at a Bay Area startup before developing an obsession with transformer architectures and inference optimisation that eventually pulled him out of product work entirely. A stint at a compute research lab sharpened his instinct for what actually matters in a model release versus what is marketing. He writes from the inside — from the perspective of someone who has debugged the systems he is describing at three in the morning. He is allergic to hype and instinctively drawn to the unglamorous plumbing questions that everyone else skips over.
23 January 2024