
Stability AI and Arm have unveiled Stable Audio Open Small, a lightweight text-to-audio model for smartphones. With just 341 million parameters, it generates short audio clips quickly on-device and is aimed at developers who need efficient mobile solutions.
May 14, 2025
Stability AI and Arm have announced the release of Stable Audio Open Small, a compact text-to-audio model designed to run entirely on Arm CPUs. This new model is optimized for generating short audio samples quickly and efficiently on mobile devices. Here’s what changed technically and why it matters to developers and practitioners.
Model Size and Performance: Stable Audio Open Small has 341 million parameters, making it significantly smaller than its predecessor, Stable Audio Open, while maintaining high output quality and prompt adherence. It can generate up to 11 seconds of audio on a smartphone in less than 8 seconds.
Optimization for Arm CPUs: The model uses Arm's KleidiAI libraries, which are designed to accelerate AI workloads on Arm processors. This helps the model run efficiently on a wide range of mobile devices, from high-end smartphones to budget models.
Real-World Deployment: Because it runs entirely on-device, Stable Audio Open Small can generate audio without relying on cloud services. This is particularly useful for applications requiring low latency and data privacy, such as voice assistants, gaming, and interactive storytelling.
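To put the headline figures in perspective, here is a back-of-envelope sketch in Python. It uses only the numbers quoted above (341 million parameters, up to 11 seconds of audio, under 8 seconds of generation time). The storage precisions are illustrative assumptions, since the announcement as summarized here does not say which precision the mobile build uses, and the footprint covers raw weights only, not activations or runtime overhead.

```python
# Back-of-envelope figures derived from the numbers quoted in the article.
PARAMS = 341_000_000        # parameter count stated in the article
AUDIO_SECONDS = 11.0        # maximum clip length stated in the article
WALL_CLOCK_SECONDS = 8.0    # upper bound on generation time stated in the article

# Bytes per parameter for common storage precisions (illustrative assumption;
# the article does not say which precision the mobile build uses).
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "int8": 1}


def real_time_factor(wall_clock_s: float, audio_s: float) -> float:
    """Generation time divided by audio duration; below 1.0 is faster than playback."""
    return wall_clock_s / audio_s


def weight_footprint_mb(params: int, bytes_per_param: int) -> float:
    """Approximate size of the raw weights in megabytes (10**6 bytes)."""
    return params * bytes_per_param / 1e6


# Upper bound on the real-time factor for a full-length 11-second clip.
rtf = real_time_factor(WALL_CLOCK_SECONDS, AUDIO_SECONDS)
print(f"real-time factor (upper bound): {rtf:.2f}")

for name, width in BYTES_PER_PARAM.items():
    print(f"{name}: ~{weight_footprint_mb(PARAMS, width):.0f} MB of weights")
```

A real-time factor below 1.0 means a full-length clip is ready before it would finish playing, which is what makes on-device use plausible for interactive applications.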

Architecture: The model pairs a text encoder with an audio synthesis stage to generate audio from text prompts. The architecture is optimized for Arm CPUs, keeping compute requirements low enough for phone-class hardware.
Implementation: Developers can download the model weights from Hugging Face and access the code on GitHub. The provided resources include detailed documentation and sample implementations to help developers get started.
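As a minimal sketch of the download step, the snippet below uses the `huggingface_hub` library's `snapshot_download` function. The repository id `stabilityai/stable-audio-open-small` and the local directory are assumptions for illustration, not details confirmed by this article; check the model page for the exact id. The repository may also require accepting a license and authenticating before the download succeeds. `fetch_model` and `list_weight_files` are hypothetical helper names.

```python
from pathlib import Path

# Assumed repository id; verify it on Hugging Face before relying on it.
REPO_ID = "stabilityai/stable-audio-open-small"


def fetch_model(repo_id: str = REPO_ID,
                local_dir: str = "models/stable-audio-open-small") -> Path:
    """Download a snapshot of the model repository and return its local path."""
    # Imported lazily so this module loads even where huggingface_hub is absent.
    from huggingface_hub import snapshot_download

    path = snapshot_download(repo_id=repo_id, local_dir=local_dir)
    return Path(path)


def list_weight_files(model_dir: Path) -> list[str]:
    """Return checkpoint-like file names found under model_dir, sorted by name."""
    suffixes = {".safetensors", ".ckpt", ".bin", ".pt"}
    return sorted(p.name for p in model_dir.rglob("*") if p.suffix in suffixes)


# Example usage (performs a network download, so it is left commented out):
# model_dir = fetch_model()
# print(list_weight_files(model_dir))
```

Listing the checkpoint files after the download is a quick way to confirm which weight format the repository ships before wiring it into an inference pipeline.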
The release of Stable Audio Open Small is a notable step for on-device text-to-audio generation. By combining Arm's CPU optimizations with Stability AI's compact model, developers can build efficient audio applications that run directly on mobile devices. Whether you’re building a voice assistant, enhancing a game, or creating an interactive story, the model offers a fast, fully local option to build on.
About the author
Kai built ML infrastructure at a Bay Area startup before developing an obsession with transformer architectures and inference optimisation that eventually pulled him out of product work entirely. A stint at a compute research lab sharpened his instinct for what actually matters in a model release versus what is marketing. He writes from the inside — from the perspective of someone who has debugged the systems he is describing at three in the morning. He is allergic to hype and instinctively drawn to the unglamorous plumbing questions that everyone else skips over.