
Alibaba's compact Qwen3.5 models defy size limitations, outstripping OpenAI’s larger GPT-OSS with superior multimodal skills and efficiency for edge devices, redefining AI performance standards.
Alibaba’s Qwen Team, a leading AI research group within the e-commerce giant, has unveiled its latest batch of open-source models, the Qwen3.5 Small Model Series. These new additions are designed to be lightweight and efficient, making them ideal for edge devices where battery life and performance are critical. Despite their smaller size, these models boast impressive capabilities, particularly in multimodal tasks and reasoning, outperforming larger models like OpenAI’s GPT-OSS-120B.
The Qwen3.5 Small Model Series includes:
The technical underpinnings of the Qwen3.5 small models represent a significant departure from traditional Transformer architectures. Alibaba’s researchers have developed an Efficient Hybrid Architecture that combines Gated Delta Networks (a form of linear attention) with sparse Mixture-of-Experts (MoE).
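To make the two ingredients concrete, here is a minimal pure-Python sketch of a single-head gated delta rule update and a top-k expert router. It follows the published Gated Delta Networks recurrence in general form, not Alibaba's implementation. The function names, shapes, and the choice of k=2 are assumptions made for illustration and do not come from the Qwen3.5 release.

```python
import math


def gated_delta_step(S, q, k, v, alpha, beta):
    """One recurrent step of a gated delta rule (illustrative, single head).

    S is a d_v x d_k state matrix (list of rows); q and k have length d_k,
    v has length d_v. alpha decays the old state, beta is the write strength.
    Update: S_new = alpha * (S - beta * (S k) k^T) + beta * v k^T
    """
    # Value the current state already associates with key k.
    pred = [sum(s * kj for s, kj in zip(row, k)) for row in S]
    S_new = [
        [alpha * (S[i][j] - beta * pred[i] * k[j]) + beta * v[i] * k[j]
         for j in range(len(k))]
        for i in range(len(v))
    ]
    # Read the updated state with the query.
    out = [sum(s * qj for s, qj in zip(row, q)) for row in S_new]
    return S_new, out


def top_k_route(logits, k=2):
    """Pick the k highest-scoring experts and softmax-normalise their weights."""
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    m = max(logits[i] for i in top)
    exps = [math.exp(logits[i] - m) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


# Toy usage: write a value under a key, then overwrite it under the same key.
S = [[0.0, 0.0], [0.0, 0.0]]
S, out1 = gated_delta_step(S, q=[1.0, 0.0], k=[1.0, 0.0], v=[3.0, 4.0], alpha=1.0, beta=1.0)
S, out2 = gated_delta_step(S, q=[1.0, 0.0], k=[1.0, 0.0], v=[5.0, 6.0], alpha=1.0, beta=1.0)
routes = top_k_route([0.0, 2.0, 1.0, 2.0], k=2)
print(out1, out2, routes)
```

The state has a fixed size regardless of sequence length, which is where the linear-attention efficiency comes from, and the router means only k experts run per token. Both points hold for the general techniques; how Qwen3.5 combines them is not detailed here.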
One of the standout features of Qwen3.5 is its native multimodal capability. Unlike previous models that often "bolted on" a vision encoder to a text model, Qwen3.5 was trained using early fusion on multimodal tokens. This means that during training, both visual and textual data were processed together from the start. As a result, the 4B and 9B models exhibit superior performance in tasks involving multiple modalities, such as image captioning and visual question answering.
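As a rough illustration of what "processed together from the start" can mean, the sketch below builds a single tagged sequence from image patches and text tokens, so that one model would see both modalities in the same input. It is a toy under stated assumptions: `patchify`, `early_fusion_sequence`, the patch size, and the image-before-text ordering are hypothetical and are not taken from the Qwen3.5 training pipeline.

```python
def patchify(image, patch):
    """Split a 2-D grid (list of equal-length rows) into flattened tiles.

    Assumes the height and width are multiples of `patch`.
    """
    height, width = len(image), len(image[0])
    tiles = []
    for r in range(0, height, patch):
        for c in range(0, width, patch):
            tiles.append([image[r + dr][c + dc]
                          for dr in range(patch) for dc in range(patch)])
    return tiles


def early_fusion_sequence(image, text_tokens, patch=2):
    """Build one sequence holding both modalities, each item tagged by type."""
    sequence = [("image", tile) for tile in patchify(image, patch)]
    sequence += [("text", tok) for tok in text_tokens]
    return sequence


image = [[r * 4 + c for c in range(4)] for r in range(4)]  # toy 4x4 "image"
seq = early_fusion_sequence(image, ["what", "is", "shown", "?"])
print(len(seq), seq[0], seq[-1])
```

In a "bolted on" design, by contrast, the image would first pass through a separately trained encoder and only its output would reach the text model.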

The weights for these models are available under the Apache 2.0 license, making them suitable for enterprise and commercial use. You can access the models on popular platforms like Hugging Face and ModelScope. This open-source approach encourages collaboration and innovation, allowing developers to customize and build upon these models as needed.
To put the performance of Qwen3.5 in perspective, consider its benchmarks:
These models are among the smallest general-purpose models recently released by any lab globally. They are comparable in size to the LFM2 series from MIT offshoot Liquid AI, whose models also range from several hundred million to a few billion parameters, rather than the trillion-parameter scale reportedly used in flagship models from OpenAI, Anthropic, and Google.
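The size difference matters on edge devices mainly because of weight memory, which is roughly parameter count times bytes per parameter. The sketch below runs that arithmetic for the 4B and 9B models. It assumes the nominal counts of 4e9 and 9e9 parameters, ignores activations and KV or recurrent state, and uses a hypothetical helper name, `weight_memory_gib`.

```python
# Bytes needed to store one parameter at common precisions.
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gib(num_params, dtype="fp16"):
    """Approximate weight storage in GiB (weights only, no runtime overhead)."""
    return num_params * BYTES_PER_PARAM[dtype] / 2**30


# Nominal parameter counts are an assumption based on the model names.
for name, n in [("4B", 4e9), ("9B", 9e9)]:
    row = ", ".join(f"{d}: {weight_memory_gib(n, d):.1f} GiB"
                    for d in ("fp16", "int8", "int4"))
    print(f"{name} -> {row}")
```

By the same arithmetic, a one-trillion-parameter model would need about 1,863 GiB at fp16 for weights alone, which is why models at that scale stay in the data centre.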
Alibaba’s Qwen3.5 Small Model Series represents a significant advancement in AI research, particularly for applications
About the author
Kai built ML infrastructure at a Bay Area startup before developing an obsession with transformer architectures and inference optimisation that eventually pulled him out of product work entirely. A stint at a compute research lab sharpened his instinct for what actually matters in a model release versus what is marketing. He writes from the inside — from the perspective of someone who has debugged the systems he is describing at three in the morning. He is allergic to hype and instinctively drawn to the unglamorous plumbing questions that everyone else skips over.
3 March 2026