
Cohen and Halpern's technique breaks complex interactions into simpler tasks, enabling small models to predict user intent accurately, which could make on-device AI assistance more practical.
January 22, 2026
Danielle Cohen and Yoni Halpern, Software Engineers at Google, have introduced a novel approach to understanding user intents from UI interaction trajectories using small models. This method outperforms significantly larger models, making it an exciting development for on-device applications.
As AI technologies advance, the goal is to create agents that can better anticipate and assist with user needs. For mobile devices, this means understanding what users are doing or trying to do when they interact with apps. This context helps predict potential next actions, enhancing user experience. For instance, if a user has been searching for music festivals in Europe and then looks for flights to London, an intelligent agent could suggest festivals in London on the specific dates of interest.
Multimodal large language models (MLLMs) are already adept at understanding user intent from UI trajectories. However, using them for this task often involves sending data to a server, which can be slow and costly and may expose sensitive information. This is where small models come in: lightweight, efficient, and capable of running on-device.
In their paper "Small Models, Big Results: Achieving Superior Intent Extraction Through Decomposition," presented at EMNLP 2025, Cohen and Halpern propose a two-stage approach to make user intent understanding more tractable for small models:
Stage 1: Summarize Each Screen Separately
Stage 2: Extract Intent from Summaries
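The two stages can be sketched as a small pipeline. This is a minimal illustration of the decomposition idea only, not the authors' implementation: the names `Interaction`, `extract_intent`, `stub_summarize`, and `stub_extract` are hypothetical, and the two stub functions stand in for whatever small models would actually produce screen summaries and the final intent.

```python
from dataclasses import dataclass
from typing import Callable, List


# Hypothetical types for illustration; none of these names come from the paper.
@dataclass
class Interaction:
    screen_description: str  # stand-in for a screenshot or UI representation
    action: str              # what the user did on that screen


Summarizer = Callable[[Interaction], str]
IntentExtractor = Callable[[List[str]], str]


def extract_intent(
    trajectory: List[Interaction],
    summarize: Summarizer,
    extract: IntentExtractor,
) -> str:
    """Run the two-stage decomposition over one UI trajectory."""
    # Stage 1: summarize each screen separately, so the model only
    # ever looks at a single interaction at a time.
    summaries = [summarize(step) for step in trajectory]
    # Stage 2: extract a single intent from the ordered summaries.
    return extract(summaries)


# Stub "models" so the sketch runs without any ML dependency.
def stub_summarize(step: Interaction) -> str:
    return f"On '{step.screen_description}', the user did: {step.action}"


def stub_extract(summaries: List[str]) -> str:
    return "Intent inferred from: " + " | ".join(summaries)


if __name__ == "__main__":
    trajectory = [
        Interaction("festival search results", "searched music festivals in Europe"),
        Interaction("flight search form", "searched flights to London"),
    ]
    print(extract_intent(trajectory, stub_summarize, stub_extract))
```

Passing the two stages in as plain callables keeps them independent, so each one could be swapped for a different model without touching the other.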

The researchers formalized metrics to evaluate model performance, ensuring that the approach could be rigorously tested. Here are some key implementation details:
Model Architecture:
Training Data:
Benchmarks:
This approach has significant implications for on-device applications, particularly in mobile devices and wearables where resource constraints are common. By enabling more efficient and private user intent understanding, it can enhance the user experience across a range of applications, from personal assistants to e-commerce platforms.
The researchers suggest several avenues for future work, including:
The decomposition approach presented by Cohen and Halpern offers a promising solution for on-device user intent understanding. By breaking down the task into manageable stages, small models can achieve results that rival those of much larger models, making them an attractive option for resource-constrained environments.
About the author
Kai built ML infrastructure at a Bay Area startup before developing an obsession with transformer architectures and inference optimisation that eventually pulled him out of product work entirely. A stint at a compute research lab sharpened his instinct for what actually matters in a model release versus what is marketing. He writes from the inside — from the perspective of someone who has debugged the systems he is describing at three in the morning. He is allergic to hype and instinctively drawn to the unglamorous plumbing questions that everyone else skips over.