
Jacobine's study finds that AI agents' success rates fall exponentially as tasks get longer, which challenges common assumptions about how reliably agents can sustain long-running work.
In a recent study, Matrice Jacobine builds on the empirical work of Kwa et al. (2025) to examine how AI agents perform on tasks of varying duration. The key finding is that success rates decline exponentially as task length increases, consistent with a simple model in which the agent has a constant chance of failing in each minute of work. A model of this kind can be summarized by a single number, the half-life: the task length at which the agent succeeds half the time.
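To make the model concrete, here is a minimal Python sketch of a constant-failure-rate process. It is an illustration under stated assumptions, not code from the study: the function names and the 60-minute half-life are hypothetical, chosen only to show the shape of the curve. Under a constant failure rate, the probability of finishing a task of length t minutes is 0.5 raised to the power (t / half-life), which is the same as exp(-rate * t) with rate = ln(2) / half-life.

```python
import math


def failure_rate_from_half_life(half_life_min: float) -> float:
    """Constant per-minute failure (hazard) rate implied by a half-life."""
    return math.log(2) / half_life_min


def success_probability(task_min: float, half_life_min: float) -> float:
    """Probability of completing a task of task_min minutes.

    Assumes a constant failure rate, so success halves with every
    additional half-life of task length.
    """
    return 0.5 ** (task_min / half_life_min)


def max_task_length(target_success: float, half_life_min: float) -> float:
    """Longest task (in minutes) completed with probability target_success."""
    return half_life_min * math.log(target_success) / math.log(0.5)


if __name__ == "__main__":
    half_life = 60.0  # hypothetical value, for illustration only
    for minutes in (15, 60, 120, 240):
        p = success_probability(minutes, half_life)
        print(f"{minutes:>4} min task: {p:.1%} expected success")
    print(f"90% success up to {max_task_length(0.9, half_life):.1f} min")
```

With the assumed 60-minute half-life, success is 50% at 60 minutes and 25% at 120 minutes. The last helper inverts the curve: a team that needs a given reliability can ask how long a task the agent can be trusted with, and demanding high reliability shortens that length to a small fraction of the half-life.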
For practitioners and researchers, this model provides a clear framework for understanding and predicting the performance of AI agents on longer tasks. Here are the key implications:

Jacobine's findings leave several questions open:
The finding that AI agents' success rates decline exponentially with task duration gives practitioners a useful tool. By applying the half-life concept, teams can better predict performance, allocate resources, and identify areas for improvement. Further research is needed to validate these findings and extend them to broader applications.
About the author
Kai built ML infrastructure at a Bay Area startup before developing an obsession with transformer architectures and inference optimisation that eventually pulled him out of product work entirely. A stint at a compute research lab sharpened his instinct for what actually matters in a model release versus what is marketing. He writes from the inside — from the perspective of someone who has debugged the systems he is describing at three in the morning. He is allergic to hype and instinctively drawn to the unglamorous plumbing questions that everyone else skips over.
9 May 2025