
Scientists at Percepta AI have developed a method that turns a language model into a computational engine, able to run long, exact computations itself.
Language models have shown remarkable capabilities in solving complex mathematical problems, often reaching research-grade solutions. However, they falter on simpler computational tasks that require multi-step reasoning and long context handling. Even basic operations like multiplying two numbers or solving small Sudoku puzzles are challenging without external tools.
But what if an LLM could execute these tasks as reliably and efficiently as a traditional computer? Recent work from Percepta AI shows how to turn a transformer model into a computational engine that executes arbitrary C code, running millions of steps in seconds.
The key innovation lies in converting C code into tokens that the language model can process directly. This allows the model to execute programs step-by-step, generating an execution trace and streaming results at high speeds. Here’s how it works when solving a min-cost perfect matching problem using the Hungarian algorithm:
Input (10×10 Cost Matrix):
61 58 35 86 32 39 41 27 21 42
59 77 97 99 78 21 89 72 35 63
88 85 37 57 59 97 37 29 69 94
32 82 53 20 77 96 21 70 50 61
15 44 81 10 64 36 56 78 20 69
76 35 87 69 16 55 26 37 30 66
86 32 74 94 32 14 24 12 31 70
97 63 20 64 90 21 28 49 89 10
58 52 27 76 61 35 17 91 37 66
42 79 61 26 55 98 70 17 26 86
Output: the model executes the program directly using its transformer weights, producing a readable log and token trace. It streams results at more than 30k tokens/sec on a CPU.
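For readers who want to reproduce the reference answer on an ordinary interpreter, here is a minimal Python sketch of the Hungarian algorithm in its standard O(n³) potentials form, applied to the cost matrix above. This is not Percepta's C program or its execution trace; the names `hungarian`, `brute_force_cost` and `COST` are illustrative choices made for this sketch.

```python
from itertools import permutations

# The 10x10 cost matrix from the article.
COST = [
    [61, 58, 35, 86, 32, 39, 41, 27, 21, 42],
    [59, 77, 97, 99, 78, 21, 89, 72, 35, 63],
    [88, 85, 37, 57, 59, 97, 37, 29, 69, 94],
    [32, 82, 53, 20, 77, 96, 21, 70, 50, 61],
    [15, 44, 81, 10, 64, 36, 56, 78, 20, 69],
    [76, 35, 87, 69, 16, 55, 26, 37, 30, 66],
    [86, 32, 74, 94, 32, 14, 24, 12, 31, 70],
    [97, 63, 20, 64, 90, 21, 28, 49, 89, 10],
    [58, 52, 27, 76, 61, 35, 17, 91, 37, 66],
    [42, 79, 61, 26, 55, 98, 70, 17, 26, 86],
]


def hungarian(cost):
    """Return (total_cost, assignment) with assignment[row] = column."""
    n = len(cost)
    inf = float("inf")
    u = [0] * (n + 1)    # row potentials (1-indexed)
    v = [0] * (n + 1)    # column potentials (1-indexed)
    p = [0] * (n + 1)    # p[j] = row currently matched to column j, 0 = none
    way = [0] * (n + 1)  # way[j] = previous column on the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [inf] * (n + 1)
        used = [False] * (n + 1)
        while True:
            used[j0] = True
            i0 = p[j0]
            delta = inf
            j1 = 0
            for j in range(1, n + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j] = cur
                        way[j] = j0
                    if minv[j] < delta:
                        delta = minv[j]
                        j1 = j
            # Shift potentials so at least one new zero-reduced-cost edge appears.
            for j in range(n + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        # Flip the matching along the augmenting path back to column 0.
        while j0:
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    assignment = [0] * n
    for j in range(1, n + 1):
        assignment[p[j] - 1] = j - 1
    total = sum(cost[r][assignment[r]] for r in range(n))
    return total, assignment


def brute_force_cost(cost):
    """Exhaustive minimum, only practical for small matrices."""
    n = len(cost)
    return min(sum(cost[r][c] for r, c in enumerate(perm))
               for perm in permutations(range(n)))


total, assignment = hungarian(COST)
print("min cost:", total)
print("row -> column:", assignment)
```

The exhaustive helper is there only as a cross-check on small inputs, for example the top-left 6×6 corner of the matrix; at 10×10 it would have to enumerate about 3.6 million permutations.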

The core technical idea is a new decoding path that reduces attention lookups from linear scans to logarithmic-time queries. This enables the model to perform millions of correct execution steps within a single run.
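The article does not spell out how the decoding path achieves this, so the following is not Percepta's mechanism. It is only an analogy in plain Python for the difference between a linear scan and a logarithmic-time query over an append-only execution trace. The `Trace` class, its methods, and the idea of modelling a lookup as "find the latest write to an address" are all assumptions made for this sketch.

```python
from bisect import bisect_left
from collections import defaultdict


class Trace:
    """Append-only log of memory writes, one write per execution step."""

    def __init__(self):
        self.writes = []                 # (address, value), list index = step
        self.steps = defaultdict(list)   # address -> ascending list of steps
        self.values = defaultdict(list)  # address -> values, parallel to steps

    def write(self, address, value):
        step = len(self.writes)
        self.writes.append((address, value))
        self.steps[address].append(step)
        self.values[address].append(value)
        return step

    def read_linear(self, address, before_step):
        """Walk the log backwards: cost grows with the length of the trace."""
        for step in range(min(before_step, len(self.writes)) - 1, -1, -1):
            addr, value = self.writes[step]
            if addr == address:
                return value
        return None

    def read_indexed(self, address, before_step):
        """Binary-search one address's history: logarithmic in its write count."""
        if address not in self.steps:
            return None
        history = self.steps[address]
        i = bisect_left(history, before_step)  # writes with step < before_step
        if i == 0:
            return None
        return self.values[address][i - 1]


trace = Trace()
trace.write("x", 1)   # step 0
trace.write("y", 2)   # step 1
trace.write("x", 3)   # step 2
print(trace.read_linear("x", 2), trace.read_indexed("x", 2))
print(trace.read_linear("x", 3), trace.read_indexed("x", 3))
```

Both methods return the same value for every query. The difference is that the scan touches every earlier step in the worst case, which adds up quickly when a single run contains millions of steps.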
State-of-the-art language models excel at complex mathematics yet struggle with basic computation. The gap shows up in benchmarks such as Sudoku-Bench, where unaided solve rates are low.
To bridge this gap, practitioners often use two approaches:
These workarounds are effective but highlight a fundamental limitation: LLMs do not reliably perform long, exact computations on their own. The ability to execute programs directly within the model itself represents a significant step forward in addressing this limitation.
About the author
Kai built ML infrastructure at a Bay Area startup before developing an obsession with transformer architectures and inference optimisation that eventually pulled him out of product work entirely. A stint at a compute research lab sharpened his instinct for what actually matters in a model release versus what is marketing. He writes from the inside — from the perspective of someone who has debugged the systems he is describing at three in the morning. He is allergic to hype and instinctively drawn to the unglamorous plumbing questions that everyone else skips over.
16 March 2026