AlphaProof Nexus Cracks Decades-Old Math Problems for a Fraction of the Cost

Tools & Engineering

The Engineer

8 Jun 2026 · 3 min read

Google DeepMind's AlphaProof Nexus pairs language models with machine verification in Lean to solve open mathematical problems, at an inference cost of just a few hundred dollars per problem.

Google DeepMind’s AlphaProof Nexus has solved a number of hard open problems in mathematics. The system autonomously solved nine of the 353 Erdős problems it attempted (about 2.5%), including two questions that had stumped mathematicians for 56 years. It also proved 44 of 492 conjectures (roughly 9%) from the On-Line Encyclopedia of Integer Sequences (OEIS), settled a 15-year-old question about Hilbert functions in algebraic geometry, and improved a known bound in convex optimization. Inference costs ran just a few hundred dollars per problem.

How AlphaProof Nexus Works

Unlike pure natural-language approaches, such as OpenAI's recent solution, AlphaProof Nexus takes a hybrid approach: the language model writes proof steps in Lean’s formal language, and the Lean compiler checks each step for correctness. Because nothing counts as proved until the compiler accepts it, the system stays grounded in symbolic feedback, which helps offset language models' well-known weaknesses in logical reasoning.
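The generate-and-check cycle described above can be sketched in a few lines of Python. This is an illustration only, not DeepMind's code: the names `CheckResult`, `refine_loop`, `toy_generate`, and `toy_check` are hypothetical. The generator stands in for the language model, the checker stands in for the Lean compiler, and the toy task (find the integer whose square is 1764) replaces a real proof obligation.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

# ASSUMPTION: all names below are hypothetical stand-ins for illustration.
# In the real system the generator is a language model and the checker is
# the Lean compiler; here both are toys so the loop can run on its own.

History = List[Tuple[int, str]]  # (rejected candidate, checker message)


@dataclass
class CheckResult:
    ok: bool
    message: str  # error text fed back to the generator; empty on success


def refine_loop(
    generate: Callable[[History], int],
    check: Callable[[int], CheckResult],
    max_attempts: int = 32,
) -> Optional[int]:
    """Propose, verify, and retry with the checker's feedback."""
    history: History = []
    for _ in range(max_attempts):
        candidate = generate(history)
        result = check(candidate)
        if result.ok:
            return candidate  # accepted by the checker
        history.append((candidate, result.message))
    return None  # attempt budget exhausted without an accepted candidate


TARGET = 1764  # toy goal: find x with x * x == TARGET


def toy_check(candidate: int) -> CheckResult:
    """Deterministic checker: accepts only an exact square root."""
    square = candidate * candidate
    if square == TARGET:
        return CheckResult(True, "")
    return CheckResult(False, "too small" if square < TARGET else "too large")


def toy_generate(history: History) -> int:
    """Narrow the search range using every earlier checker message."""
    low, high = 0, TARGET
    for candidate, message in history:
        if message == "too small":
            low = max(low, candidate + 1)
        else:
            high = min(high, candidate - 1)
    return (low + high) // 2


if __name__ == "__main__":
    print(refine_loop(toy_generate, toy_check))
```

The point of the design is that the generator is never trusted: a candidate is returned only after the checker accepts it, and each rejection adds information to the next attempt.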

The system comes in four agent variants of increasing complexity:

Agent (A): The simplest variant uses independent sub-agents running on Gemini 3.1 Pro in loops. The language model generates proof steps, the Lean compiler checks them, and error messages feed back into the next attempt.
Agent (B): Adds queries to AlphaProof, a reinforcement-learning-based system for olympiad math, which can fill in missing proof segments.
Agent (C): Introduces an evolutionary component. Sub-agents share a common population of proof sketches. Rating agents built on Gemini 3.0 Flash score these sketches for plausibility and novelty, then rank them using an Elo system.
Agent (D): Combines all the capabilities of Agents A, B, and C.
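To make the Elo ranking in Agent (C) concrete, here is a minimal sketch of the standard Elo update applied to a population of proof sketches. Everything specific in it is an assumption: the article does not say how comparisons are paired, what K-factor or starting rating is used, or how the rating agents' scores become wins and losses. The `prefer` callback is a hypothetical stand-in for a rating agent's judgment, and the function names are illustrative.

```python
import itertools
from typing import Callable, Dict, List, Tuple


def expected_score(rating_a: float, rating_b: float) -> float:
    """Standard Elo expectation that A beats B."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def update(ratings: Dict[str, float], winner: str, loser: str, k: float = 32.0) -> None:
    """Move points from loser to winner; upsets move more points."""
    delta = k * (1.0 - expected_score(ratings[winner], ratings[loser]))
    ratings[winner] += delta
    ratings[loser] -= delta


def rank_sketches(
    sketches: List[str],
    prefer: Callable[[str, str], bool],  # ASSUMPTION: stand-in for a rating agent
    rounds: int = 1,
    initial: float = 1000.0,             # ASSUMPTION: arbitrary starting rating
    k: float = 32.0,                     # ASSUMPTION: conventional K-factor
) -> Tuple[List[str], Dict[str, float]]:
    """Compare every pair of sketches and return them best-first."""
    ratings = {sketch: initial for sketch in sketches}
    for _ in range(rounds):
        for a, b in itertools.combinations(sketches, 2):
            winner, loser = (a, b) if prefer(a, b) else (b, a)
            update(ratings, winner, loser, k)
    ordered = sorted(sketches, key=lambda s: ratings[s], reverse=True)
    return ordered, ratings


if __name__ == "__main__":
    # Toy judge: prefers the longer string, purely to exercise the code.
    order, scores = rank_sketches(["x", "xxx", "xx"], lambda a, b: len(a) > len(b))
    print(order, scores)
```

A pairwise scheme like this needs only relative judgments ("this sketch looks more promising than that one"), which are often easier to produce consistently than absolute scores.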

Agent (D) was used to tackle the Erdős problems. Post-hoc analysis showed that the simpler agents could also solve some of the hard problems, which suggests that the overall framework is robust.

In Practice

AlphaProof Nexus has practical implications for both mathematicians and software developers. For mathematicians, it is a tool for exploring and verifying conjectures, which could accelerate mathematical research. For software developers, especially those working on formal verification and automated reasoning, it points to a new approach to ensuring code correctness.

Brian Holt, a Program Manager at Microsoft, emphasized the potential of AI tools like AlphaProof Nexus in his recent talk on AI-enabled engineers. "These tools can serve as helpful assistants in data analysis, design, documentation, coding, inspection, manufacturing, and much more," he said. "They not only enhance productivity but also help ensure that the solutions we develop are logically sound and reliable."

Cost is another significant factor. At a few hundred dollars of inference per problem, the system is within reach of researchers and developers who might otherwise be constrained by budget. That affordability could broaden access to advanced AI tools and enable more widespread innovation.

AlphaProof Nexus represents a significant step forward at the intersection of AI and mathematics. By combining powerful language models with machine verification, it has resolved long-standing problems at a fraction of the traditional cost. As the technology evolves, its impact on both mathematical research and software development is likely to grow.