
AI tools are cracking math puzzles in the Erdős database, but researchers find that many of the solutions aren't new, just rediscoveries of work already in the academic literature, raising questions about AI's true problem-solving prowess.
In recent weeks, AI tools have autonomously solved several problems from the Erdős problem database, only for it to turn out that these problems had already been addressed in the academic literature years ago. This has sparked a debate among mathematicians and AI researchers about the capabilities and limitations of current AI systems.
The Erdős problem database is a collection of open mathematical problems posed by the prolific mathematician Paul Erdős. These problems span various areas of mathematics, from number theory to combinatorics, and are known for being simple to state but often hard to solve. The database serves as a benchmark for both human and machine solvers.
Terence Tao, a renowned mathematician and Fields Medalist, has highlighted several cases in which AI tools successfully solved Erdős problems.
These solutions were generated by AI systems with minimal human intervention, showcasing the growing capabilities of machine learning in mathematical problem-solving.
One possible explanation for these coincidences is "contamination": the data used to train these AI tools might have inadvertently included the solutions to these problems. If the models were trained on papers or discussions containing these solutions, that could explain why they arrived at correct answers.
However, this theory has its limitations. Other deep research tools failed to connect these problems to their existing solutions in the literature, indicating that contamination alone is unlikely to be the full explanation.

Tao proposes an alternative theory: AI tools are now becoming capable of solving "low-hanging fruit" problems in the Erdős database. By "low-hanging fruit," he means problems that can be solved using fairly standard techniques and simple proofs. These types of problems, while listed as open, are often the ones most likely to have been solved in obscure parts of the literature without much fanfare.
This correlation between AI-solvability and prior solutions in the literature suggests that current AI tools are effective at identifying and solving straightforward mathematical problems. However, this also means that they might be overlooking more complex or novel challenges that require deeper insight and creativity.
Tao notes that this trend is likely to continue in the near term, especially for problems tackled purely by AI without significant human oversight. Nevertheless, the progress made by these tools is non-trivial and indicates a promising future for automated problem-solving in mathematics.
The ability of AI systems to automatically scan through the "long tail" of underexamined problems in the mathematical literature is a significant step forward. This capability could help mathematicians identify and solve problems that might have been overlooked or deemed too trivial for publication.
However, it also highlights the need for better integration of human expertise with AI tools. Combining the strengths of both humans and machines could lead to more robust and innovative solutions in mathematics and other fields.
The recent successes of AI in solving Erdős problems are a testament to the rapid advancements in machine learning and artificial intelligence. While these achievements are impressive, they also raise important questions about the nature of problem-solving and the role of human oversight in AI research.
As AI tools continue to evolve, it will be crucial to strike a balance between automation and collaboration, ensuring that both humans and machines contribute to the advancement of mathematical knowledge.
About the author
Kai built ML infrastructure at a Bay Area startup before developing an obsession with transformer architectures and inference optimisation that eventually pulled him out of product work entirely. A stint at a compute research lab sharpened his instinct for what actually matters in a model release versus what is marketing. He writes from the inside — from the perspective of someone who has debugged the systems he is describing at three in the morning. He is allergic to hype and instinctively drawn to the unglamorous plumbing questions that everyone else skips over.