A team of Tsinghua alumni achieved impressive mathematical breakthroughs with the help of AI! The LeanAgent they developed, an AI assistant with lifelong learning capabilities, successfully proved 162 previously unsolved mathematical theorems and even solved the formal problem of the polynomial Freiman-Ruzsa conjecture proposed by Terence Tao. This not only demonstrates the huge potential of AI in basic scientific research, but also marks an innovation in mathematical research methods. Downcodes editors will give you an in-depth understanding of this exciting development and the technological innovation behind LeanAgent.
In the latest sensational news in the mathematics world, a group of alumni from Tsinghua University used the power of AI to successfully prove 162 mathematical theorems that no one had been able to solve before. What's even more amazing is that this agent named LeanAgent actually overcame Terence Tero's formalization problem of the polynomial Freiman-Ruzsa conjecture! This makes us lament that the research methods of basic science have been completely transformed by AI.
As we all know, although the current language models (LLM) are cool, most of them are still static and cannot learn new knowledge online. It is even more difficult to prove advanced mathematical theorems. However, LeanAgent, jointly developed by research teams from Caltech, Stanford University, and the University of Washington, is an AI assistant with lifelong learning capabilities that can continuously learn and prove theorems.
LeanAgent responds to different mathematical difficulties through carefully designed learning paths, and uses a dynamic database to manage a continuous stream of mathematical knowledge to ensure that it does not forget the skills it has already mastered when learning new knowledge. Experiments show that it successfully proved 162 mathematical theorems that no one has been able to solve before from 23 different Lean code libraries, and its performance is 11 times higher than that of traditional large models. It is really amazing!
These theorems cover many areas of higher mathematics, including difficult problems such as abstract algebra and algebraic topology. Not only is LeanAgent able to start with simple concepts and gradually tackle complex topics, it also demonstrates excellent performance in terms of stability and reverse migration.
However, Terence Tao's challenge still makes AI feel helpless. Although interactive theorem provers (ITPs) such as Lean play an important role in formalizing and verifying mathematical proofs, the process of building such proofs is often complex and time-consuming, requiring meticulous steps and extensive mathematical code libraries. Advanced large models like o1 and Claude are also prone to errors when faced with informal proofs, which highlights the shortcomings of LLM in the accuracy and reliability of mathematical proofs.
Past research has attempted to use LLM to generate complete proof steps, such as LeanDojo, which is a theorem prover created by training a large model on a specific data set. However, data for formal theorem proofs are extremely scarce, limiting the widespread applicability of this approach. Another project, ReProver, is a model optimized for the Lean theorem proving code library mathlib4. Although it covers more than 100,000 formal mathematical theorems and definitions, it is limited to the scope of undergraduate mathematics, so it does not perform well when facing more complex problems. good.
It is worth noting that the dynamic nature of mathematical research brings greater challenges to AI. Mathematicians usually work on multiple projects at the same time or alternately. For example, Terence Tao is advancing multiple research areas at the same time, including PFR conjecture and real number symmetric average. These examples show a key shortcoming of current AI theorem proving methods: the lack of an AI system that can adaptively learn and improve in different mathematical fields, especially when Lean data is limited.
That's why the team at LeanDojo created LeanAgent, a new lifelong learning framework designed to solve the above challenges. LeanAgent's workflow includes deriving theorem complexity to formulate learning courses, balancing stability and flexibility in the learning process through progressive training, and utilizing best-first tree search to find as yet unproven theorems.
LeanAgent works with any large model to improve its generalization capabilities through "retrieval". Its innovation lies in the use of a custom dynamic database to manage ever-expanding mathematical knowledge, and a course learning strategy based on Lean proof structure to help learn more complex mathematical content.
To deal with the catastrophic forgetting problem of AI, researchers adopted a progressive training method to enable LeanAgent to continuously adapt to new mathematical knowledge without forgetting previous learning. This process involves incremental training on each new data set, ensuring an optimal balance of stability and flexibility.
Through this kind of progressive training, LeanAgent has excellent performance in proving theorems, successfully proving 162 difficult problems that have not been solved by humans, especially in challenging theorems of abstract algebra and algebraic topology. Its ability to prove new theorems is 11 times higher than that of static ReProver, while maintaining its ability to prove known theorems.
LeanAgent shows the characteristics of progressive learning during the theorem proving process, gradually transitioning from simple theorems to more complex theorems, proving its depth in mastering mathematical knowledge. For example, it demonstrates a deep understanding of mathematics by proving fundamental algebraic structure theorems related to group theory and ring theory. Overall, LeanAgent brings exciting prospects to the mathematics community with its powerful continuous learning and improvement capabilities!
Paper address: https://arxiv.org/pdf/2410.06209
The emergence of LeanAgent indicates that AI will play an increasingly important role in the field of mathematical research, and there may be more cases in the future that use the power of AI to solve complex mathematical problems. This undoubtedly opens up a new direction for mathematical research, and also provides new ideas and methods for exploration in other scientific fields. Looking forward to more amazing results in the future!