Google DeepMind's latest AI system, AlphaGeometry2, has made significant progress in solving geometry problems, outperforming the average gold medalist at the International Mathematical Olympiad (IMO). The result demonstrates the potential of AI in mathematics and may point to a new direction for the development of general-purpose AI.
AlphaGeometry2, the latest AI system from the Google DeepMind research lab, excels at solving geometry problems and outperforms the average gold medalist at the International Mathematical Olympiad (IMO). The system is an improved version of AlphaGeometry, and the researchers say AlphaGeometry2 can solve 84% of the IMO geometry problems from the past 25 years.
Why does DeepMind focus on a high school mathematics competition? The lab believes that finding new ways to solve hard geometry problems, especially in Euclidean geometry, may be key to improving AI capabilities. Proving a mathematical theorem, or explaining why a theorem (such as the Pythagorean theorem) holds, requires both logical reasoning and the ability to choose among many possible steps. If DeepMind's theory holds, these problem-solving abilities will be very important for future general-purpose AI models.
In the summer of 2024, DeepMind demonstrated a system that combines AlphaGeometry2 with AlphaProof, an AI model for mathematical reasoning; it solved four of the six problems of the 2024 IMO. Beyond geometry, this approach could be extended to other areas of mathematics and science, such as complex engineering calculations.
The core components of AlphaGeometry2 are a language model from Google's Gemini family and a "symbolic engine". The Gemini model helps the symbolic engine, which applies mathematical rules, deduce feasible solutions to a problem. IMO geometry problems are usually based on diagrams that need auxiliary "constructs" added, such as points, lines, or circles, before they can be solved. AlphaGeometry2's Gemini model predicts which constructs may be helpful in solving a problem.
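As a rough illustration of this division of labor, here is a minimal sketch in Python. It is a toy under stated assumptions, not DeepMind's implementation: the names `deduce`, `solve`, and `toy_proposer` are hypothetical, facts are plain strings, the "symbolic engine" is simple forward chaining over hand-written rules, and a hard-coded function stands in for the language model that suggests constructs.

```python
from typing import Callable, Iterable, Optional

Fact = str
Rule = tuple[frozenset[Fact], Fact]  # (premises, conclusion)


def deduce(facts: set[Fact], rules: Iterable[Rule]) -> set[Fact]:
    """Toy symbolic engine: apply rules until no new fact can be derived."""
    known = set(facts)
    rule_list = list(rules)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rule_list:
            if premises <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known


def solve(
    facts: set[Fact],
    rules: Iterable[Rule],
    goal: Fact,
    propose: Callable[[set[Fact], Fact], Optional[Fact]],
    max_rounds: int = 5,
) -> tuple[bool, list[Fact]]:
    """Alternate deduction with proposed auxiliary constructs."""
    rule_list = list(rules)
    known = set(facts)
    added: list[Fact] = []
    for _ in range(max_rounds):
        known = deduce(known, rule_list)
        if goal in known:
            return True, added
        # Deduction stalled: ask the proposer (the language model's role).
        construct = propose(known, goal)
        if construct is None or construct in known:
            return False, added
        known.add(construct)
        added.append(construct)
    return goal in deduce(known, rule_list), added


# Toy problem: in triangle ABC with AB = AC, show the base angles are equal.
FACTS = {"AB = AC"}
GOAL = "angle ABC = angle ACB"
RULES: list[Rule] = [
    (frozenset({"AB = AC", "M is midpoint of BC"}),
     "triangle ABM congruent to triangle ACM"),
    (frozenset({"triangle ABM congruent to triangle ACM"}), GOAL),
]


def toy_proposer(known: set[Fact], goal: Fact) -> Optional[Fact]:
    """Hard-coded stand-in for a model that suggests a helpful construct."""
    suggestion = "M is midpoint of BC"
    return None if suggestion in known else suggestion


solved, added = solve(FACTS, RULES, GOAL, toy_proposer)
print(solved, added)  # True ['M is midpoint of BC']
```

In this toy, the rules alone cannot reach the goal from the given fact; the proof goes through only once the midpoint M is added, which mirrors the role the article attributes to the Gemini model.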
AlphaGeometry2 was trained on more than 300 million synthetic theorems and proofs generated by DeepMind itself. To evaluate it, the research team selected 45 geometry problems from IMO competitions over the past 25 years and expanded them into a set of 50 problems. AlphaGeometry2 solved 42 of them, exceeding the average gold medalist's score.
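The 84% figure quoted earlier follows directly from these numbers. A quick check, using only the counts reported above (the variable names are illustrative):

```python
solved_problems = 42  # problems AlphaGeometry2 solved
total_problems = 50   # size of the expanded benchmark set

solve_rate = solved_problems / total_problems
print(f"{solve_rate:.0%}")  # 84%
```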
However, AlphaGeometry2 still has limitations: it cannot solve problems involving a variable number of points, nonlinear equations, or inequalities. The study has nevertheless revived the debate over whether AI systems should be built on symbol manipulation or on neural networks. AlphaGeometry2 takes a hybrid approach that combines a neural network with a rule-based symbolic engine.
The success of AlphaGeometry2 points to a possible direction for the future development of general-purpose AI. Although the system is not yet fully self-sufficient, the DeepMind team's research suggests that more self-sufficient AI models may become possible in the future.
Paper: https://arxiv.org/pdf/2502.03544
Key points:
AlphaGeometry2 can solve 84% of the IMO geometry problems from the past 25 years, exceeding the average gold medalist's score.
The system takes a hybrid approach, combining a neural network with a symbolic engine to solve complex mathematical problems.
DeepMind hopes that solving geometry problems will advance research on more capable general-purpose AI.
The success of AlphaGeometry2 demonstrates the potential of AI in mathematics and suggests a direction for the development of general-purpose AI. As the technology continues to advance, AI is likely to show strong capabilities in more fields.