Artificial intelligence has achieved another breakthrough in mathematics. Google DeepMind's latest AI system, AlphaGeometry2, has surpassed the average performance of gold medalists on geometry problems from the International Mathematical Olympiad (IMO). The achievement not only demonstrates AI's potential for solving complex geometry problems, but also offers new ideas for the future development of general-purpose AI models.
Google DeepMind recently announced that AlphaGeometry2 outperformed the average IMO gold medalist on geometry questions. AlphaGeometry2 is an upgraded version of the AlphaGeometry system DeepMind released last year. In the latest research, the team reports that the system can solve 84% of the geometry problems posed at the IMO over the past 25 years.
So why would DeepMind focus on a high-school math competition? The researchers believe that new ways of solving complex geometry problems, particularly in Euclidean geometry, may be key to improving AI capabilities. Proving a mathematical theorem demands both reasoning skill and the ability to choose among possible solution steps, and DeepMind believes these problem-solving abilities may be crucial to the future development of general-purpose AI models.
This past summer, DeepMind also demonstrated a system that combined AlphaGeometry2 with AlphaProof, an AI model for formal mathematical reasoning, and solved four of the six problems at the 2024 IMO. Beyond geometry, this approach may extend to other areas of mathematics and science, and could even assist with complex engineering calculations.
At the core of AlphaGeometry2 are a language model from Google's Gemini family and a symbolic engine. The Gemini model helps the symbolic engine derive solutions using mathematical rules. The workflow is: the Gemini model predicts which constructs (points, lines, circles) might be useful for solving the problem, and the symbolic engine then performs logical deduction over those constructs. Through a series of complex searches, AlphaGeometry2 combines the Gemini model's suggestions with known principles to arrive at proofs.
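The interplay between the model's proposals and the engine's deduction can be pictured as a simple loop. Below is a minimal, hypothetical Python sketch of that propose-and-verify pattern; it is not DeepMind's code, and `propose_constructions`, `SymbolicEngine`, and the toy rule format are illustrative stand-ins based only on the workflow described above.

```python
from dataclasses import dataclass, field


@dataclass
class ProofState:
    """Known facts plus the auxiliary constructions added so far."""
    facts: set[str]
    constructions: list[str] = field(default_factory=list)


class SymbolicEngine:
    """Toy deduction engine: applies forward rules until no new facts appear."""

    def __init__(self, rules):
        # Each rule is (frozenset of premise facts, conclusion fact).
        self.rules = rules

    def saturate(self, state: ProofState) -> None:
        changed = True
        while changed:
            changed = False
            for premises, conclusion in self.rules:
                if premises <= state.facts and conclusion not in state.facts:
                    state.facts.add(conclusion)
                    changed = True


def propose_constructions(state: ProofState) -> list[str]:
    """Stand-in for the Gemini model: suggest auxiliary points/lines/circles.

    In AlphaGeometry2 this role is played by a trained language model; here
    we return canned suggestions purely to illustrate the control flow."""
    return ["midpoint M of AB", "circle through A, B, C"]


def solve(goal: str, state: ProofState, engine: SymbolicEngine,
          max_steps: int = 5) -> bool:
    """Alternate symbolic deduction with new constructions until goal is derived."""
    for _ in range(max_steps):
        engine.saturate(state)  # symbolic engine: pure rule-based deduction
        if goal in state.facts:
            return True
        for construction in propose_constructions(state):
            if construction not in state.constructions:
                state.constructions.append(construction)
                # A new construction unlocks further facts for the engine.
                state.facts.add(construction)
                break
        else:
            return False  # no new suggestions; give up
    return goal in state.facts


if __name__ == "__main__":
    rules = [
        (frozenset({"midpoint M of AB"}), "AM = MB"),
        (frozenset({"AM = MB"}), "goal: segment AB is bisected"),
    ]
    state = ProofState(facts={"triangle ABC"})
    print(solve("goal: segment AB is bisected", state, SymbolicEngine(rules)))
```

The published system's search is far more elaborate than this single loop, but the division of labor is the same: the neural model supplies creative suggestions, and the symbolic engine supplies rigorous deduction.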
Although AlphaGeometry2 solved 42 of the 50 IMO problems, surpassing the average gold medalist's score, it still has limitations: it cannot handle problems involving a variable number of points, nonlinear equations, or inequalities. On a separate set of 29 harder problems, its performance was less impressive, with only 20 solved.
The study has once again sparked debate over whether AI systems should be built on symbol manipulation or on more brain-like neural networks. AlphaGeometry2 takes a hybrid approach, combining a neural network with a rule-based symbolic engine. DeepMind's team notes that while large language models may be able to generate partial solutions without external tools, symbolic engines remain important for mathematical applications at present.
The success of AlphaGeometry2 marks a further breakthrough for AI in mathematics, and the approach may prove useful on even more complex problems in the future.