Google DeepMind’s AI systems show high math aptitude, solve complex problems

Published 30 Jul 2024


Google DeepMind’s artificial intelligence (AI) systems, AlphaProof and AlphaGeometry 2, showcased their advanced logical reasoning capabilities by solving four of the six complex math problems set at this year’s International Mathematical Olympiad (IMO).

The accomplishment marks the strongest performance on Olympiad problems in the history of AI and machine learning; going into the competition, AlphaGeometry 2 could already solve 83% of historical IMO geometry problems. Previously, producing step-by-step mathematical proofs was an Achilles’ heel of automated models because of their limited reasoning skills.

“This is great progress in the field of machine learning and AI. No such system has been developed until now which could solve problems at this success rate with this level of generality,” said Pushmeet Kohli, vice president of research at Google DeepMind.

How the AI Systems Work

Researchers at the Google lab achieved this feat by developing two specialized systems that express and check their answers in a formal language, a format that is easier for the AI to work with and that a computer can verify mechanically.

To do that, they fine-tuned Google’s Gemini model to translate one million mathematical problems of varying difficulty from English into the formal proof language Lean. This produced a large dataset of formalized math problems that was then fed to AlphaProof.
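To get a sense of what such a translation produces, here is a minimal, hypothetical example: a simple statement ("the sum of two even integers is even") written as a machine-checkable Lean theorem using the Mathlib library. The theorem and proof below are purely illustrative and are not drawn from DeepMind’s dataset, whose problems are far harder.

```lean
-- Hypothetical illustration: a simple natural-language claim formalized in Lean.
-- Not from DeepMind's dataset; it only shows the machine-checkable form that
-- the Gemini translation step is described as producing.
import Mathlib

theorem sum_of_evens_is_even (a b : ℤ) (ha : Even a) (hb : Even b) :
    Even (a + b) := by
  obtain ⟨x, hx⟩ := ha                    -- a = x + x
  obtain ⟨y, hy⟩ := hb                    -- b = y + y
  exact ⟨x + y, by rw [hx, hy]; ring⟩     -- a + b = (x + y) + (x + y)
```

Because every step of such a proof is checked by the Lean kernel, a candidate solution is either verifiably correct or rejected outright, which is what makes this format so useful for training.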

AlphaProof is based on reinforcement learning: it trains itself to prove mathematical statements by trial and error, without human guidance. The more problems it solves correctly, the better it becomes at tackling advanced mathematics.
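As a rough, hypothetical sketch of the loop just described, the Python snippet below shows the general shape of such self-training: a policy proposes candidate proofs, a formal verifier accepts or rejects them, and only verified proofs are used as a learning signal. All names (Policy, Verifier, self_training_loop) and the placeholder logic are invented for illustration; this is not DeepMind’s code.

```python
# Hypothetical sketch of a verifier-guided self-training loop.
# The Policy and Verifier classes are placeholders, not DeepMind's actual API.
import random


class Verifier:
    """Stand-in for a formal proof checker such as Lean."""

    def check(self, problem: str, proof: list[str]) -> bool:
        # In reality this would run the candidate proof through the proof
        # assistant; here we pretend a small fraction of attempts verify.
        return random.random() < 0.05


class Policy:
    """Stand-in for the neural network that proposes proof attempts."""

    def propose_proof(self, problem: str, max_steps: int = 32) -> list[str]:
        return [f"step_{i}" for i in range(random.randint(1, max_steps))]

    def reinforce(self, problem: str, proof: list[str]) -> None:
        # In reality: a gradient update that raises the probability of the
        # verified proof. Here it is a no-op placeholder.
        pass


def self_training_loop(problems: list[str], rounds: int = 3) -> set[str]:
    policy, verifier = Policy(), Verifier()
    solved: set[str] = set()
    for _ in range(rounds):
        for problem in problems:
            proof = policy.propose_proof(problem)
            if verifier.check(problem, proof):    # reward only verified proofs
                policy.reinforce(problem, proof)  # learn from its own success
                solved.add(problem)
    return solved


if __name__ == "__main__":
    formal_problems = [f"problem_{i}" for i in range(100)]
    solved = self_training_loop(formal_problems)
    print(f"verified proofs found for {len(solved)} of {len(formal_problems)} problems")
```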

Meanwhile, to train AlphaGeometry 2, the researchers turned to AI-generated math problems to build a sufficiently large dataset, since real-world training data for math-focused AI models is scarce.

This resulted in a faster, more capable system able to tackle challenging geometry problems, including questions involving angles, ratios, and distances.
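As a loose illustration of that synthetic-data idea, the sketch below generates simple geometry training pairs: it samples random triangles, applies a fixed construction (the midpoints of two sides), and records a numerically checked premise/conclusion pair. This is a generic example of synthetic data generation, not AlphaGeometry 2’s actual pipeline, and all function names are invented for the illustration.

```python
# Generic, hypothetical sketch of synthetic geometry data generation.
# Not AlphaGeometry 2's pipeline; it only illustrates the idea of creating
# (premise, conclusion) training pairs from randomly sampled diagrams.
import random


def random_triangle() -> list[tuple[float, float]]:
    """Sample three points in the unit square that are not nearly collinear."""
    while True:
        pts = [(random.random(), random.random()) for _ in range(3)]
        (ax, ay), (bx, by), (cx, cy) = pts
        twice_area = abs((bx - ax) * (cy - ay) - (cx - ax) * (by - ay))
        if twice_area > 1e-2:
            return pts


def midpoint(p, q):
    return ((p[0] + q[0]) / 2, (p[1] + q[1]) / 2)


def dist(p, q):
    return ((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2) ** 0.5


def synthetic_example() -> tuple[str, str]:
    """One synthetic fact: the segment joining two midpoints is half the base."""
    a, b, c = random_triangle()
    m, n = midpoint(a, b), midpoint(a, c)
    premise = "Triangle ABC; M is the midpoint of AB; N is the midpoint of AC."
    conclusion = (
        f"MN = BC / 2 (numeric check: {dist(m, n):.4f} vs {dist(b, c) / 2:.4f})"
    )
    return premise, conclusion


if __name__ == "__main__":
    dataset = [synthetic_example() for _ in range(3)]
    for premise, conclusion in dataset:
        print(premise, "=>", conclusion)
```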

Silver at IMO

The capabilities of both AI systems were tested on the six problems from the 2024 IMO.

AlphaProof was able to solve two algebra problems and one number theory problem. While it answered one of the questions within minutes, the other two took up to three days to solve.

The 609 high school students who participated in this year’s Math Olympiad were given a total of 9 hours to submit their answers.

On the other hand, AlphaGeometry 2 solved the geometry question in just 19 seconds. The remaining two combinatorics problems, however, went unsolved.

“Generally, AlphaProof performs much better on algebra and number theory than combinatorics. We are still working to understand why this is, which will hopefully lead us to improve the system,” stated Alex Davies, a research engineer who worked on AlphaProof.

This performance earned the two systems a combined score of 28 points out of a maximum of 42 (each of the six problems is worth seven points), which would have landed them a silver medal based on IMO guidelines.

One more point, and they would have secured the gold, along with 58 human contestants who scored at least 29 points in the math competition.

No new knowledge

Although experts who judged the AI’s work were impressed by how the systems came up with clever ideas for solving the problems, they acknowledged that more work is still needed to analyze how they accomplished it.

David Silver, Google DeepMind’s vice president of reinforcement learning, also admitted that AlphaProof and AlphaGeometry 2 were not adding new knowledge to what was already known and created by humans.

“We’re at the point where these systems can actually prove not open research problems but at least problems that are very challenging to the very best young mathematicians in the world,” he added.