In May 2025, a meeting as discreet as it was extraordinary took place in Berkeley, California. Thirty of the world’s most eminent mathematicians, from diverse backgrounds, gathered for an unprecedented challenge: to test a cutting-edge artificial intelligence, OpenAI’s o4-mini model, against mathematical problems of formidable difficulty.
This event, organized by Epoch AI, not only revealed the impressive capabilities of AI but also raised profound questions about the future of human mathematics.
A Challenge Designed to Push the Limits of AI
The objective of this meeting was clear: to design mathematical problems that the AI could not solve. Each question that stumped o4-mini earned its creator a $7,500 prize. The underlying project, called FrontierMath, began in September 2024 under the direction of Elliot Glazer, a mathematician who had recently completed his PhD. The team had already assembled a set of questions at several levels of difficulty: undergraduate-level, graduate-level, and research-level problems. The preliminary results surprised everyone: o4-mini solved approximately 20% of the questions, even though they had been designed to be extremely challenging.
To accelerate the process, Epoch AI organized this in-person meeting on May 17-18, 2025. The participants, divided into groups of six, worked tirelessly for two days to design problems capable of stumping the AI. Among them, Ken Ono, a mathematician at the University of Virginia, proposed an open question in number theory, one that a seasoned doctoral student would have found difficult. To everyone’s astonishment, o4-mini solved it in minutes, whereas a human mathematician would have needed weeks or even months.
“I have colleagues who have literally said that these models approach mathematical genius,” says Ken Ono.
An AI That Is Faster and More Confident, But Not Infallible
One of the most striking aspects of this experiment is the speed of the AI. Where a human mathematician might spend weeks developing a proof, o4-mini provides answers in minutes. However, this speed comes with a drawback: a confidence that is sometimes excessive. Yang-Hui He, another participant, described this attitude as “proof by intimidation”: the AI presents its results with such assurance that it is easy to accept them without rigorous verification.
Despite its prowess, o4-mini is not infallible. The mathematicians succeeded in identifying ten questions that the AI could not solve, each rewarded with the promised prize. These successes, though limited, show that human intelligence retains an advantage in designing original and complex problems, at least for now.
Toward a Future Where AI Redefines Mathematics
This meeting also opened a discussion about the future of the mathematics profession. If AI can solve problems at the doctoral level in minutes, what role will remain for humans? Participants discussed the idea of a “fifth level” of difficulty: questions that even the best mathematicians cannot solve. If AI reaches this stage, the role of mathematicians could transform radically, shifting from problem-solving to formulating ever bolder questions.
However, this revolution is not without precedent. As David Silver of Google DeepMind points out, the invention of the calculator freed mathematicians from tedious calculations, allowing them to focus on more abstract concepts. Similarly, AI tools like o4-mini could become “co-pilots” for mathematicians, amplifying their abilities rather than replacing them.
“In the future, we will see complex proofs solved automatically by AI, just as the calculator simplified calculations,” explains David Silver.
A Collaboration Between Humans and Machines
The Berkeley event is just one example of how AI is transforming mathematics. Tools like Lean, an interactive proof assistant, already allow proofs to be checked mechanically, step by step, which reduces human error. Moreover, programs like AlphaProof and AlphaGeometry 2 from Google DeepMind have shown promising results, performing at silver-medal level on the problems of the 2024 International Mathematical Olympiad.
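To give a sense of what mechanical proof checking looks like, here is a minimal sketch in Lean 4, assuming only the core library (no Mathlib). The theorem name `add_comm_example` is hypothetical and chosen for illustration; `Nat.add_comm` is a lemma from Lean’s core library. This is a toy illustration, not a problem from the Berkeley meeting.

```lean
-- A concrete arithmetic fact. Lean accepts `rfl` here because
-- both sides evaluate to the same natural number.
example : 2 + 2 = 4 := rfl

-- A general statement about all natural numbers, proved by
-- citing the core lemma `Nat.add_comm`.
-- (`add_comm_example` is a hypothetical name for this sketch.)
theorem add_comm_example (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b
```

If either proof were wrong, for instance if the first line claimed `2 + 2 = 5`, Lean would reject the file instead of accepting the claim on trust.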
These advances suggest a future where collaboration between humans and machines becomes the norm. Mathematicians could focus on creativity and intuition, while AI handles intensive calculations and verification. This synergy could pave the way for mathematical discoveries of unprecedented scale.
Lessons From a Secret Meeting
The Berkeley meeting, though dubbed “secret” to preserve the integrity of the questions posed, resonated widely in the scientific community. It demonstrated that AI is far more than just a calculation tool: it is capable of rivaling the brightest minds in fields as abstract as mathematics. However, it also reminded us of the importance of human ingenuity in pushing the boundaries of what is possible.
This event is an invitation to reflect on how AI is redefining the frontiers of knowledge. Mathematicians, far from becoming obsolete, could become the architects of a new paradigm, where AI amplifies their genius. As Ken Ono pointed out, AI is not yet ready to replace mathematicians, but it is pushing them to redefine their role in a rapidly evolving world.
Sources
Scientific American, “Inside the Secret Meeting Where Mathematicians Struggled to Outsmart AI”
Scientific American, “AI Will Become Mathematicians’ ‘Co-Pilot’”
Scientific American, “AI Reaches Silver-Medal Level at This Year’s Math Olympiad”
Reddit, r/artificial, “Inside the Secret Meeting Where Mathematicians Struggled to Outsmart AI”
