Mathematics may be entering a new regime:
not AI you believe,
AI you verify.
A major Google DeepMind paper presents AlphaProof Nexus, a framework for AI-driven formal proof search in Lean.
The point is not that an LLM can write convincing mathematical prose.
That has always been the weak version of the story.
The point is that the system must produce proof code that survives a formal verifier.
LLMs generate.
Lean checks.
Search continues.
Only machine-verified proofs remain.
That changes the epistemic contract.
In informal mathematics, an AI-generated proof can look elegant while hiding a fatal gap.
In Lean, every step must pass the proof checker. No rhetoric. No handwaving. No “seems plausible.”
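A minimal Lean 4 illustration of that contract. The theorem names are made up for this example and are not taken from the paper:

```lean
-- Incomplete: Lean accepts `sorry` only with a warning.
-- The statement is assumed here, not proved.
theorem add_comm_sketch (a b : Nat) : a + b = b + a := by
  sorry

-- Complete: every step is checked, so no gap can hide.
theorem add_comm_checked (a b : Nat) : a + b = b + a := by
  exact Nat.add_comm a b
```

The first version reads fine to a human but is flagged by Lean. Only the second counts as a proof.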
The authors report the first large-scale evaluation of this approach on open research-level problems.
Their most capable agent autonomously resolved 9 of 353 open Erdős problems (about 2.5%), including two questions open for 56 years, at a per-problem inference cost of a few hundred dollars.
It also proved 44 of 492 OEIS conjectures (about 8.9%) and is being deployed in combinatorics, optimization, graph theory, algebraic geometry, and quantum optics.
The architecture is fascinating.
A mathematician provides a Lean formalization.
The agent refines proof sketches.
LLM subagents propose lemmas, decompositions, constructions, and edits.
Lean rejects invalid steps.
AlphaProof can be called as a focused prover.
An evolutionary population of proof sketches is ranked and reused.
The final output is a sorry-free Lean proof: no step is left as an unproven “sorry” placeholder.
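The loop described above can be sketched in a few lines of Python. This is a toy under loud assumptions, not the paper's system: the "goal" is just reaching a target number from a start number, `verify` stands in for Lean, `propose` stands in for an LLM subagent, and every name and parameter here is hypothetical.

```python
import random

# Toy stand-in for a formal goal: reach TARGET from START using only allowed moves.
START, TARGET = 1, 37
MOVES = {
    "double": lambda x: x * 2,
    "inc": lambda x: x + 1,
    "dec": lambda x: x - 1,
}

def verify(candidate):
    """Stand-in for Lean: replay every step and reject any unknown move.
    Returns (is_complete, final_value); final_value is None if rejected."""
    value = START
    for step in candidate:
        if step not in MOVES:
            return False, None  # invalid step: rejected outright
        value = MOVES[step](value)
    return value == TARGET, value

def propose(parent, rng):
    """Stand-in for an LLM subagent: an unreliable edit of an existing sketch."""
    child = list(parent)
    vocabulary = list(MOVES) + ["handwave"]  # "handwave" is never a valid move
    if child and rng.random() < 0.3:
        child.pop()  # retract the last step
    else:
        child.append(rng.choice(vocabulary))  # try a new step
    return child

def search(seed=0, population_size=20, generations=2000):
    """Evolutionary search: propose edits, verify, rank survivors, repeat."""
    rng = random.Random(seed)
    population = [[]]  # start from the empty sketch
    for _ in range(generations):
        children = [propose(rng.choice(population), rng)
                    for _ in range(population_size)]
        scored = []
        for cand in population + children:
            complete, value = verify(cand)
            if value is None:
                continue  # the verifier rejected this sketch
            if complete:
                return cand  # only a fully checked candidate is returned
            scored.append((abs(TARGET - value), len(cand), cand))
        scored.sort()  # closest to the goal first, shorter sketches break ties
        population = [cand for _, _, cand in scored[:population_size]]
    return None

if __name__ == "__main__":
    proof = search()
    print(proof, verify(proof))
```

The design point is that the generator is allowed to be wrong as often as it likes. Nothing it proposes survives unless the checker accepts every step, and the search only ever returns a candidate the checker has confirmed end to end.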
This is not “chatbot solves math.”
It is closer to a new research instrument:
a search engine over formal proof space,
guided by generative models,
grounded by a compiler,
and audited by mathematics itself.
The deeper lesson is general:
AI systems become far more powerful when unreliable generation is wrapped in reliable verification.
For mathematics, the verifier is Lean.
For other domains, the frontier question becomes:
what is the equivalent of a compiler for truth?
Full credit to the authors:
George Tsoukalas, Anton Kovsharov, Sergey Shirobokov, Anja Surina, Moritz Firsching, Gergely Bérczi, Francisco J. R. Ruiz, Arun Suggala, Adam Zsolt Wagner, Eric Wieser, Lei Yu, Aja Huang, Miklós Z. Horváth, Andrew Ferrauiolo, Henryk Michalewski, Codrut Grosu, Thomas Hubert, Matej Balog, Pushmeet Kohli, Swarat Chaudhuri.
Paper:
Advancing Mathematics Research with AI-Driven Formal Proof Search
arxiv.org/abs/2605.22763
I’m attaching the first page because the abstract is worth reading closely.
The future of AI in mathematics may not be models we trust.
It may be agents whose work can be checked.
#AIResearch #Mathematics #FormalMethods #ArtificialIntelligence