Professor of Mathematics @ UCI. Former postdoc @ Princeton. Exploring what AI can (and can’t) do in math.

Joined October 2019
28 Photos and videos
arXiv math daily submissions have more than doubled since 2025
6
10
74
5,916
The apparent doubling was a one-day anomaly. Looking at the full final week of May, arXiv math submissions were up about 32% YoY, not 100%. My prediction (and it is already happening): we are entering the two-day paper era. Day 1: solve the problem. Day 2: verify the proof, write the paper, and submit.
1
8
836
New paper with Rupert Frank on arxiv.org/abs/2605.29035 We settle the problem of finding the sharp constant in the log Sobolev inequality on the n-cycle for all n ≥ 4, by showing that it is equal to half of the spectral gap. The question goes back to Diaconis–Saloff-Coste (1995). Even n=2k were settled by Chen–Sheu (2003); n = 5 by Chen–Liu–Saloff-Coste (2008); odd n ≤ 21 by computer-assisted proofs of Faust–Fawzi (2021). The n = 3 cycle is the exceptional case.
3
49
3,660
Academia is one of the few places where, despite all its imperfections, a person could still rise mostly through talent, curiosity, and hard work (example: Ramanujan) — not only through luck, wealth, family connections etc. But what happens if access to extremely powerful AI assistants becomes the new dividing line? The news about “superhuman AI” is scientifically exciting. But socially, it raises difficult questions — especially for gifted children from poor backgrounds who may not have access to frontier models.
40
26
226
34,275
A remarkable moment in mathematics publishing: almost the entire editorial board of the Journal of Approximation Theory has resigned simultaneously, declaring that “the journal, as we knew it, has ceased to exist.”
66
153
1,062
130,431
Grateful to share that my NSF DMS grant has been awarded. The project studies functions of many yes/no variables and their continuous analogues, with connections to high-dimensional probability, learning theory, data science, and quantum computation. The goal is to understand approximation, randomness, boundary structure, and sharp inequalities. Many thanks to NSF and DMS for their support. @NSF_MPS #DMSFunded
2
24
1,925
Déjà vu effect in LLMs I was working on a problem, call it Problem A. At some point I managed to reduce it to another problem, call it Problem B, and then solve B. Problem B is interesting in its own right and can naturally be asked in a more general setting, depending on a parameter (t). For my application to Problem A, I only needed the case (t=2), so I never seriously thought about the other values. My feeling was that, with enough time, perhaps one or two weeks, I could probably solve problem B for other (t)'s as well. Now comes the interesting part. I gave Problem B to an LLM and asked: for which values of (t) can you solve it? It produced a solution only for (t=2). The solution looked “different” from the published one: different language, different framing, no citation to my work. But after looking more carefully, I think it is essentially the same solution, just translated into another language. This is not completely trivial to recognize unless one knows the problem well. When I asked the LLM to solve Problem B for other values of (t), it could not. I have seen this phenomenon in other examples as well. Sometimes, when an LLM appears to produce a new solution to a problem, I worry that what is happening is more subtle: the problem may already have been solved somewhere, perhaps in a hidden or disguised form, and the model is showing us the same solution from a different angle. This “déjà vu effect” makes it quite hard to judge novelty in some AI-assisted mathematics.
1
1
19
2,125
New referee standard: ‘We reject the paper because the result is now routine with the help of LLMs’ 😄
14
16
261
53,416
What will mathematics look like 7 years from now? I’m really curious to hear your brief opinion.
52
8
116
239,404
Still excited about these 3Blue1Brown-style videos generated by AI. Here’s a beautiful illustration of a classic analysis problem: Let f be convex, nonnegative on [0,∞), with f(0)>0 and f(x)→0 as x→∞. Place a light source at (0,b) with 0<b<f(0). Rays reflect off the graph of f and the x-axis following the usual laws of optics. Question: can any ray escape to ∞, or do all trajectories eventually come back? Prompt to video grok.com/share/c2hhcmQtNA_2e…
23
209
6,638
2,241,460
Brownian motion in 2D by Grok 4.3 (beta) grok.com/share/c2hhcmQtNA_26…
2
3
46
7,500
I asked Grok 4.3 (beta) to generate Beamer slides (PDF) for a talk based on my paper arxiv.org/abs/2506.08494 Honestly impressed--especially with how it extracted and presented the mathematical content. I’d tweak a few details, but ~85% of the job is already done very well. Full chat, prompt and slides: grok.com/share/c2hhcmQtNA_1c…
8
41
438
48,644
We’ve just finished revising the paper to incorporate the Bellman function suggested by Grok 4.20: arxiv.org/pdf/2502.16045 I’ll say this: it’s quite rare in this field for a sharp bound (i.e., an exact Bellman function) to admit such a clean closed form. Interestingly, there are two ways to represent the solution: one via a stopping-time construction (a probabilistic approach, as Grok did), and another via an infinite series (a more analytic, Fourier-based approach). Had one pursued the latter, verifying the “two-point inequality” would have been extremely difficult.
Disclaimer: I had given early access to internal beta version of Grok 4.20 It found a new Bellman function for one of the problems I’d been working on with my student N. Alpay. The problem reduces to identifying the pointwise maximal function U(p,q) under two constraints and understanding the behavior of U(p,0). In our paper arxiv.org/pdf/2502.16045 we proved U(p,0)\geq I(p), where I(p) is the Gaussian isoperimetric profile, I(p) ~ p\sqrt{log(1/p)} as p ~ 0. After ~5 minutes, Grok 4.20 produced an explicit formula U(p,q) = E \sqrt{q^2 \tau}, where \tau is the exit time of Brownian motion from (0,1) starting at p. This yields U(p,0)=E\sqrt{\tau} ~ p log(1/p) at p ~ 0, a square root improvement in the logarithmic factor. Any significance of this result? It will not tell you how to change the world tomorrow. Rather, it gives a small step toward understanding what is going on with averages of stochastic analogs of derivatives (quadratic variation) of Boolean functions: how small can they be?  More precisely, this gives a sharp lower bound on the L1 norm of the dyadic square function applied to indicator functions 1_A of sets A \subset [0,1]. In my previous tweet about Takagi function, we saw that the sharp lower bound on ||S_1(1_A)||_1 miraculously coincides with Takagi function of |A| which (surprisingly to me) is related to the Riemann hypothesis. Here, we obtain a sharp lower bound on ||S_2(1_A)||_1 given by E \sqrt{\tau}, where Brownian motion starts at |A|. This function belongs to the family of isoperimetric-type profiles, but unlike the fractal Takagi function, it is smooth and does not coincide with the Gaussian isoperimetric profile. Finally, in harmonic analysis it is known that the square function is not bounded in L^1. The question here was more about curiosity: how exactly does it blow up when tested on Boolean functions 1_A.  Previously, the best known lower bound was |A|(1-|A|) (Burkholder—Davis—Gandy). In our paper, we obtained |A| (1-|A|)\sqrt{log(1/(|A|(1-|A|)))}. This new Grok’s Bellman function gives |A| (1-|A|) \log(1/(|A|(1-|A|))) and this bound is actually sharp.
2
4
45
7,574
UPDATES: I picked a random recent paper from the arXiv (Analysis of PDEs section) arxiv.org/abs/2604.08416 and asked LLMs to try to improve one of its important results. It started claiming that the power of ε in Theorem 6.1 (i) could be improved from min{q,r} to just r, as in part (ii). As a backup, I also picked another recent (non-random) paper arxiv.org/pdf/2603.30039 and asked for an improvement of the lower bound. There, the model claimed it could push the Grothendieck constant lower bound up to K_{DR} 2.2* 10^{-10}, which would be quite nice compared to what we currently have here github.com/teorth/optimizati… This is essentially what I managed to cover in a 45-minute experimental talk. After coming home, I spent another 2–3 hours probing these claims more carefully. My current impression: the Grothendieck lower bound improvement might actually be correct, while the ε-exponent improvement seems to have a nontrivial gap at least in certain regimes (though feel free to dig further--I may have missed something). PS: sorry I didn’t record the talk--I completely forgot to press the button. PS2: here is the write-up for the Grothendieck improvement. Enjoy! overleaf.com/read/qgthmvtdkb…
Excited (and slightly nervous?) for this very unusual experimental talk. I’ll pick a random paper from the latest arXiv announcements (analysis / probability / combinatorics), and use AI live to try to improve one of its results on the spot. Let’s see what happens 🙂
1
13
110
16,167
Excited (and slightly nervous?) for this very unusual experimental talk. I’ll pick a random paper from the latest arXiv announcements (analysis / probability / combinatorics), and use AI live to try to improve one of its results on the spot. Let’s see what happens 🙂
8
22
176
29,894
Every subgaussian is a sum of Gaussians -- Antoine Song resolves Talagrand's conjectures arxiv.org/pdf/2602.22342
6
65
404
40,438
The Johnson--Lindenstrauss lemma says something quite remarkable: if you have an astronomical number N of vectors of large size (say, in a very high-dimensional Euclidean space), then you can linearly map them into a much lower-dimensional space, of dimension about log(N), in such a way that the distances between the vectors are almost preserved. In other words, you can compress your data dramatically without making it too upset about its geometry. A random matrix with i.i.d. standard Gaussian entries will most likely do the job.
Introducing TurboQuant: Our new compression algorithm that reduces LLM key-value cache memory by at least 6x and delivers up to 8x speedup, all with zero accuracy loss, redefining AI efficiency. Read the blog to learn how it achieves these results: goo.gle/4bsq2qI
36
126
1,183
207,987
The conjecture was exhaustively verified for all n ≤ 13. DeepMind then pushed it to n ≤ 16 with heavy compute, finding no counterexamples. Yet at n=17, a human insight produced a counterexample Nothing against AI, I genuinely love these tools, but this is a beautiful reminder of the unique power of human intuition.
Human insight is still a thing: over the last few years many computing resources were thrown towards the Merzon-Smirnov conjecture on maximal Schubert polynomials, including DeepMind's FunSearch. In the end it fell to a human-generated targeted check: arxiv.org/abs/2603.20104
5
25
276
32,545
Nice summary by grok "if fast-moving AI companies repeatedly drop huge (~200k LOC), minimally-reviewed code dumps into the shared ecosystem for PR/hype/funding purposes, it risks turning the careful, collaborative, volunteer-driven library into something fragmented, hard to maintain, and less welcoming to new human contributors."
After the apparently amazing announcement by @mathematics_inc on the formalization of a major recent Fields-medal winning theorem, i had no idea how pissed the math-formalization community is. Very worrying discussions by some of the leaders/founders of Lean's mathlib. cc @ChrSzegedy
3
3
39
7,370