The Center for Brains, Minds and Machines is a multi-institutional NSF Center dedicated to the study of the science and engineering of intelligence.

Joined January 2017
A common practitioner belief, mechanistically: “weight decay stabilizes training.” We show it is technically wrong, though a bit true in practice.
- Training still happens at the Edge of Stability (EoS) with weight decay.
- That said, "it doesn't look like it"! The curvature at stabilization is much lower!
- We properly characterize all we see by extending the self-stabilization analysis by @alex_damian_, @EshaanNichani, and @jasondeanlee to weight decay, and the result matches cleanly with an underdamped harmonic oscillator!
- This, however, stabilizes/tames the dynamics in function space.
Importantly:
- This shows that regularized training is at EoS even though it "doesn't look like that", meaning curvature measures are lower than the thresholds.
Check out our paper: arxiv.org/pdf/2605.16622
May 30
Check out the latest work of our center, in collaboration with @TAMU! Towards theorizing the boost in capabilities of agent systems. @PierBeneventano @GalantiTomer
Have you ever wondered how to formalize what an agentic system actually is? That is, where agents fit in the book of ML and how to explain/predict their performance? Here we argue that agents can be seen as boosting reasoning models! arxiv.org/abs/2605.14163
Thanks a lot for sharing our work! On top of the things mentioned, we also give a very nice mathematical framework and mathematical results about agent systems :) With the amazing Varun, Riccardo, Tommy, @GalantiTomer
May 18
NEW paper worth reading. GPT-5.4 nano plus a critic-comparator orchestration loop hits 76.4% on SWE-bench Verified, matching standalone Gemini 3 Pro and Claude Opus 4.5 Thinking. The trick is to select from k=8 weak-model proposals using execution and proof signals. What does this mean? Many of the patches you'd expect from a frontier model are already inside a weak model's top-8 candidates. When you have 8 candidate patches from a weak model, don't ask the model which is best. Run them and verify them. That's enough to match a frontier model's accuracy. The takeaway for AI devs: a weak model's top-k often already contains the right answer. What limits you is the quality of your selector, not the capability of the model. Paper: arxiv.org/abs/2605.14163 Learn to build effective AI agents in our academy: academy.dair.ai/
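To make the "run them and verify them" idea concrete, here is a minimal sketch of picking one of k candidates by execution alone. It is an illustration under stated assumptions, not the paper's critic-comparator loop: the candidate functions, the test cases, and the helper `select_by_execution` are all hypothetical, and only execution signals (no proof signals) are used.

```python
from typing import Callable, Optional, Sequence, Tuple

Candidate = Callable[[int], int]


def select_by_execution(
    candidates: Sequence[Candidate],
    tests: Sequence[Tuple[int, int]],
) -> Tuple[Optional[int], int]:
    """Return (index, passed_count) of the candidate passing the most tests.

    Hypothetical helper: a candidate that raises on a test simply fails it.
    """
    best_index, best_passed = None, -1
    for i, candidate in enumerate(candidates):
        passed = 0
        for arg, expected in tests:
            try:
                if candidate(arg) == expected:
                    passed += 1
            except Exception:
                pass  # a crash counts as a failed test
        if passed > best_passed:
            best_index, best_passed = i, passed
    return best_index, best_passed


# Hypothetical stand-ins for patches proposed by a weak model
# (the task here: implement absolute value).
candidates = [
    lambda x: x,                   # wrong for negatives
    lambda x: -x,                  # wrong for positives
    lambda x: x if x > 0 else -x,  # correct
    lambda x: 1 // x,              # crashes on 0
]
tests = [(3, 3), (-4, 4), (0, 0)]

best_index, best_passed = select_by_execution(candidates, tests)
print(best_index, best_passed)
```

The selector never asks a model which candidate is best. It only counts passed checks, so in this toy setup the quality of the test suite sets the ceiling.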
CBMM retweeted
1/ Many optimization problems are hard in theory. But real OR and NP-hard instances often have exploitable structure. Can an LLM agent discover that structure automatically and turn it into faster solver code?
This is a project I’m very excited about. Back in the day, the smartest computer scientists were the ones finding efficient ways to solve their problems. Here we made the agents do this work.
Our new paper was accepted at ICML! 1) Momentum isn’t just “SGD but faster”. It affects sharpness (by orders of magnitude!) 2) The usual story says momentum lets you train in sharper regions. That’s true for large batches only! The opposite is true for minibatches!
Muon leads to severely miscalibrated models! This is just one of the results of this new paper of ours: In “Too Sharp, Too Sure” we show calibration error tracks loss curvature during training and we tie both to margin tails.
Apr 10
[blog] What is Intelligence? Or "Distinguishability is All You Need" Here are several related questions to which we do not have a good answer: How will we know when we've achieved "Artificial General Intelligence" (AGI)?... poggio-lab.mit.edu/blogsupda…
[video] "Intelligence as Prediction: Cybernetics, LLMs, and Sociality" Speaker: Blaise Agüera y Arcas - Google, Paradigms of Intelligence youtu.be/6NC0tSjZXBo
Mar 29
[blog post] "PoggioAI/MSc Went Online" This first public release is an open-source, customizable, modular multi-agent system for academic research workflows, with a current emphasis on machine learning theory and nearby quantitative fields. poggio-lab.mit.edu/blogsupda…
Check the blog of Poggio Lab at MIT! We went online with some very nice blogs! The last one being about our multiagent system: poggio-lab.mit.edu/blogsupda…
Most AI-for-research work tries to maximize autonomy first and patch quality later. We think the near-term path is the reverse: automating step by step while holding the quality bar fixed. Today we’re open-sourcing PoggioAI/MSc for ML Theory Research
CBMM retweeted
Simply adding Gaussian noise to LLMs (one step—no iterations, no learning rate, no gradients) and ensembling them can achieve performance comparable to or even better than standard GRPO/PPO on math reasoning, coding, writing, and chemistry tasks. We call this algorithm RandOpt. To verify that this is not limited to specific models, we tested it on Qwen, Llama, OLMo3, and VLMs. What's behind this? We find that in the Gaussian search neighborhood around pretrained LLMs, diverse task experts are densely distributed — a regime we term Neural Thickets. Paper: arxiv.org/pdf/2603.12228 Code: github.com/sunrainyg/RandOpt Website: thickets.mit.edu
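As a rough picture of "perturb once, then ensemble", here is a toy sketch on a linear classifier rather than an LLM. Everything in it is an assumption made for illustration: the synthetic data, the noise scale `sigma`, the top-k selection on a held-out split, and the majority vote are not taken from the paper or the RandOpt code, which may differ in all of these.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic binary task with a known linear rule (illustrative only).
X = rng.normal(size=(200, 5))
w_true = np.array([1.0, -2.0, 0.5, 0.0, 1.5])
y = (X @ w_true > 0).astype(int)
X_select, y_select = X[:100], y[:100]  # used to pick perturbations
X_eval, y_eval = X[100:], y[100:]      # used to report accuracy

# Stand-in for "pretrained" weights: close to the true rule, but imperfect.
w_pretrained = w_true + rng.normal(scale=1.0, size=5)


def predict(w: np.ndarray, inputs: np.ndarray) -> np.ndarray:
    return (inputs @ w > 0).astype(int)


def accuracy(w: np.ndarray, inputs: np.ndarray, labels: np.ndarray) -> float:
    return float(np.mean(predict(w, inputs) == labels))


# One step of Gaussian noise around the pretrained weights: no gradients,
# no learning rate, no iterations.
n_samples, sigma, top_k = 64, 0.5, 9  # assumed values; top_k odd avoids vote ties
perturbed = w_pretrained + sigma * rng.normal(size=(n_samples, 5))

# Assumed selection rule: keep the top_k perturbations on the selection split.
scores = np.array([accuracy(w, X_select, y_select) for w in perturbed])
top = np.argsort(scores)[-top_k:]

# Ensemble the selected models by majority vote on the evaluation split.
votes = np.stack([predict(perturbed[i], X_eval) for i in top])
ensemble_pred = (votes.mean(axis=0) > 0.5).astype(int)

base_acc = accuracy(w_pretrained, X_eval, y_eval)
ensemble_acc = float(np.mean(ensemble_pred == y_eval))
print(f"pretrained: {base_acc:.2f}  ensemble: {ensemble_acc:.2f}")
```

A five-parameter toy says nothing about whether good task experts are dense around a pretrained LLM. It only shows the shape of the procedure.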
Mar 17
[blog] Beneficial Misalignment: Why We Shouldn't Always Align AI to Humans In the rapidly evolving field of NeuroAI, a significant amount of energy is dedicated to 'alignment', the idea that representations from artificial intelligence should converge... poggio-lab.mit.edu/blogsupda…
Mar 11
[blog post] A Conversation with Blaise Agüera y Arcas: On Intelligence, Life, and the Future of AI What does it mean to call something intelligent - and when did this question get so hard to answer? For Blaise Agüera y Arcas, VP at Google and founder... poggio-lab.mit.edu/blogsupda…
[blog post] Can a Neural Network Think Before It Speaks? Somewhere around 2022, an observation started making the rounds among researchers working with large language models: if you just asked a model... poggio-lab.mit.edu/blogsupda…
Feb 26
[blog post] Edge of (Stochastic) Stability made simple — Part II: the mini-batch case In Part I we had one landscape and a deterministic update. Now we have a distribution of mini-batch landscapes and a stochastic update... poggio-lab.mit.edu/blogsupda…
Feb 20
[blog post] Edge of (Stochastic) Stability made simple — Part I: A crash course on (full-batch) Edge of Stability In this part I introduce the phenomenon and what I believe are the two key mechanisms—which we’ll use as the springboard for the mini-bat... poggio-lab.mit.edu/blogsupda…
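The blurb is cut off, so the two mechanisms it mentions are not reproduced here. As background only, the threshold usually associated with full-batch Edge of Stability is a curvature of 2/η for step size η. The sketch below checks that threshold on a one-dimensional quadratic, where gradient descent contracts below it and blows up above it. The function `run_gd` and all numeric values are illustrative assumptions.

```python
def run_gd(curvature: float, lr: float, steps: int = 50, x0: float = 1.0) -> float:
    """Run gradient descent on f(x) = 0.5 * curvature * x**2 and return |x|.

    Each step multiplies x by (1 - lr * curvature), so the iterates shrink
    exactly when |1 - lr * curvature| < 1, i.e. when curvature < 2 / lr.
    """
    x = x0
    for _ in range(steps):
        x -= lr * curvature * x  # gradient of f at x is curvature * x
    return abs(x)


lr = 0.1
threshold = 2.0 / lr  # curvature at which plain GD stops contracting

below = run_gd(0.9 * threshold, lr)  # per-step multiplier is about -0.8
above = run_gd(1.1 * threshold, lr)  # per-step multiplier is about -1.2

print(f"threshold = {threshold:.1f}, below -> {below:.2e}, above -> {above:.2e}")
```

On a real network the curvature changes during training, which is what makes the phenomenon interesting. The quadratic only shows where the 2/η number comes from.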
Feb 13
[blog post] Are Transformers Just "Stochastic Parrots"? A common criticism of Large Language Models (LLMs) is that they are merely "stochastic parrots"—statistical mimics that stitch together likely patterns without genuine reasoning... poggio-lab.mit.edu/blogsupda…
CBMM retweeted
🧵 New paper: LLM-ERM: Sample-Efficient Program Learning via LLM-Guided Search arxiv.org/abs/2510.14331 We use reasoning LLMs to learn tasks like IsPrime from ~200 samples by proposing short programs, making both the learned function *and* the learning process interpretable 🤯
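The propose-then-select idea can be sketched as empirical risk minimization over a small pool of candidate programs. This is a minimal sketch under assumptions, not the paper's method: the three candidates are hand-written stand-ins for LLM proposals, and `is_prime_reference`, `empirical_error`, and the sampling range are hypothetical choices made for illustration.

```python
import math
import random


def is_prime_reference(n: int) -> bool:
    """Ground-truth labeler, used only to build the labeled samples."""
    if n < 2:
        return False
    return all(n % d for d in range(2, math.isqrt(n) + 1))


random.seed(0)
# About 200 labeled samples, echoing the sample size mentioned in the post.
samples = [(n, is_prime_reference(n)) for n in random.sample(range(2, 10_000), 200)]

# Hand-written stand-ins for short programs an LLM might propose.
candidates = {
    "is_odd": lambda n: n % 2 == 1,
    "not_divisible_by_2_3_5": lambda n: n in (2, 3, 5) or all(n % p for p in (2, 3, 5)),
    "trial_division": lambda n: n >= 2 and all(n % d for d in range(2, math.isqrt(n) + 1)),
}


def empirical_error(program, labeled) -> float:
    """Fraction of labeled samples the program gets wrong."""
    return sum(program(x) != label for x, label in labeled) / len(labeled)


errors = {name: empirical_error(program, samples) for name, program in candidates.items()}
best_name = min(errors, key=errors.get)

print(errors)
print("selected:", best_name)
```

Because the selected hypothesis is a short program, it can be read directly, and the error table shows why each rejected candidate lost.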