Daniel Kunin

Tweets

Pinned Tweet

Daniel Kunin @KuninDaniel

Apr 24

For the last few years, a lot of my work has been driven by the feeling that deep learning is not magic — there are principles, mechanisms, and laws waiting to be understood. This paper is our attempt to say that clearly!

Jamie Simon @learning_mech

Apr 24

1/ Deep learning is going to have a scientific theory. We can see the pieces starting to come together, and it's looking a lot like physics! We're releasing a paper pulling together these emerging threads and giving them a name: learning mechanics. 🔨 arxiv.org/pdf/2604.21691 🔧

9,313

Daniel Kunin @KuninDaniel

Jun 5

New work by @_AmilDravid, extending his really neat work on Rosetta Neurons through the lens of scaling laws! A nice example of how mech. interp and learning mechanics can complement one another, bringing different perspectives that together lead to deeper insights

Amil Dravid

@_AmilDravid

Jun 5

Scaling laws describe how loss changes with scale. Do neurons inside models change predictably too? We study vision and language models up to 30B params and find systematic scaling in neuron universality, specialization, and selectivity. Paper code: avdravid.github.io/rosetta-n… 1/n

991

Daniel Kunin retweeted

Adam Shai

@adamimos

Feb 23

A longstanding dream of interp is to decompose activations into distinct, interpretable parts. But when should we expect that to work, and what even are such parts? New from Simplex: transformers factor their world into orthogonal subspaces, even when it costs accuracy.🧵👇

602

74,668

Daniel Kunin @KuninDaniel

May 1

Excited to share that our paper “Sequential Group Composition: A Window into the Mechanics of Deep Learning” was accepted to ICML 2026 in Seoul! Co-led with @giovannimarchet and @AdeleMyersPhD @hopfbifurcator @ninamiolane Paper: arxiv.org/abs/2602.03655

Sequential Group Composition: A Window into the Mechanics of Deep Learning

How do neural networks trained over sequences acquire the ability to perform structured operations, such as arithmetic, geometric, and algorithmic computation? To gain insight into this question,...

arxiv.org

238

72,250

Daniel Kunin @KuninDaniel

May 1

Yes, by leveraging associativity. We explicitly construct efficient solutions: RNNs can compose elements sequentially in k steps, while deep MLPs can compose adjacent pairs in parallel in log k layers, and we find preliminary evidence that GD can discover these solutions!
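The associativity argument in the reply above can be illustrated with a small sketch. It does not model any neural network or the paper's construction; it only compares a left-to-right fold (the sequential, RNN-like schedule) with a balanced pairwise reduction (the parallel, deep-MLP-like schedule) on a non-commutative group. The permutation-tuple encoding and the names `compose`, `sequential_compose`, `parallel_compose`, and `demo` are assumptions made for illustration.

```python
def compose(p, q):
    """Compose two permutations stored as tuples: apply p first, then q."""
    return tuple(q[i] for i in p)


def sequential_compose(elements):
    """Fold left to right from the identity: one composition per input element."""
    result = tuple(range(len(elements[0])))  # identity permutation
    steps = 0
    for g in elements:
        result = compose(result, g)
        steps += 1
    return result, steps


def parallel_compose(elements):
    """Compose adjacent pairs layer by layer; each layer roughly halves the length."""
    layer = list(elements)
    depth = 0
    while len(layer) > 1:
        nxt = [compose(layer[i], layer[i + 1]) for i in range(0, len(layer) - 1, 2)]
        if len(layer) % 2 == 1:
            nxt.append(layer[-1])  # an unpaired last element passes through unchanged
        layer = nxt
        depth += 1
    return layer[0], depth


# Illustrative input: k = 8 elements of S3, which is non-commutative,
# so the order of the sequence matters but the bracketing does not.
a, b, c = (1, 0, 2), (0, 2, 1), (1, 2, 0)
demo = [a, b, c, a, b, c, a, b]

seq_result, seq_steps = sequential_compose(demo)
par_result, par_depth = parallel_compose(demo)
```

Because composition is associative, both schedules return the same group element; the sequential version uses k compositions in a row, while the pairwise version needs only about log2(k) layers of compositions.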

921

Daniel Kunin @KuninDaniel

May 1

For me, this paper is learning mechanics in action! Mech interp first identified that NNs use Fourier features in algebraic tasks - great work @bilalchughtai_ @justanotherlaw @NeelNanda5 Learn mech asks why training produced those features, in that order, with that architecture

678

Daniel Kunin retweeted

Sebastien Bubeck

@SebastienBubeck

Apr 26

From "Mathematical theory of deep learning: Can we do it? Should we do it?" to "There Will Be a Scientific Theory of Deep Learning". These are, respectively, the title of a talk I gave four years ago and the title of an arXiv paper from four days ago. I really like the "learning mechanics" perspective (think of it as a continuation of "statistical mechanics", "quantum mechanics", and so on). Several of my last academic papers can be viewed through that lens (e.g. Learning threshold neurons via the “edge of stability”; or LEGO). I'm not as optimistic as the authors of the recent arXiv paper that we will EVER be able to reach what the "physics mechanics" field has achieved, but it's certainly worth trying. Talk: youtu.be/3uRD_lg701k?si=yjLY… Paper: arxiv.org/abs/2604.21691

Mathematical theory of deep learning: Can we do it? Should we do it?

Extended motivational speech to study deep learning mathematically....

youtube.com

426

53,038

Daniel Kunin retweeted

Krzakala Florent @KrzakalaF

Apr 25

We must know. We will know.

Statistics (Machine Learning) Papers @StatsPapers

Apr 24

There Will Be a Scientific Theory of Deep Learning Jamie Simon, Daniel Kunin, Alexander Atanasov, Enric Boix-Adserà, Blake Bordelon, Jeremy Cohen, Nikhil Ghosh, Florentin Guth, Arthur Jacot, Mason Kamb, Dhruva Karkada, … arxiv.org/abs/2604.21691 [𝚜𝚝𝚊𝚝.𝙼𝙻 𝚌𝚜.𝙻𝙶]

In this paper, we make the case that a scientific theory of deep learning is emerging. By this we mean a theory which characterizes important properties and statistics of the training process, hidden representations, final weights, and performance of neural networks. We pull together major strands of ongoing research in deep learning theory and identify five growing bodies of work that point toward such a theory: (a) solvable idealized settings that provide intuition for learning dynamics in realistic systems; (b) tractable limits that reveal insights into fundamental learning phenomena; (c) simple mathematical laws that capture important macroscopic observables; (d) theories of hyperparameters that disentangle them from the rest of the training process, leaving simpler systems behind; and (e) universal behaviors shared across systems and settings which clarify which phenomena call for explanation. Taken together, these bodies of work share certain broad traits: they are concerned with

3,383

Daniel Kunin retweeted

Arthur Jacot @ArthurJacot3

Apr 24

Very happy to be part of this project. We've compiled the main reasons why a Theory of Deep Learning is possible if not inevitable!

Jamie Simon @learning_mech

Apr 24

2,001

Daniel Kunin retweeted

Cengiz Pehlevan @CPehlevan

Apr 24

Great perspective on the theory of deep learning from a stellar group of authors! Physics-inspired ideas will play a central role in shaping this field. Congrats to my group alumni @blake__bordelon and @ABAtanasov for their contributions here and across many influential papers.

Jamie Simon @learning_mech

Apr 24

3,552

Daniel Kunin @KuninDaniel

Apr 24

100% agree. Neuroscience embraces studying the brain at multiple levels — computational, algorithmic, and implementational. I’m excited to see deep learning moving toward the same conversation, with theory and interpretability informing each other!

Eric J. Michaud

@ericjmichaud_

Apr 24

It's been so heartening to see deep learning theory folks engage seriously with interpretability recently, and I hope these two communities can talk much, much more. We should seek a unified understanding of neural networks across many levels of analysis.

1,741

Daniel Kunin retweeted

Surya Ganguli

@SuryaGanguli

Apr 24

Great to see the next generation taking the lead in the science of deep learning! Also proud that two brilliant members/alumni of my group are a part of this: @KuninDaniel & @MasonKamb

Jamie Simon @learning_mech

Apr 24

100

10,732

Daniel Kunin retweeted

Stat.ML Papers @StatMLPapers

Apr 24

There Will Be a Scientific Theory of Deep Learning ift.tt/FIXLaes

There Will Be a Scientific Theory of Deep Learning

In this paper, we make the case that a scientific theory of deep learning is emerging. By this we mean a theory which characterizes important properties and statistics of the training process,...

arxiv.org

317

28,380

Daniel Kunin @KuninDaniel

Apr 24

Was definitely intimidated going in, but this ended up being a lot of fun — thanks to @kanjun and @joshalbrecht for being such great hosts!

Imbue

@imbue_ai

Apr 24

Deep learning works extraordinarily well. And we still largely don't know why. A new paper from @learning_mech, @KuninDaniel, and 12 co-authors argues that a scientific theory of deep learning is emerging, and coins a name for the emerging field: learning mechanics. We sat down with Jamie and Dan on Generally Intelligent to talk about what a physics of deep learning would actually look like, why now, and what's left to figure out.
3:05 Learning mechanics as the physics to mechanistic interpretability's biology
4:13 Why deep learning needs a theory
7:07 Why deep learning is uniquely hard to engineer
12:11 How a week in the woods became a paper
25:59 The barrier to theory isn't opacity, but complexity
36:26 Deep learning's first gas law
47:22 Why more particles makes the problem easier
56:22 The discretization hypothesis
1:01:50 The strongest signal that a compact theory exists
1:05:07 The Platonic Representation Hypothesis
1:15:41 Why learning mechanics and mech interp need each other
1:25:29 Theory as safety infrastructure

1:33:22

399

Daniel Kunin retweeted

Josh Albrecht

@joshalbrecht

Apr 24

AI and deep learning aren't magic. Someday people will look back and laugh at how little we understood about these technologies.

Jamie Simon @learning_mech

Apr 24

1,948

Daniel Kunin retweeted

Kanjun 🐙

@kanjun

Apr 24

There will be a scientific theory of deep learning 👇 One of the most interesting and accessible papers I’ve read on deep learning theory released today. It names the field of “learning mechanics” — if mechinterp is the biology of LLMs, learning mechanics is the physics.

Jamie Simon @learning_mech

Apr 24

156

22,795