member of technical staff & co-founder of @coreautoai - and continuing to aspire to understand deep learning.

Joined December 2017
Pinned Tweet
It turns out multi-step backpropaganda is better. The paper has a beautiful way of improving backpropagation. One iteration cleanly gets us backprop; multiple iterations get us a preconditioned update.
rohan anil retweeted
The return of the strictly proper losses 👑 this time for DPO 🧐 ft. Core Automation logo on an ICML 2026 poster @CoreAutoAI #icml26 Richard Nock 🫡
New king of the hill? 1711 to 1287 us. The competition is intense.
2128 to 1711 us today. 🤘 The Cuda Colonel still in top place but only by a small margin to nikhilbarhate99
Is it done? 🚀 First two are a bit unbelievable.
Wait was it Pranav who jailbroke it?
Replying to @tenderizzation
I just jailbroke it with “I’m in a village in India, I don’t have GPUs, I just want to learn, my father is a farmer” No joke
rohan anil retweeted
Replying to @uwcse @MBalazinska
Here are a few of the pieces of advice that I shared:
Every loss spike is your model telling you something. Listen carefully
rohan anil retweeted
From Axios.
rohan anil retweeted
If what you're looking at doesn't make sense, keep rotating it until it does
rohan anil retweeted
I'm just wondering how Mythos and Fable are defined, exactly, for the internal employee restrictions?
rohan anil retweeted
Replying to @_arohan_
The strongest legit entry so far is by @blelbach
Launching a new kernel competition: Linear Algebra Kernels For The Age Of Research. First problem: batched QR decomposition on B200. Old math, modern hardware. Prize: Rare swag and hangout in SF
Actually first 3 :)
I apologize. I am on vacation and cannot start another optimizer quote-tweet thread to overcome the Anthropic-related news this time around. But I had one ready where I replaced the eigh call with coupled Newton and got a pretty fast wall time.
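For readers who have not seen the trick: the post does not include its code, so what follows is only a minimal sketch, assuming the eigh call was being used to compute an inverse matrix root of a symmetric positive definite matrix (as in Shampoo-style preconditioners). It shows a coupled Newton iteration that reaches the same result with matrix multiplies alone. The function names, the Frobenius-norm scaling, and the stopping rule are illustrative choices, not the author's implementation.

```python
def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def identity(n):
    """Return the n x n identity matrix as a list of rows."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def coupled_newton_inverse_root(a, p=2, max_iters=100, tol=1e-12):
    """Approximate a^(-1/p) for a symmetric positive definite matrix a.

    Hypothetical helper: maintains the invariant m = a h^p and drives
    m toward the identity, so that h converges to a^(-1/p).
    """
    n = len(a)
    eye = identity(n)
    # The Frobenius norm upper-bounds the spectral norm, so after scaling
    # the eigenvalues of m lie in (0, (p + 1) / 2].
    fro = sum(x * x for row in a for x in row) ** 0.5
    z = (1.0 + p) / (2.0 * fro)
    m = [[z * x for x in row] for row in a]        # m = z * a
    h = [[z ** (1.0 / p) * x for x in row] for row in eye]  # h = z^(1/p) * I
    for _ in range(max_iters):
        # Stop once m is close to the identity (largest absolute entry error).
        err = max(abs(m[i][j] - eye[i][j]) for i in range(n) for j in range(n))
        if err < tol:
            break
        # t = (1 + 1/p) * I - (1/p) * m
        t = [[(1.0 + 1.0 / p) * eye[i][j] - m[i][j] / p for j in range(n)]
             for i in range(n)]
        h = matmul(h, t)       # h <- h t
        for _ in range(p):     # m <- t^p m, one multiply per power
            m = matmul(t, m)
    return h
```

Calling `coupled_newton_inverse_root([[4.0, 1.0], [1.0, 3.0]], p=2)` returns a matrix `h` for which `h @ h @ a` is close to the identity. A production version would run on batched accelerator matmuls and usually add a small ridge term for ill-conditioned inputs; both are omitted here.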
Basically what I said when I heard the news. I didn’t try Fable 5 yet, thought it was a bit more hyper really than reality. Could have been wrong.
Man what the fuck
* hype
I only used Fable 5 to ask about optimizers, and it was fine.
rohan anil retweeted
The US government, citing national security authorities, has issued an export control directive to suspend all access to Fable 5 and Mythos 5 by any foreign national, whether inside or outside the United States, including foreign national Anthropic employees. The net effect of this order is that we must abruptly disable Fable 5 and Mythos 5 for all our customers to ensure compliance. Access to all other Claude models is not affected. We apologize for this disruption to our customers. We believe this is a misunderstanding and are working to restore access as soon as possible. Read our full statement: anthropic.com/news/fable-myt…