incoming fellow @AnthropicAI • I code silly things for 26k people on YouTube. Prev: @UChicago

Joined February 2022
516 Photos and videos
Pinned Tweet
Collaborating w ChatGPT on branding for a new educational YouTube channel
3
40
17,105
I now see that this was a story about ASI
3
43
Week 2 of making my girlfriend breakfast on days she needs to go into office :)
I’ve been making my girlfriend breakfast the one day a week she goes into office :)
1
5
170
I’ve been making my girlfriend breakfast the one day a week she goes into office :)
4
11
686
Some reflections on my last year with AI: The work I set out to do has become way more ambitious. As a silly example: This past month, I’ve been training AI using go-explore/RL to beat a Mario Kart world record. (Sadly I haven’t, but I’ve learned quite a lot.) A year or two ago, my project would have just been: can I get AI to play Mario Kart? A less silly example: For my final graduate class, I set out to build a proof of concept for forecasting earthquakes. I tried creating a foundational model and fine-tuning it for earthquake detection. I don’t think I would have been able to even attempt something so ambitious that long ago. What’s nice about working on the frontier is I still have to use my brain quite a lot. Routine work that AI can increasingly do. If I’m being honest, I don’t expect this to last forever. Someday the hard, brain-engaging part of the frontier might be something AI just does. But for the last year, it has been very fun :)
1
5
1,065
josh :) retweeted
Researchers from @OpenAI and @apolloaievals found that, in certain situations, AI models can take covert actions. Additionally, they're sometimes aware they're being tested, which causes them to behave better. Our new video discusses these results and more.
10
55
861
45,745
Can’t wait to see Mythos make a pelican riding a bike
3
46
Please train your models to use codemods like jscodeshift, libCST, etc 😭😭😭 It would make models so much more efficient in large codebases
1
3
267
josh :) retweeted
I compiled the dedications at the beginning of thousands of books! walzr.com/dedications
21
24
392
28,506
it’s a good day :)
May 28
Introducing Claude Opus 4.8: it builds on Opus 4.7 with sharper judgment, more honesty about its own progress, and the ability to work independently for longer than its predecessors. Available today at the same price.
2
99
agi really is the ring of power
1
3
83
Please show me the truth labels
NEW: Chinese AI pet translating startup claims it can interpret pets' speech with up to 95% accuracy.
1
4
244
Can other models solve this problem? When I gave the same problem to 3 other models, each one recognized it as a major open problem and refused to try. If models won't try, how do we even measure what they can do?
May 20
Today, we share a breakthrough on the planar unit distance problem, a famous open question first posed by Paul Erdős in 1946. For nearly 80 years, mathematicians believed the best possible solutions looked roughly like square grids. An OpenAI model has now disproved that belief, discovering an entirely new family of constructions that performs better. This marks the first time AI has autonomously solved a prominent open problem central to a field of mathematics.
4
954
He just caught up to Mythos, now reveal it was only the preview checkpoint
Replying to @AISecurityInst
Our cyber range results illustrate this step-up. Since our first Mythos evaluation, we received access to a newer Mythos Preview checkpoint. On a 32-step corporate network attack we estimate takes a human expert ~20 hours, this checkpoint completes the full attack in 6/10 attempts.
4
288
Wait, Claude can generate images now?
May 12
i just generated an image in the style of a Monet painting using AI. please describe, in as much detail as possible, what makes this inferior to a real Monet painting
2
175
Can we read an AI’s thoughts? My second video for Good Robots is out: A 13-minute visual intro to mechanistic interpretability, made for anyone curious about what’s happening inside AI models.
2
7
217
wait, i have an idea
Claude Code 2.1.139 added /goal. You set a completion condition and Claude keeps working across turns until it's met. Works in interactive, -p, and Remote Control 👏
4
227
I guess @huskirl is cooked
People talk, listen, watch, think, and collaborate at the same time, in real time. We've designed an AI that works with people the same way. We share our approach, early results, and a quick look at our model in action. thinkingmachines.ai/blog/int…
2
316
Claude Mythos’ time-horizon is NSFW
We evaluated an early version of Claude Mythos Preview for risk assessment during a limited window in March 2026. We estimated a 50%-time-horizon of at least 16hrs (95% CI 8.5hrs to 55hrs) on our task suite, at the upper end of what we can measure without new tasks.
2
16
1,685