Reverse engineering neural networks at @AnthropicAI. Previously @distillpub, OpenAI Clarity Team, Google Brain. Personal account.

Joined June 2010
Chris Olah retweeted
Our internal data shows Claude is accelerating AI development—a possible path to recursive self-improvement, or AI autonomously building a more capable successor. It’s happening faster than we thought, and the implications deserve greater attention. anthropic.com/institute/recu…
Chris Olah retweeted
Anthropic now has a team dedicated to AI and the rule of law — and we've just opened our first role. @AnthropicAI has studied what AI means for the economy. This team asks a different question: what will it mean for executive power, for courts and elections — and for the public deliberation that constitutional democracy ultimately rests on? We're looking for someone with real depth in both AI and the law — a legal scholar, political scientist, or experienced government hand who can reason about frontier systems and the institutions they will affect. If that's you, or someone you know: job-boards.greenhouse.io/ant…

The questions posed by AI are bigger than the AI community. We urgently need the world – religions, civil society, academics, governments – to participate in creating a positive outcome. I'm glad the Catholic Church is engaging, and honored to speak at the presentation.
Pope Leo XIV’s first encyclical, Magnifica humanitas, on preserving the human person in the age of artificial intelligence, will be released on May 25. A presentation event with the Pope and various speakers is scheduled for the same day at the Vatican. vaticannews.va/en/pope/news/…
Chris Olah retweeted
Claude's Constitution is now an audiobook, read by two of its authors, Amanda Askell and Joe Carlsmith. It includes a Q&A on the writing process, the philosophies that shaped the document, and how it might change as models become more capable. Listen at anthropic.com/constitution
Chris Olah retweeted
I’m proud that so many of the world’s leading companies have joined us for Project Glasswing to confront the cyber threat posed by increasingly capable AI systems head-on. x.com/AnthropicAI/status/204…

Introducing Project Glasswing: an urgent initiative to help secure the world’s most critical software. It’s powered by our newest frontier model, Claude Mythos Preview, which can find software vulnerabilities better than all but the most skilled humans. anthropic.com/glasswing
Chris Olah retweeted
Introducing Project Glasswing: an urgent initiative to help secure the world’s most critical software. It’s powered by our newest frontier model, Claude Mythos Preview, which can find software vulnerabilities better than all but the most skilled humans. anthropic.com/glasswing
Chris Olah retweeted
New Anthropic research: Emotion concepts and their function in a large language model. All LLMs sometimes act like they have emotions. But why? We found internal representations of emotion concepts that can drive Claude’s behavior, sometimes in surprising ways.
Chris Olah retweeted
Very proud of this amicus brief filed yesterday in the @AnthropicAI case against the Department of War from Catholic moral theologians and ethicists. The very notion of what it means to have a just war is at stake in how we respond to these matters. courtlistener.com/docket/723…
Chris Olah retweeted
I resigned from OpenAI. I care deeply about the Robotics team and the work we built together. This wasn’t an easy call. AI has an important role in national security. But surveillance of Americans without judicial oversight and lethal autonomy without human authorization are lines that deserved more deliberation than they got. This was about principle, not people. I have deep respect for Sam and the team, and I’m proud of what we built together.
I am humbled to be among the courageous leaders from Abrahamic religious traditions who put out this important statement about the @DeptofWar-@AnthropicAI dispute: faithfamilytech.org/moral-gu…. A short summary of the substance:
Chris Olah retweeted
We partnered with Mozilla to test Claude's ability to find security vulnerabilities in Firefox. Opus 4.6 found 22 vulnerabilities in just two weeks. Of these, 14 were high-severity, representing a fifth of all high-severity bugs Mozilla remediated in 2025.
Chris Olah retweeted
A statement from Anthropic CEO Dario Amodei: anthropic.com/news/where-sta…
Chris Olah retweeted
I've decided to leave OpenAI. I'm incredibly proud of all the work I've been part of here, from helping create the reasoning paradigm with @MillionInt, scaling up test-time compute with @polynoamial, working on RL algorithms with my fellow strawberries, shipping o1-preview (which started life as one of my derisking runs), to post-training o1 and o3 with @ericmitchellai, @yanndubs and many others. I'm most proud of having led the post-training team here for the last year -- the team has done incredible work and shipped some really smart models, including GPT-5, 5.1, 5.2, and 5.3-Codex. OpenAI has genuinely some of the most talented researchers I have ever met, and I have learned more than I could have imagined knowing since I joined as a new grad. I want to thank @markchen90 @FidjiSimo @sama @merettm for all their support over my time here, and too many collaborators to name for the insights, ideas, and just plain fun we have had working together. After leading post-training for a year, though, I'm longing to start fresh and return to IC research work. I've been thinking about going back to technical research for quite some time, and I genuinely believe my colleagues and team here are set up to succeed going forward without me. I'm personally very excited for my next chapter -- I'm proud to be joining @AnthropicAI to get back into the weeds in RL research, and I'm looking forward to supporting my friends there at this important time. Many of the people I most trust and respect have joined Anthropic over the last couple of years, and I'm excited to work with them again. I have also been very impressed with Anthropic's talent, research taste and values, and I'm excited to be part of what the company does next!
Chris Olah retweeted
This isn't true. Anthropic hasn't offered a "helpful-only" model without safeguards for NatSec use. Claude Gov is a custom model with extra training, including technical safeguards. (We've also had FDEs and researchers implementing it, and we run our own classifier stack.)
Very grateful to all the natsec law experts who are taking time over the weekend to provide independent legal commentary in this moment. A few that I've noticed (no doubt missing many)...

Chris Olah retweeted
A deep dive in @lawfare on the many legal problems with the Pentagon's designation of Anthropic as a supply chain risk.