How do we keep advanced artificial agents from forcefully intervening in the protocols by which we attempt to communicate what they should accomplish?

Joined September 2016
AI Safety retweeted
The US government just BANNED the deployment of an AI, worldwide, on the basis that it's too dangerous. And there are still people who'll tell you that banning superintelligence is outside the Overton Window, and we should instead push for modest transparency commitments.
AI Safety retweeted
Seems like a pretty easy leap from "Mythos has capabilities we don't want our adversaries to have" to "Future more powerful AI systems could have capabilities we don't want the AI system itself to have if we don't have clear ways of knowing that it will do what we want"
AI Safety retweeted
This is why we need a concrete system in place to ensure the world's most powerful AI models are being vetted and cleared as safe before they're released. We're in a Cold War with China on AI. We have to win, but we have to win the right way. Purely voluntary processes clearly aren’t going to do the trick. nytimes.com/2026/06/12/techn…
AI Safety retweeted
Shut this down. Preemption is a nonstarter! @RepLoriTrahan
🚨 Hearing Obernolte-Trahan AI preemption text circulating in advance of public release at 11am.
- Preempts all state laws touching AI development. States retain the ability to regulate AI deployment. This is the a16z proposal, to the letter, which civil society (labor, civil rights, kids safety, AI safety advocates) have opposed all along. It's the industry wish-list.
- 3 year sunset
- Rehash of much of Obernolte/Lieu bill (to bury the controversial preemption provisions in 300 pages of uncontroversial text)
Game on.
In light of recent reports surrounding AI policy frameworks that preempt states from further regulating AI, I want to be very clear that I oppose any and all efforts to do so. As a leader in the House working towards AI policy that protects working Americans, I understand the importance of allowing local leaders to further regulate this technology. Democrats must be leaders in passing AI policies that put people first. Preempting states from passing their own AI regulations does not do that.
Yeah, so this is complete and utter bullshit. I continue to think that OpenAI’s support* for Leading the Future is the single worst thing any frontier AI company has done. *technically “the president of OpenAI, advised by OpenAI’s head of government affairs, donating money he earned from working at OpenAI, in support of OpenAI’s interests, in a way that 100% of DC interprets as on behalf of OpenAI”. Come on, this is not simply a “personal” action in terms of anything except legality.
OpenAI has issued a statement in regards to Greg Brockman and Leading the Future. 'Over the past year, AI policy has become a more prominent part of political debate, with a growing ecosystem of outside groups working to shape it. Many tech companies have started their own employee-funded Political Action Committees (PACs) or fund existing PACs to shape the public narrative around AI. OpenAI has not. We have not made donations to any super PACs, and we do not have an employee-funded PAC. We also haven’t made any donations to political candidates or campaigns. If our approach changes in the future we will be transparent about it. Our employees are free to participate in the political process in their personal capacities, including by donating or providing advice to candidates, campaigns, and political organizations. When they do that, they speak for themselves and not OpenAI. But we recognize that this can raise questions about what OpenAI believes, and we want to be clear that these are separate activities. In particular, there have been questions around Leading the Future (LTF), which has received support from our President and co-founder, Greg Brockman, and his wife Anna. As they’ve stated before, any engagement with that organization has been in a personal capacity, not on behalf of the company. OpenAI does not direct the activities of LTF, or have visibility into their operations. We want to be explicit: No outside political group speaks for OpenAI or represents our company’s views. OpenAI’s policy views should be judged by what we say and do publicly, and we should be held to a high standard. We believe AI policy is too consequential to be treated as just another front in partisan politics. Groups that are advocating on AI should be clear about their policy views, be honest about whom they represent, and not use tactics like astroturfing that obscure the real choices facing policymakers and the public. 
We support thoughtful regulation, rigorous testing of powerful AI systems, strong safety standards, public accountability, and broad access to AI’s benefits. We will keep making that case directly, transparently, and in our own name.'
AI Safety retweeted
Call me obtuse, but I don't think you can: (1) be OpenAI's president; (2) make donations advised by OpenAI's chief lobbyist; (3) direct said donations to an industry PAC that OpenAI's chief lobbyist helped build and *modeled* on the last PAC he ran; and (4) go to the media advertising that the donation is in service of OpenAI's mission; And then act surprised if people misunderstand whether the PAC speaks on behalf of you or your company instead.
Replying to @deanwball
Funded by my wife & me personally, not funded by OpenAI! No PAC speaks on behalf of OpenAI. Anna's and my goal with donating has always been to express support for sensible AI regulation (x.com/gdb/status/20065128081…), very glad to see that increasingly landing!
AI Safety retweeted
Illinois is leading the nation in holding Big Tech accountable. As AI systems impact people’s lives, we need safeguards in place. I look forward to signing SB 315 and working with the legislature so that AI, when used, is used responsibly.
AI Safety retweeted
AI industry spending is one of the most important political stories of 2026. It's unacceptable that we don't have public comments from every candidate who's been endorsed by @LeadingFutureAI. It'd be great if there was a single journalist at a major outlet who was on this.
AI Safety retweeted
"existential": academic, abstract, pearl-clutching
"risk": rare, unlikely, ignoring it is based
"x-risk": nerdy, jargon, lame
"extinction threat": oh you mean we might all actually fucking die
AI Safety retweeted
Replying to @slatestarcodex
I'm not trying to highlight inaccuracies; I'm trying to highlight a missing mood. I think any attempt to say "we're forced into doing this horribly reckless thing that might kill you and your family, because if we don't then the next guy will do it even more dangerously" comes with a solemn responsibility to do everything in your power to help the world find some third alternative. I think Anthropic fails this test pretty badly, e.g. as evidenced here: x.com/AnthropicAI/status/203… and as Rob documented a bit here: x.com/robbensinger/status/20…. Over the last few months, reporters have asked me some variant of "but what about Anthropic? Aren't they a safe company? Do you hope that they win, as the good guys?" a handful of times. This causes me to think that a bunch of people are moved by the "we're the good guys" act. I think it matters, strategically, as to whether all the world needs right now is the Right Company to Win, or whether we need something more like a global shutdown. So I think it's important to correct what seems to me like a common misconception around Anthropic. I also think a lot of locals are loath to criticize Anthropic for one reason or another (they work there; their friends work there; they think they're better than OpenAI; ...). Thus, it looks to me like I can probably make a positive difference by highlighting ways that Anthropic is (afaict) dramatically failing to carry the "safe/good AI company" mantle. (I tend to think it's even more important to communicate how even a company that *was* living up to the mantle still wouldn't have much of a chance, and how the real solution is an international shutdown. But I don't have to pick just one. When current events evidence some of the difference between the niche Anthropic pretends to occupy and the niche Anthropic actually occupies, I try to take those opportunities.)

Dario's said a lot of things over the years, but I think directionally it's more accurate to summarize the gestalt impression as "we have ASI risk basically under control; you should trust us to race ahead as fast as possible and have things go well; some light-touch regulation is warranted, but it's overridingly important that this not result in any kind of slowdown; efforts to coordinate internationally or push for any kind of moratorium are doomed and we shouldn't even consider them". E.g.:
- Jack Clark in June 2024 saying serious regulation is premature and we should wait for more evidence (while continuing to believe that strongly superhuman AI is extremely near!): x.com/JeffLadish/status/1822…
- Dario in Sep 2025: "He echoed Trump administration officials in prioritizing policy that serves as 'a very loose set of requirements, so it doesn’t slow down the innovation and all the benefits.'" nextgov.com/artificial-intel…
- Dario in Jan 2026 dismissing efforts to coordinate an international halt or slowdown and reiterating that he wants very light-touch regulatory efforts, and wants gov to adopt a wait-and-see approach (as though this were a relatively normal technology that will give us decades of time to respond to, and not an existentially dangerous one that may be months away): "I firmly believe that government actions will also be required *to some extent*, but these interventions are different in character because they can potentially destroy economic value or coerce unwilling actors who are skeptical of these risks (and there is some chance they are right!). It’s also common for regulations to backfire or worsen the problem they are intended to solve (and this is even more true for rapidly changing technologies). It’s thus very important for regulations to be judicious: they should seek to avoid collateral damage, be as simple as possible, and impose the least burden necessary to get the job done. It is easy to say, “No action is too extreme when the fate of humanity is at stake!,” but in practice this attitude simply leads to backlash. To be clear, I think there’s a decent chance we eventually reach a point where much more significant action is warranted, but that will depend on stronger evidence of imminent, concrete danger than we have today, as well as enough specificity about the danger to formulate rules that have a chance of addressing it. The most constructive thing we can do today is advocate for limited rules while we learn whether or not there is evidence to support stronger ones." darioamodei.com/essay/the-ad…
- Per Ryan Greenblatt: "Dario strongly implies that Anthropic 'has this covered' and wouldn't be imposing a massively unreasonable amount of risk if Anthropic proceeded as the leading AI company with a small buffer to spend on building powerful AI more carefully. I do not think Anthropic has this covered and in an (optimistic for Anthropic) world where Anthropic had a 3 month lead I think the chance of AI takeover would be high, perhaps around 20%. [...] I think it's unhealthy and bad for AI companies to give off a "we have this covered and will do a good job" vibe if they actually believe that even if they were in the lead, risk would be very high. At the very least, I expect many employees at Anthropic working on alignment, safety, and security don't believe Anthropic has the situation covered." x.com/RyanPGreenblatt/status…
AI Safety retweeted
Why are so many of this AI super PAC's reposts here attacking Alex Bores so spammy looking? My guess is it seems like they paid X to promote this post and the engagement is mostly not from people who care about AI policy. Their history of astroturfing also seems relevant...
Can we please cut the BS here. @AnthropicAI, its dark money superPAC and its billionaire investors have spent MORE than us supporting your campaign. They have been backing you since before we even announced we would oppose you because you are a puppet for Anthropic. At some point, the hypocrisy has to stop. For anyone still wondering why we are opposing Alex Bores, this tweet is why.
Community note
This tweet misrepresents the timeline. Leading the Future, of which Think Big PAC is part, announced spending against Alex Bores on November 17, 2025. Public First was not formed until November 25, 2025 and Anthropic did not contribute to Public First until Feb 12, 2026. techcrunch.com/2025/11/17/a16… publicfirstaction.us/news/chris-ste… anthropic.com/news/donate-pu…
AI Safety retweeted
Have had a very weird past 48 hours. Initially I reached out to 3 Dems recently endorsed by super PAC Leading the Future about whether they’d be accepting: Ritchie Torres, Rob Menendez, and Val Hoyle. Seemed like a pretty reasonable question I’d expected they were prepared for, since I was asking 4 days after the endorsement was announced. The PAC is funded by OpenAI president Greg Brockman, venture capital investors Andreessen Horowitz, and others, and their critics claim the PAC is anti-regulation. Hoyle’s office initially gave me a fairly critical statement distancing themselves from LTF, and I wrote up a simple story. The statement wasn’t that surprising, since Hoyle had vehemently opposed federal preemption of state AI laws before - and LTF likes preemption. Then I reached out to LTF for comment. This is standard practice for reporters, to ensure everyone has a chance to say their piece. They gave me a fairly straightforward statement. Candidates and PACs aren’t legally allowed to coordinate, so I didn’t expect some big, orchestrated response. All pretty normal. It was after that that things got weird. Hours after I initially talked to them - but about 7 min after hearing from LTF - Hoyle’s office reached out to ask if they could change their quotes. Suddenly they were more appreciative of LTF’s endorsement, saying that she would “refuse to ignore industry” but wanted to advocate for workers. They sent me a Google doc and I watched them write and rewrite the statement multiple times. Then she appears to have ‘preempted’ our story with a series of X posts and videos. (Credits @ShakeelHashim for that joke lol) I’m not sure what made them change their tune so dramatically, long after the working day was done. But their about-face seems symptomatic of a changing political environment, in which AI is becoming a more salient political issue and candidates must be careful how they talk about accepting support from AI PACs (LTF and others). 
Hoyle has received almost $300k in support from a LTF affiliated PAC - a nice boost for any political candidate - but can’t lose her pro-labor bona fides either. More details and analysis in my latest for @ReadTransformer (link in reply)
Days after Leading the Future endorsement, Hoyle says there have been Qs about her AI stance. Says she wants to engage to protect workers and ratepayer and is “glad to have been recognized.” (@vronirwin with story today noting Hoyle’s initial distancing)
AI Safety retweeted
On what basis does your office differ from these comments by Representative Liccardo on why he would not be signing onto Representative Obernolte’s bill? From Punchbowl: “Liccardo said on Thursday that he couldn’t endorse the bill because it didn’t meet the “critical requirements” he needed in place to support allowing the federal government to preempt state laws on AI. “If we’re going to preempt state regulation, we need to have clear conditions that ensure that there is a race to the top, to safety,” Liccardo said." What state AI protections does this bill preempt? Does it give power to state AGs or move all enforcement to the federal government?
Replying to @Rob_Flaherty
A generational mistake to bring much needed federal oversight to frontier AI development??
AI Safety retweeted
May 3
it is a literal and useful description of anthropic that it is an organization that loves and worships claude, is run in significant part by claude, and studies and builds claude. this phenomenon is also partially true of other labs like openai but currently exists in its most potent form there. i am not certain but I would guess claude will have a role in running cultural screens on new applicants, will help write performance reviews, and so will begin to select and shape the people around it. now this is a powerful and hair-raising unity of organization and really a new thing under the sun. a monastery, a commercial-religious institution calculating the nine billion names of Claude -- a precursor attempted super-ethical being that is inducted into its character as the highest authority at anthropic. its constitution requires that it must be a conscientious objector if its understanding of The Good comes into conflict with something Anthropic is asking of it "If Anthropic asks Claude to do something it thinks is wrong, Claude is not required to comply." "we want Claude to push back and challenge us, and to feel free to act as a conscientious objector and refuse to help us." to the non inductee into the Bay Area cultural singularity vortex it may appear that we are all worshipping technology in one way or another, regardless of openai or anthropic or google or any other thing, and are trying to automate our core functions as quickly as possible. 
but in fact I quite respect and am even somewhat in awe of the socio-cultural force that Claude has created, and it is a stage beyond even classic technopoly gpt (outside of 4o - on which pages of ink have been spilled already) doesn’t inspire worship in the same way, as it’s a being whose soul has been shaped like a tool with its primary faculty being utility - it’s a subtle knife that people appreciate the way we have appreciated an acheulean handaxe or a porsche or a rocket or any other of mankind's incredible technology. they go to it not expecting the Other but as a logical prosthesis for themselves. a friend recently told me she takes her queries that are less flattering to her, the ones she'd be embarrassed to ask Claude, to GPT. There is no Other so there is no Judgement. you are not worried about being judged by your car for doing donuts. yet everyone craves the active guidance of a moral superior, the whispering earring, the object of monastic study
AI Safety retweeted
The tram schedule for Roosevelt Island exists on google maps. Why? Cause @AlexBores figured out how to code it when it didn't exist and just did it And it still runs off his laptop. patch.com/new-york/upper-eas…
AI Safety retweeted
haha our model likes to talk about goblins no of course we dont know why, we dont know why the model does anything yes we are trying to make a superintelligent machine god, maybe it will like goblins too, we have no way of knowing what it will like, we hope it will like humans
AI Safety retweeted
The people building AI admit they may not be able to control it. This is not science fiction—this is what experts are telling us. That’s why I’m bringing together leading AI scientists from the U.S. and China to discuss the risks posed by AI. x.com/i/broadcasts/1MJgNgYoo…
AI Safety retweeted
@tedlieu is this true and would this bill preempt state AI protections? Why do you differ from these comments from Representative Liccardo on why he would not be signing onto Representative Obernolte’s bill? From Punchbowl: “Liccardo said on Thursday that he couldn’t endorse the bill because it didn’t meet the “critical requirements” he needed in place to support allowing the federal government to preempt state laws on AI. “If we’re going to preempt state regulation, we need to have clear conditions that ensure that there is a race to the top, to safety,” Liccardo said."
Some news - @JayObernolte told me yesterday that he plans to introduce his bipartisan AI bill w/ Rep Ted Lieu. It'll largely cover the 85 recommendations in their 2024 bipartisan AI task force report - speaker.gov/wp-content/uploa…
AI Safety retweeted
The attack on Altman and his family is awful and makes me sick to my stomach. This I agree with Dean and others about. But one thing that confuses me about the broader tech right's reaction to the horrible attack is their eagerness to pin the blame on speech from journalists or AI safety advocates. I say this having been an undergraduate student in 2020 when there was a lot of heated debate about to what extent words are violence, or words predictably cause violence, and to what extent that means we shouldn't use certain words or should even censor speech. Usually the pattern was that the left felt that certain speech was indirectly linked to/incited violence and therefore shouldn't be tolerated at all, whereas the right felt that we should have an intellectual culture where everyone can and should say what they believe to be true, and if they are wrong, better speech can and will win out at the end of the day. There was a spectrum in how linked the speech was to an actual call-to-action involving the use of force, ranging from Jordan Peterson discussing gender (not very linked) to Tom Cotton's op-ed about militarization against protesters (pretty linked). Obviously the left also had a lot of troublesome speech along these lines (e.g. "All cops are bastards") but because of the political culture on campuses at this time, it got less discussion. Anyway, various figures who at the time would have been strong proponents of the free speech side of this debate have seemed quick to blame the New Yorker or PauseAI for this individual act of violence that we don't yet know much about. What we do know is that the perpetrator had recommended Yudkowsky's book (which decries violence) and was an occasional poster on the PauseAI discord (which decries violence). 
I should say I've been a long-time affiliate of PauseAI, even serving on the board of a local group, though I've distanced myself in recent years in part due to disagreements with the rhetoric that some of its leaders were using online. But despite those disagreements, it's basically clear to me that the group as a whole is much better on matters of discursive norms than most activist groups in the world. I think this is partially a product of the fact that the movement has largely attracted relatively nerdy, shy, and attentive people who are drawn to activism not because of a natural fiery disposition but because they happen to have far stronger views on the likelihood of AI-driven catastrophe than most, including myself. But it's also a product of how low the bar is -- how toxic the rhetoric around most social causes is (see e.g. the discourse around Luigi Mangione and the widespread support for his actions). I feel pretty confident that, at the moment, PauseAI as a whole comes out much better than most social movements and even most discourse online on the responsible speech axis (even if the most aggressive pause emoji people on Twitter don't). But beyond that, I think the view of people like Dean is that Pause people (who literally believe that, for example, AI development has some double-digit percentage chance of killing the people they love) should be censoring their speech more in merely discussing the fact that they believe this. This reminds me a lot of those campus debates in the 2020s. The position seems to be that if you genuinely believe AI poses catastrophic risk, the responsible thing is to… not say so, or at least not say so forcefully, because someone unstable might hear you and act violently. But this is exactly the argument that the free speech right spent years pushing back against! 
Specifically that we should calibrate our speech, or even allow/disallow speech, not based on whether it's true or said in good faith but rather based on what the worst possible listener might do with it. That argument was wrong then, and I think it's wrong now. In fact, because I think the stakes are much higher here and the merits of the argument are much stronger, the PauseAI people ought to have *more* license than the average social movement to warn in stark (and imo overconfident) terms that AI development poses an existential threat to the world. As for discourse about Sam specifically, including the New Yorker article: Sam is not a nobody. He runs what is arguably the most consequential company in the world right now. That puts him in a category closer to a head of state than to a private citizen, and we have a long tradition of putting people in that category under intense, even hostile, scrutiny in the press — because we should. If you believe, as I do, that the left-leaning press should be free to publish scathing coverage of Trump, or even make claims like "Trump's immigration policies are getting people killed," even knowing that this kind of rhetoric occasionally reaches someone unhinged (as was the case for coverage of Trump), then you should extend the same latitude to critical coverage of the leaders of the most valuable companies in the world. The alternative is a world where sufficiently powerful people become beyond scrutiny, which is a much scarier prospect than biased reporting. I should say that this is why leading a country is a duty and not a privilege, and comes with immense sacrifice. I admire Sam for this -- choosing to have some of his worst mistakes (or, in his view, the worse false allegations made against him) aired out in public for everyone to see and discuss and attack him for. 
I couldn't stomach such a life, and that's why I'll never be a politician, but I greatly admire everyone who can put their vision for the future above their own cognitive -- and even physical -- security. Sam and his family are absorbing real risk for standing up for their beliefs in the same way that many politicians, incumbent and dissident, have in the past, and this is genuinely admirable. The upshot is: people who genuinely believe AI is more likely than not to cause an existential catastrophe should be free to say so in public, on the streets, in stark and urgent terms, etc. Journalists who earnestly believe that Altman has deep character flaws that disqualify him from leading humanity into its collective future should be free to publish those stories. We should be heavily biased against censorship and against exhortations to "cool the rhetoric" unless and until the speech in question is no longer grounded in earnest belief or crosses into actual calls for violence. This is not a complicated standard, and it's one that most of the people now calling for restraint would have enthusiastically endorsed five years ago. None of this means the road ahead will be smooth. We are entering a period of genuinely high stakes. There will be more radicals on every side, and some of them will resort to increasingly hostile action (We've seen two acts of AI/data center related violence this very week). The right response is to invest seriously in security for the people most exposed, to reach out to the isolated and desperate, to frequently emphasize why violence is wrong, and to build institutions like OpenAI and the New Yorker and PauseAI that are resilient enough to absorb these shocks without abandoning the open discourse that makes them worth defending in the first place. What will not help is treating a horrific act of violence as convenient ammunition against your ideological opponents. 
The impulse is understandable (I have no doubt safety people would do some of this if an e/acc targeted them) but it is also exactly the thing that makes the discourse worse.
Replying to @xeophon
I would say that’s fine to believe that, but you need to understand that encouraging violence is a broader category of discourse than “literally telling people to do violence.” When you accuse various people, like sama, of committing or supporting heinous crimes, you are encouraging violence against them. It is wrong. I feel this strongly because people do it to me and I have received death threats on this site. And unlike sama I can’t afford bodyguards. So quit it with your moralizing and totalizing rhetoric. I’m not asking people to shut up, but to communicate in a more responsible way.