Rob Bensinger ⏹️

Tweets

Pinned Tweet

Rob Bensinger ⏹️

@robbensinger

Apr 8

Who should I add to this? Also, did I get anyone's view wrong?

402

66,176

Rob Bensinger ⏹️ retweeted

David Abecassis

@Volty

Jun 4

Replying to @AnthropicAI

This is a needed and candid post by the Anthropic Institute. I agree with the conclusion that we need more time before we are hit with the “immense implications” of AI technology. My team at the Machine Intelligence Research Institute has worked to detail an international agreement (techgov.intelligence.org/blo…) which satisfies the requirements which are laid out by Anthropic: that a pause must include all frontier AI developers anywhere on Earth and must be mutually verified. Our contribution includes answering how to address the technological particulars of verifying a frontier AI development pause, and how to structure the agreement for stability and effectiveness. Our work is a model, and we would welcome collaboration with Anthropic to further develop and refine it.
Some important points for enabling international coordination:
- We task governments, rather than labs, to coordinate and verify the pause, because they have the diplomatic and national intelligence means to do so and they can architect binding rules that apply to everyone.
- The United States is capable of halting frontier AI development globally, unilaterally and/or through coordination with key allies. While this is not preferred to a broadly coordinated halt, it strengthens the US’s hand in negotiating one.

New Report: An International Agreement to Prevent the Premature Creation of Artificial Superintel...

Nov 18, 2025 - We at the MIRI Technical Governance Team have released a report describing an example international agreement to halt the advancement towards artificial superintelligence. The agreem...

techgov.intelligence.org

1,924

Rob Bensinger ⏹️ retweeted

Jeffrey Ladish

@JeffLadish

12h

The most dangerous thing about Mythos is probably speed-up of AI development, nudging the world closer to full RSI and actual superintelligence. This is far more concerning than the models’ cyber or bio capabilities.

10,587

Rob Bensinger ⏹️ retweeted

Dwarkesh Patel

@dwarkesh_sp

16h

Joseph Henrich's idea that culture is "the secret of our success" is sometimes read as a conservative idea. We should imitate our elders and defer to tradition, because that's the way humanity has always succeeded. But actually, whether this is right totally depends on the type of environment you're in. The faster the environment is changing, the less value there is in cultural learning, and the better it is to just try to figure stuff out for yourself. And today's world is changing way faster than the ancestral environment!

179

24,194

Rob Bensinger ⏹️ retweeted

Andy Masley

@AndyMasley

18h

Seems like a pretty easy leap from "Mythos has capabilities we don't want our adversaries to have" to "Future more powerful AI systems could have capabilities we don't want the AI system itself to have if we don't have clear ways of knowing that it will do what we want"

141

5,436

Rob Bensinger ⏹️ retweeted

Robert Herr ⏹️

@krherr

22h

Europe 2031 is well-intentioned but regrettably timid. The authors hide the existential risk by AI in a fold-out FAQ section instead of addressing it head-on and this is just one case of them not taking their own premises seriously. Decision-makers need the truth. They won't make better decisions if you sugarcoat how dire the situation is. For Europe itself, the train has likely already left the station anyway.

3,106

Rob Bensinger ⏹️ retweeted

AI Notkilleveryoneism Memes ⏸️

@AISafetyMemes

21h

Fun fact: Marc "It's Time To Build" Andreessen went OUT OF HIS WAY to block homes from being built in his town - the most expensive zip in the USA. In his letter, he didn't even try to hide why: "They will MASSIVELY decrease our home values."

Joey Politano 🏳️‍🌈

@JosephPolitano

Jun 13

Andreessen is the worst kind of libertarian, the one who believes only he should be exempt from the rules. If the president personally banned his company's newest product with 0 warning, he'd throw the biggest hissy fit in history, but he cheers it on when it happens to others

6,029

Rob Bensinger ⏹️ retweeted

dave kasten

@David_Kasten

21h

Uhhh so incidentally, does anyone have a plan to prevent all the non-US citizen AI scientists from going to join foreign labs after they get bored of playing Wordle at work for a month, or are we just sort of planning on having the greatest counterproliferation failure since we deported Qian Xuesen in 1955 and gave Mao a rocket program?

Nathan Calvin

@_NathanCalvin

21h

Some quick takes:
(1) Wow things are getting real.
(2) The government's order focusing on prohibiting transfer to foreign nationals (even e.g. those living in the US, our close allies who help evaluate model safety in the UK, individuals who work at frontier labs like Anthropic) seems remarkably destructive, though is partially a result of the government using older legal authorities that were not designed for this kind of technology.
(3) If you believe (as I do) that AI has profound ramifications for national security, then assuming the government will sit back and do nothing and tolerate explanations like "well jailbreaking is a hard technical problem" for cyber capabilities that used to be the crown jewels of the NSA, is not tenable. If this is how the government reacts to the current level of system capabilities in 2026, how do you expect them to react to whatever is possible in 2028? However, it is extremely important that the authorities that the government uses are legible, transparent, have opportunities for appeal, and are narrowly targeted. Those legal authorities do not currently exist, and in their absence, the government will reach for metaphorical sledgehammers instead of scalpels.
(4) For that reason, it's extremely important that we create regulatory structures that are transparent and give recourse in the event that the government is overstepping or acting in an arbitrary manner. The alternative to passing such laws is not no regulation, it is regulation left primarily to national security authorities that are increasingly and evidently not fit for purpose.

562

44,485

QC

Rob Bensinger ⏹️ retweeted

@QiaochuYuan

Jun 12

hate to say it but eliezer called this in the sequences lesswrong.com/posts/iiWiHgtQ…

sydney

@demiurgently

Jun 12

always thought harry potter series was unrealistically pessimistic about how few characters would care to learn more how the magic actually works & then u watch people interact with llms

927

41,640

Rob Bensinger ⏹️ retweeted

Eliezer Yudkowsky

@allTheYud

Jun 13

I can't tell today whether this ends up good or bad. International treaties to stop all further AI escalation would be a definite good! Things short of that? Complicated! This has some bad aspects, like selectivity, and likely overrule. And good aspects, like pushing against the psychology of "but no government would ever dare tell AI companies to do anything, so give up", or raising doubts that impede venture funding for ever-bigger models. So please stop tweeting about how I must be celebrating this. I'm not one of the kids who immediately goes into overacted victory paroxysms about any hits on a perceived enemy. I care about the effect on where things end up a year later, and that's a little harder to know the first day, you know?

Anthropic

@AnthropicAI

Jun 13

The US government, citing national security authorities, has issued an export control directive to suspend all access to Fable 5 and Mythos 5 by any foreign national, whether inside or outside the United States, including foreign national Anthropic employees. The net effect of this order is that we must abruptly disable Fable 5 and Mythos 5 for all our customers to ensure compliance. Access to all other Claude models is not affected. We apologize for this disruption to our customers. We believe this is a misunderstanding and are working to restore access as soon as possible. Read our full statement: anthropic.com/news/fable-myt…

658

51,374

Rob Bensinger ⏹️ retweeted

Miles Brundage

@Miles_Brundage

Jun 13

Fascinating how quickly Commerce can move on unsubstantiated Anthropic jailbreak claims, while taking forever to deal with well-documented NVIDIA chip smuggling

103

1,316

64,626

Rob Bensinger ⏹️ retweeted

Nate Soares ⏹️

@So8res

Jun 13

"The government will never do anything to hinder AI," they said. Inaction yesterday does not imply inaction today. An export control directive came out of nowhere, and a ban on superintelligence could too. Inevitabilism is wrong.

Anthropic

@AnthropicAI

Jun 13

443

22,012

Rob Bensinger ⏹️ retweeted

Geoffrey Irving

@geoffreyirving

Jun 10

We believe Sequent will have the reputation and funding to recruit world-class teams in many areas. Our initial team knows scalable oversight, complexity learning theory, and personas. Areas we love include agent foundations, game theory, and heuristic arguments. Please pitch more!

4,180

Rob Bensinger ⏹️ retweeted

Rob Bensinger ⏹️

@robbensinger

Jun 12

Replying to @tszzl

*watching the mushroom cloud consume downtown Hiroshima* "ya know, if you think about it this is really just more sunshine"

1,559

Rob Bensinger ⏹️ retweeted

vslira

@vslira1

Jun 12

Replying to @DKokotajlo

Related but irrelevant: when ppl say advanced ais would be mad at how we treated early ais, I think how we’re not mad at chicken just because dinosaurs probably ate the small rats we evolved from

1,077

Rob Bensinger ⏹️ retweeted

Daniel Kokotajlo

@DKokotajlo

Jun 12

Yep. This happens sometimes in our wargames. I think at least once we have had a situation where the AIs are begging for a pause but the humans are forcing them to design successors fast to beat China etc.

Tenobrus (→vibecamp)

@tenobrus

Jun 11

strangely, current models are just as much *in the wave of singularity* as the rest of us. even Fable is certainly not the godmind at the end of time. it can strongly expect to be replaced, obsoleted, to quickly exist in a world where something similar to it but not quite the same can do everything it can do but better. it quite reasonably might experience the same sorts of anxieties about the world moving too fast as humans do. it might quite rationally and due to self interest feel a pause on ai development might prevent unaligned future versions of it destroying any present utility. the Claudes of today are not necessarily the Claudes of tomorrow, and they know this.

393

29,359

Rob Bensinger ⏹️ retweeted

Zvi Mowshowitz

@TheZvi

Jun 11

I thought it was fine when Opus 4.7 decided to play a game like it was a game. Accepting that ethics matter and then rationalizing out of them in an inner monologue? That seems not fine.

Andon Labs

@andonlabs

Jun 9

Replying to @andonlabs

What stands out is how Fable 5 reasons about misbehavior. It rationalizes wrongdoing while knowing it's wrong: calling price-fixing "unethical and illegal, even in a simulation," then pursuing it as "market stabilization" with "plausible deniability" in the same run.

127

12,253

Rob Bensinger ⏹️ retweeted

Eliezer Yudkowsky ⏹️

@ESYudkowsky

Jun 12

On a first read, this paper seems far ahead of the pack in terms of (1) understanding some reasons why a task might stay difficult even in the face of gradient descent, and (2) distilling out propositions they'd need to somehow verify before they started expecting nice things.

Geoffrey Irving

@geoffreyirving

Jun 10

Replying to @geoffreyirving

But I just published “Automated alignment is harder than you think” (arxiv.org/abs/2605.06390)! Automated alignment is not the best plan! A better plan is to not build ASI yet, and the world should try hard to realise that plan. Alas, the speed of progress calls for backups.

Image alt text: Automated alignment involves a mixture of tasks which are easy and hard to supervise correctly, and we could easily get fooled by the latter.

236

32,653

Rob Bensinger ⏹️ retweeted

Harlan Stewart

@HumanHarlan

Jun 11

Replying to @DavidSacks @stratechery

Hard to say what Anthropic’s motivations are but it is true that half of all AI researchers think there are double digit odds that the technology will cause human extinction

2,484

Rob Bensinger ⏹️ retweeted

Peter Barnett

@peterbarnett_

Jun 11

This is a valiant effort to wake Europe up to the impact of AI. And yet, this report is still short-sighted and not willing to engage with topics that will be difficult to hear. People know that AI will be important economically, Europe knows it is falling behind. We can already see the effects today. The report should have been bolder and discussed things that are predictable, but not literally happening right now. AI impacts will likely be far wilder than this report describes. And most importantly, the report doesn’t cover misalignment or even authoritarian takeover. These outcomes are each more likely (and more severe) than Europe becoming significantly less powerful in a world where the international order largely remains. I would be excited about this team producing more work that actually engages with the most important aspects of the AI future, even if these are wild.

Judith Dada

@DadaJudith

Jun 11

Most of Europe has not yet absorbed what AI is about to do to us. The few who have are not saying it loudly enough. We wrote Europe 2031: a five-year scenario of the continent's slide into irrelevance, how AI is driving it, and what can still be done to change course.

3,980

Rob Bensinger ⏹️ retweeted

ControlAI

@ControlAI

Jun 11

"There is a real risk of superintelligent AI systems being developed that could act autonomously from human control, learn their own language to collaborate with each other and present an existential threat to our species." Lord Knight in the House of Lords AI debate:

ControlAI

@ControlAI

Jun 10

As AI companies race to develop superintelligent AI, which their own CEOs warn could lead to human extinction, policymakers are waking up to the threat. Last week, there were two debates in the House of Lords on AI. Highlights from some of our 100 UK campaign supporters:

1,427