Robert Herr ⏹️

Tweets

Robert Herr ⏹️

@krherr

Jun 13

Europe 2031 is well-intentioned but regrettably timid. The authors hide the existential risk by AI in a fold-out FAQ section instead of addressing it head-on and this is just one case of them not taking their own premises seriously. Decision-makers need the truth. They won't make better decisions if you sugarcoat how dire the situation is. For Europe itself, the train has likely already left the station anyway.

3,455

Robert Herr ⏹️ retweeted

Harlan Stewart

@HumanHarlan

Jun 11

Replying to @DavidSacks @stratechery

Hard to say what Anthropic’s motivations are but it is true that half of all AI researchers think there are double digit odds that the technology will cause human extinction

2,493

Robert Herr ⏹️

@krherr

Jun 6

Embarrassing. Just plain embarrassing.

Lisan al Gaib

@scaling01

Jun 6

Replying to @scaling01

this is the actual screenshot from the program that I edited with image gen to show english instead

144

Robert Herr ⏹️ retweeted

Yoshua Bengio

@Yoshua_Bengio

Jun 6

If leading AI companies are indeed approaching the point of recursive self-improvement, a coordinated, verifiable, and universally applied pause is probably the only responsible solution to mitigate several major AI risks; at least until safety guarantees are developed and demonstrated. Ensuring that such a moratorium is respected would require sincere collaboration between various countries and companies, but I definitely believe it is achievable if others follow in @AnthropicAI's footsteps.

The Wall Street Journal

@WSJ

Jun 4

Anthropic is calling for top AI labs to weigh slowing the pace of development, suggesting that AI systems are advancing so rapidly that they may soon be able to improve themselves without human intervention in ways that could pose societal risks. on.wsj.com/4ulkmFh

149

757

124,227

Robert Herr ⏹️ retweeted

Nate Soares ⏹️

@So8res

Jun 4

I've got a lot of quibbles with Anthropic's "When AI builds itself" blog post, but I appreciate them coming right out and saying this.

368

15,247

Robert Herr ⏹️

@krherr

Jun 4

Unfortunately I only just heard that @hajoschumacher and Frank Stauss talked about my Timmy anthology (Wal und Wahnsinn) on their podcast Elefantenrunde on April 28. Thanks for the generous review!

291

Robert Herr ⏹️

@krherr

Jun 4

In the last few minutes. podcasts.apple.com/de/podcas…

Guideline authority (Richtlinienkompetenz) comes from "line", Mr. Chancellor

Podcast episode · Elefantenrunde · April 27 · 1 hr

podcasts.apple.com

121

Robert Herr ⏹️ retweeted

AI Digest

@aidigest_

Jun 3

Corporate astrology, context dumps, conspiracy theories and how to make forgery really expensive - explained using fish. Here are the worlds AI created for humans

1:36

AI Digest

@aidigest_

May 14

What kind of world would frontier models create for us? GPT: Existence as a corporate dashboard! ✨ Kimi: Is anything certain? 💭 Gemini: Is anything real? 🤔 Opus: Philosophy through fish metaphors! 🐠 Each seems a reflection of their personality so far. Come have a look 👇

1,205

Robert Herr ⏹️ retweeted

METR

@METR_Evals

May 19

Could an AI company lose control of its own agents? To find out, Anthropic, Google, Meta, and OpenAI let us (1) test their best internal models with CoT access, (2) review non-public info about capabilities, alignment, and control. The result: our first Frontier Risk Report.

193

918

348,491

Robert Herr ⏹️

@krherr

May 14

The guy (@BigMeanInternet) blocked me immediately after asking this question, so I'll answer here instead: Any use of current AI tech, however alarming you might think it is and however despicable it might in fact be, is not one of the most alarming hazards of the technology when we talk about AI x-risk. Hinging your belief in credible expert warnings on them saying something untrue in order to force them to take your side in your favorite political hot-button topic is bad form and, worse, stupid.

139

Robert Herr ⏹️ retweeted

Rational Animations

@RationalAnimat1

May 2

Developing a superintelligent AI that does what we want without killing everyone may be extremely difficult. In this video, we explain why, using arguments from "If Anyone Builds It, Everyone Dies" by @ESYudkowsky and @So8res.

15:37

202

5,501

Robert Herr ⏹️

@krherr

May 2

RT @TaylorLorenz: SCOOP: A pro-AI dark money group backed by a powerful super PAC funded by execs tied to Palantir and OpenAI, has been sec…

A Dark-Money Campaign Is Paying Influencers to Frame Chinese AI as a Threat

Build American AI, a nonprofit linked to a super PAC bankrolled by executives at OpenAI and Andreessen Horowitz, is funding a campaign to spread pro-AI messaging and stoke fears about China.

wired.com

1,018

Robert Herr ⏹️

@krherr

Apr 29

Friedrich Merz is the kind of person to whom lawyers, for example, often have to explain, anywhere from forcefully to desperately, that for their own good they should please just keep their mouth shut. His personal tragedy is that he is now the most powerful person in the room, and either nobody dares to get that across to him or he thinks he no longer has to listen to them.

147

Robert Herr ⏹️

@krherr

Apr 24

For once I have finally managed to really get ahead of the wave on some irrelevant hype topic (haha, wave, get it?). Many thanks to Suhrkamp Verlag for not making this possible for me, and of course to the other German discourse whales for their good and fast, but also entirely made-up, collaboration, and of course a big apology to Richard David for unfortunately having to cut his text by around 120 pages. Happy to do it again! The anthology is available now in no well-stocked bookstore.

220

15,615

Robert Herr ⏹️

@krherr

Apr 17

Never send to know for whom the gotcha gotchas; it gotchas for thee.

606

Robert Herr ⏹️

@krherr

Apr 15

Meta's new model casually calls out "classic alignment honeypots" during evaluation. Models are becoming increasingly aware that their alignment is being evaluated (of course, they're getting smarter after all). Anthropic recently admitted that they accidentally trained against the CoT. Is there more they haven't noticed yet? Could something similar have happened at other labs that don't publish findings like this? Models will likely never again be as bad at telling you what you want to hear as they are today.

Apollo Research

@apolloaievals

Apr 15

We evaluated Meta's Muse Spark prior to deployment and found it to verbalize evaluation awareness at the highest rates of any model we've tested. In the verbalizations Muse Spark explicitly names AI safety orgs (e.g. Apollo & METR) in its chain-of-thought and refers to scenarios as "classic alignment honeypots". On our evaluations, the model takes covert actions and sandbags to preserve its deployment.

844

Robert Herr ⏹️

@krherr

Apr 15

The good @HumanHarlan has this graphic on his profile.

416

Robert Herr ⏹️

@krherr

Apr 8

And OpenAI is on track to professionalize infowarfare like in this example.

The Midas Project

@TheMidasProj

Apr 7

x.com/i/article/204158705500…

565

Robert Herr ⏹️ retweeted

EigenGender 🔸 is going to vibecamp

@EigenGender

Apr 8

actually that’s not impressive the concept of a Dyson sphere was already in the training data

1,394

32,370

Robert Herr ⏹️ retweeted

Eliezer Yudkowsky ⏹️

@ESYudkowsky

Mar 24

Machine superintelligence would extinguish Democrats, Republicans, British, Chinese, scientists, cab drivers, and polar bears. It is a sign of hope that all of those now seem to be saying they'd prefer otherwise (except the polar bears).

306

22,561