Cas (Stephen Casper)

Tweets

Pinned Tweet

Cas (Stephen Casper)

@StephenLCasper

Mar 24

I'm extremely excited to be on the organizing committee this year for my favorite workshop ever! Submissions (up to 8 pages) are due April 24! Co-submission with ICML and NeurIPS is encouraged! taigr-workshop.com/

TAIGR @ ICML 2026 — Workshop on Technical AI Governance Research

Second Workshop on Technical AI Governance Research at ICML 2026. Bridging ML researchers and policymakers in Seoul, South Korea.

taigr-workshop.com

Technical AI Governance @ ICML 2026 @taig_icml

Mar 24

🚨📢Announcing the second Technical AI Governance Research (TAIGR) workshop @icmlconf. Accepting submissions (up to 8 pages) until April 24 on technical topics in AI governance! #icml2026

Cas (Stephen Casper) retweeted

Brad Carson

@bradrcarson

Jun 13

We need better expertise. That's why a powered-up CAISI (or even a National AI Laboratory, as called for by @scientistsorg) would have saved a lot of pain.

Andrew Curran

@AndrewCurran_

Jun 13

Replying to @peonyKingOF

It was reported by Amazon.

Cas (Stephen Casper)

@StephenLCasper

Jun 12

Now that I have your attention by posting this spinning point cloud GIF, I'd like to propose a litmus test for AI mechanistic interpretability research. You might call it the "interp hammer" test. If the things achieved by a mechanistic interpretability research project had been accomplished by an undergraduate who was (1) just prompting a model or (2) doing exploratory data analysis, would it impress you? If not, the project might not have accomplished much.

Cas (Stephen Casper)

@StephenLCasper

Jun 11

Glad to join Doom Debates with Liron! And yes -- if I could press a button and stop research on "superalignment", "scalable alignment", and "scalable oversight", I would. (I might even do it for mechinterp too.) youtube.com/watch?v=0XVmtazg…

This Harvard Professor Says AI Alignment Will BACKFIRE - Dr. Stephen...

Stephen Casper is an incoming professor of public policy at the Har...

youtube.com

Cas (Stephen Casper) retweeted

Cas (Stephen Casper)

@StephenLCasper

Jun 10

Replying to @JacquesThibs @liron

My reasoning behind wanting to stop superalignment research:
- Principal-agent alignment is neither necessary nor sufficient for safety and ecosystemic health. I think the vast majority of our problems will come from scenarios involving systemic harms, negligence, or malice instead of situations in which someone benevolent was exercising what would be considered “best technical practices” at the time but non-foreseeably loses control of their AI.
- I think that current alignment, control, and containment strategies are actually pretty good and there is a big incentive for ML people to underemphasize the effectiveness of these tools to justify their existence. If you’re willing to pay a safety tax and are not “move fast and break things,” existing best practices can make systems pretty robustly safe.
- Superalignment is pretty safety washed and is touted by big companies to justify their ambitions to build the superintelligence.
- Solving superalignment would be a huge boon and would consolidate enormous power in big tech. This is itself a risk factor.
- Jevons paradox: it’s easy to see how lowering the perceived risk of building superintelligence would make more companies choose to try.

Cas (Stephen Casper)

@StephenLCasper

Jun 10

Wonderful talking with crux artist and all-around smart guy, @liron. Thanks for having me! Check out the debate for discussions on doom, disinformation, disempowerment, deepfakes, and demagogues!

Liron Shapira

@liron

Jun 10

NEW: Harvard Professor @StephenLCasper (a.k.a. Cas), a computer scientist working on AI safeguards and governance research, worries that alignment efforts will backfire! 🤦‍♂️ Today on Doom Debates, we agree that slowing down AI development would make the world safer… but you know Cas's position is unique when he says he’d prefer to have *less* research on AI alignment! We cover:
⬜ Cas's research focus
⬜ What's Your P(Doom)?™
⬜ How high is the intelligence ceiling?
⬜ Idiocracy-inspired gradual disempowerment
⬜ Poor governance led to sycophancy, MechaHitler, nudification
⬜ Case study: Safeguards on DALL-E 2 vs. Stable Diffusion
⬜ The #PauseAI treaty option
⬜ Why Cas opposes alignment research
This episode is a substantive AI doom debate with a brilliant guest. Highly recommended! 👇

Cas (Stephen Casper)

@StephenLCasper

Jun 10

Here's my PhD thesis defense from 5 weeks ago. This link exists, so I thought I might as well share. drive.google.com/file/d/1Zs9…

Cas_thesis_defense.mp4

drive.google.com

Cas (Stephen Casper)

@StephenLCasper

Jun 9

There are really interesting academic questions emerging around AI and epistemic risks. I only fear that, by the time we reach consensus, we will be too dumb to understand it.

Kellin Pelrine @KellinPelrine

Jun 9

Humanity's ability to know, reason, judge, and act well is the foundation of science, democracy, crisis response, & management of AI itself. AI poses serious risks to that foundation. New paper on epistemic risks by 30 experts calls for attention to this. Link in thread.

Cas (Stephen Casper)

@StephenLCasper

Jun 9

Anthropic and OpenAI are publicly pointing out how having the option to slow down AI would offer a potentially critical form of optionality in the future. The correct response for any policymaker should be "Damn, this is serious. How can I help build that capacity?"

Cas (Stephen Casper) retweeted

Ron DeSantis

@RonDeSantis

Jun 8

I doubt Democrats will produce good policy re: AI, but Republicans have allowed them to capitalize on public concern about the power and influence of Big Tech by failing to adopt a sensible framework that will protect the public from the very real downsides of the technology. A policy that says transhumanists in Silicon Valley should be able to do what they want is not an acceptable approach, nor is it a politically viable approach.

Cas (Stephen Casper)

@StephenLCasper

Jun 8

According to the MIT Libraries' database of theses (dating back to the 1800s), my thesis was only the 2nd in the institute's history to contain the word "shit."

Cas (Stephen Casper) retweeted

Rakshit Trivedi

@rstriv

Jun 4

As increasingly capable AI systems are deployed, humans, institutions, and other AI systems adapt in response — i.e. the world pushes back. So is capability still the central safety challenge for AI? We think not. We believe the harder challenge is coexistence. The current AI research paradigm treats the world as a stationary source of feedback, what we refer to as the solipsistic approach to AI design. This raises serious risks for coexistence. In our new #ICML2026 paper, we argue that superintelligence — an extremely capable task solver, built through such a solipsistic approach — is unlikely to be cooperative. 🧵

Cas (Stephen Casper) retweeted

Mitt Romney

@MittRomney

Jun 5

Our highest and most urgent national priority should be AI safeguards. The risks of AI weapons, pathogens, mass unemployment, surveillance, and even extinction must not continue to be largely ignored.

WSJ Tech

@WSJTech

Jun 4

Anthropic Urges Global Pause in AI Development, Flags ‘Self-Improvement’ Risk on.wsj.com/4o5IBpe

Cas (Stephen Casper)

@StephenLCasper

Jun 5

Sometimes I run into old papers or books that I can't believe weren't written about AI today.

Cas (Stephen Casper)

@StephenLCasper

Jun 4

Having worked in the space for a few years, I can definitively say that the most troublesome thing about working on open model safety has been the benchmarking. Glad TamperBench can help.

FAR.AI

@farairesearch

Jun 4

Open-weight LLMs ship with safety training that can be stripped in a few hundred fine-tuning steps. Can current defenses stop this? We built and open-sourced TamperBench, the first unified framework for evaluating tamper resistance, and the answer is mostly no. 1/7

Cas (Stephen Casper)

@StephenLCasper

Jun 3

Cool project! Should experts expect to be well-calibrated? No. But that's not the point. Tomorrow, if climate scientists said that they expect a 10% chance of catastrophic warming or military analysts said that there was a 10% chance of a nuclear bomb dropping this year, the right reaction is to worry, not be a pedant.

Dr. Peter Slattery @PeterSlattery1

Jun 3

📢 New paper: Prioritization of Risks from Artificial Intelligence: A Delphi Study of 272 International Experts
AI creates many risks, from discrimination, privacy loss, and fraud to more emerging concerns such as overreliance, dangerous capabilities being misused in weapons or cyberattacks, and AI systems pursuing unintended goals. But which risks are most severe? Who is most vulnerable? And who is most responsible for addressing them? To answer these questions, we conducted a three-round expert consultation with 272 AI experts.
💡 Four insights from our findings:
1️⃣ If things continue as they are over the next 5 years, experts assigned ≥10% probability of catastrophic outcomes (e.g., >1 million deaths or >$100 billion in losses) to 18 of 24 risks. Top concerns: cyberattacks and weapons, dangerous AI capabilities, competitive dynamics, power centralization, and disinformation and influence at scale.
2️⃣ Even assuming pragmatic mitigations, 5 risks remained above the 10% catastrophic threshold: dangerous AI capabilities, cyberattacks and weapons, environmental harm, inequality, and power centralization.
3️⃣ Vulnerability is broadly distributed, but responsibility is concentrated. Experts assigned the highest vulnerability to AI users and the general public, while assigning primary responsibility for mitigation to frontier AI developers, governments, regulators, and standards bodies.
4️⃣ Information, finance, and national security were rated the sectors most vulnerable to AI risks.
🔗 How can you engage? See our (fancy) new webpage for our interactive summaries of the findings and preprint, and please share with anyone working on AI risk, governance, or policy. airisk.mit.edu/priorities
This research is part of the MIT AI Risk Initiative (@MITAIRisk), which aims to help society understand, prioritize, and manage risks from AI. The initiative includes the MIT AI Risk Repository, a living database of more than 1,700 AI risks, the AI Incident Tracker, a collaboration with the Responsible AI Collaborative, which connects risks to over 1,400 incidents, and the MIT AI Governance Map, which analyzes risk coverage across more than 1,000 laws, standards, policies, and other governance documents curated by the Center for Security and Emerging Technology (CSET).
#AI #AIrisk #AISafety #AIGovernance #ResponsibleAI #RiskManagement

Cas (Stephen Casper)

@StephenLCasper

Jun 2

Dean Weinstein's leadership on AI and belief that it is today's most pressing governance challenge is one of the reasons why I am glad to join HKS and why I think it will be a unique source of academic leadership in AI governance. hks.harvard.edu/faculty-rese…

Dean Jeremy Weinstein encourages the Class of 2026 to revitalize the technology of democracy in an...

“This is a job for Kennedy School graduates”: At the HKS Diploma Ceremony, Weinstein spoke about the deeply human work of solving for the public good in a world transformed by AI.

hks.harvard.edu

Cas (Stephen Casper) retweeted

Alex Bores

@AlexBores

Jun 2

.@berniesanders is right: the future of humanity belongs in the hands of our workers, not the handful of AI oligarchs pursuing unlimited wealth and power at the expense of our jobs, our kids, our economy, our climate, and our democracy. Here in NY, I delivered the strongest AI safety bill in the country. Now, I'm running for Congress to put Americans ahead of AI, and I look forward to partnering with Senator Sanders to get it done.

Sen. Bernie Sanders

@SenSanders

Jun 1

AI is built on humanity’s collective knowledge. The wealth it generates must benefit humanity — not just Elon Musk, Sam Altman and other AI oligarchs. That’s why I’ll be introducing the American AI Sovereign Wealth Fund Act — to give the public a direct ownership stake.

Cas (Stephen Casper) retweeted

Sen. Bernie Sanders

@SenSanders

Jun 1

I'll be introducing the American AI Sovereign Wealth Fund Act, a bill giving the public a direct stake to determine AI's future. When a public resource generates wealth, the public should share in that wealth. x.com/i/broadcasts/1qxoNNPjL…

Sen. Bernie Sanders

LIVE: Who will own and control the future of AI?

Cas (Stephen Casper)

@StephenLCasper

Jun 1

Just finished my PhD at @MITCSAIL. In July, I'll start as an assistant professor at the @Harvard @Kennedy_School. I have lots to learn and lots to do. With others (some TBA 👀) at HKS, I'm looking forward to helping academia offer guidance for governing the next chapters of AI.

Cas (Stephen Casper) retweeted

Max Lamparth @MLamparth

May 29

New paper: We identify a new class of reward hacking caused by mitigations, which we call reward bias substitution. We prove no standard benchmark detects it, even with oracle access to the true reward. We find it active in GRPO, in SOTA reward models, and published methods.
