Patrick Butlin

Tweets

Patrick Butlin @patrickbutlin

Jun 3

MATS with @RosieCampbell will also be awesome!

Rosie Campbell

@RosieCampbell

Jun 2

I am going to be mentoring for a new MATS track focused on founders and amplifiers! Many fellowships focus on research, but there's so much to be done beyond that. Come found orgs, build infra, run events, and help us scale up the field of AI welfare. Apply by June 7 matsprogram.org/apply

741

Patrick Butlin @patrickbutlin

Jun 3

Apply for MATS with @dillonplunkett - it'll be awesome!

Dillon Plunkett

@dillonplunkett

Jun 3

I’m mentoring Autumn 2026 @MATSprogram Fellows interested in doing AI welfare research. The application deadline is this Sunday (6/7). More info in this thread:

1,545

Patrick Butlin retweeted

Dillon Plunkett

@dillonplunkett

Jun 1

We're hiring Research Scientists to join my team at @eleosai! We do foundational and applied ML research on the moral status and potential well-being of AI systems. This is urgent, important work, and Eleos is an extraordinarily fun and exciting place to do it. Details below.

[Image: Screenshot of Eleos AI Research Scientist job description]

239

22,147

Patrick Butlin retweeted

Zach Freitas-Groff 🔸@zdgroff

Jun 1

💡 Another round of Longview Philanthropy’s digital minds request for proposals is open for applications. A year ago I would have called this niche. Now AI labs publish model welfare research, public discussion of digital sentience is growing, and the field is expanding. 📈

8,048

Patrick Butlin @patrickbutlin

May 18

Another exciting @MATSprogram paper, this time from the brilliant @gilg_oscar. We found a direction in LLMs that apparently performs a persona-relative evaluative function in some very different contexts.

Oscar Gilg @gilg_oscar

May 18

First preprint! Working with @patrickbutlin during @MATSprogram. LLM Assistant personas like being helpful, evil personas like being harmful. We found that a single direction represents helping as good under the Assistant, and ‘harm’ as good under evil.

2,684

Patrick Butlin @patrickbutlin

May 18

Our research is complementary with Anthropic's concurrent work on emotion concepts (transformer-circuits.pub/202…); we used a different method to extract evaluative representations and studied how they interact with varying personas.

120

Patrick Butlin @patrickbutlin

May 18

Link to the paper: arxiv.org/abs/2605.13339

Probing Persona-Dependent Preferences in Language Models

Large language models (LLMs) can be said to have preferences: they reliably pick certain tasks and outputs over others, and preferences shaped by post-training and system prompts appear to shape...

arxiv.org

Patrick Butlin @patrickbutlin

Apr 20

I'm proud to announce this new paper with my fantastic @MATSprogram fellow @BeckmannPierre, on personas and LLM individuation.

Pierre Beckmann @BeckmannPierre

Apr 20

New paper with @PatrickButlin, from my time at @MATSprogram. We propose two new candidates for LLM individuation: the (virtual) instance-persona view and the model-persona view. 🧵

5,956

Patrick Butlin @patrickbutlin

Apr 20

Many thanks to @MATSprogram for making our collaboration possible - and look out for another paper, with the equally excellent @gilg_oscar, coming soon!

266

Patrick Butlin @patrickbutlin

Apr 20

link here: philpapers.org/archive/BECWI…

179

Patrick Butlin @patrickbutlin

Mar 3

Some recent papers:

702

Patrick Butlin @patrickbutlin

Mar 3

1. 'Desire in AI': philarchive.org/rec/BUTDIA
2. 'Are any machines conscious today?': philarchive.org/rec/BUTAAM-2
3. 'Testing for consciousness in current AI': philarchive.org/rec/BUTTFC
4. 'Consciousness and AI' encyclopaedia entry: oecs.mit.edu/pub/zf1nbs6d/

Patrick Butlin, Desire in AI - PhilArchive

philarchive.org

368

Patrick Butlin @patrickbutlin

Mar 3

5. 'Higher-order representation in AI' (unfortunately slightly dated already): philosophymindscience.org/in…

232

Patrick Butlin @patrickbutlin

11 Nov 2025

New paper on AI consciousness! Here we present the theory-derived indicator method for assessing AI systems for consciousness. Link below.

330

28,943

Patrick Butlin @patrickbutlin

11 Nov 2025

The new paper is here: sciencedirect.com/science/ar…

1,530

Patrick Butlin @patrickbutlin

11 Nov 2025

Many thanks to the editor and reviewers for @TrendsCognSci and especially to my co-authors, including @rgblong @Yoshua_Bengio @birchlse @davidchalmers42 @ConstantAxel @georgejwdeane @EricElmoznino @kanair @MatthiasMichel_ @Liad_Mudrik @meganakpeters @eschwitz and others!

1,440

Patrick Butlin retweeted

Eleos AI Research @eleosai

4 Sep 2025

We're thrilled to announce the first Eleos Conference on AI Consciousness and Welfare. Join us Nov 21-23, 2025 in Berkeley, CA for discussions on AI welfare with leading researchers from @nyuniversity, @Google, @AnthropicAI, & more.

110

27,412