phd candidate @oiioxford @uniofoxford | research scientist @AISecurityInst | AI, social data science, persuasion with language models

Joined September 2018
Pinned Tweet
🚨 New today in @ScienceMagazine !!🚨 We’re publishing the results of the largest AI persuasion experiments to date: 76k participants, 19 LLMs, 707 political issues. We examine “levers” of AI persuasion: model scale, post-training, prompting, personalization, & more… 🧵:
Kobi Hackenburg retweeted
Are AI Chatbots Harmful or Beneficial? It Depends What Would’ve Happened Otherwise. When thinking about whether using AI chatbots is harmful or beneficial, we should always be asking: “Compared to what?” Link below 👇
Very excited to see this amazing work by @lujainmibrahim out today in @Nature :)
🚨Very excited to see our work on warmth & sycophancy in LLMs out in @Nature today!🚨 We study what happens when LLMs are fine-tuned to be warmer, and find that warmth and sycophancy can be linked, with warm models showing higher errors on a range of benchmarks (🔗s below)
Kobi Hackenburg retweeted
New paper w/ @AISecurityInst: AI writing assistance distorts how others perceive AI users and their opinions. Millions of people now use AI to help them write and communicate. In three large experiments (14k participants, 3m human ratings) we show that AI writing assistance systematically distorts writer personas – their perceived beliefs, personality, and identity. These distortions are consistent across AI models and persist even under realistic conditions of human oversight. 🧵
Very excited to see this out! We had a hunch that pervasive use of AI writing assistance for political opinion expression must be ~doing something~ to how those opinions are perceived in aggregate. In large RCTs, we use a nifty within-subjects design to show exactly what :)
By distortion, we mean the difference in how third-party readers (blind to authorship) perceive a writer's own text vs. their AI-assisted text. Our design mimics the real world, where users can freely edit AI outputs and are free to *not use* AI-assisted outputs they don't like.
In other words, we measure distortions between purely human-authored writing and *human-edited*, AI-assisted writing *which humans preferred to their own original writing*. Has been great to work on this with @paul_rottger @hannahrosekirk @summerfieldlab. Feedback very welcome!
🚨 New today in @ScienceMagazine !!🚨 We’re publishing the results of the largest AI persuasion experiments to date: 76k participants, 19 LLMs, 707 political issues. We examine “levers” of AI persuasion: model scale, post-training, prompting, personalization, & more… 🧵:
I’m also very grateful to many more people @AISecurityInst for making this work possible! There will be lots more where this came from over the next few months 💪