AS

Tweets

Pinned Tweet

AS @agstrait

12 Nov 2024

Ayo - I'm finally off here. Keeping the account to retain my bookmarks, but no longer active. Join me here. bsky.app/profile/agstrait.bs…

AS retweeted

AI Security Institute

@AISecurityInst

May 13

Our evaluations show that frontier AI's cyber capabilities are advancing quickly. The length of cyber tasks frontier models can complete has been doubling every few months, and this rate has become faster over time, with recent models exceeding our previous trends. 🧵

AS retweeted

AI Security Institute

@AISecurityInst

Apr 24

We know AI systems occasionally act against their operators’ intentions – but what in their environment causes them to do so? In a new paper, we make progress on this question 🧵

AS retweeted

Nate

@NateBurnikell

Apr 23

We (@AISecurityInst) tested GPT-5.5 for its cyber capabilities and safeguards. It's the strongest performing model we've tested on our narrow cyber tasks and solved one of our cyber ranges in 1/10 attempts. We found a universal jailbreak with 6 hours of expert red teaming.

AS retweeted

Jared Moore @jaredlcm

Mar 18

Disturbing anecdotal reports of "AI psychosis" and negative psychological effects have been emerging in the news. But what actually happens during these lengthy delusional "spirals"? In our preprint, we analyze chat logs from 19 users who experienced severe psychological harm🧵👇

AS retweeted

Cas (Stephen Casper)

@StephenLCasper

4 Dec 2025

Did you know that one base model is responsible for 94% of model-tagged NSFW AI videos on CivitAI? This new paper studies how a small number of models power the non-consensual AI video deepfake ecosystem and why their developers could have predicted and mitigated this.

AS @agstrait

3 Dec 2025

🛠️ This is a technical role for an applied ML or security engineer. The work we anticipate could include building scalable ways to detect malicious LoRAs, exploring data filtering and other methods for reducing malicious fine-tuning, and other technical methods.

AS @agstrait

3 Dec 2025

🤝 You’ll work with 2 other researchers and in collaboration with other gov departments. The first project is to create a problem book of methods to reduce these risks (building on papers.ssrn.com/sol3/papers.…). Crucially, you are not expected to view sensitive material directly.

AS retweeted

summerfieldlab @summerfieldlab.bsky.social @summerfieldlab

9 Jul 2025

In a new paper, we examine recent claims that AI systems have been observed ‘scheming’, or making strategic attempts to mislead humans. We argue that to test these claims properly, more rigorous methods are needed.

AS retweeted

Saffron Huang

@saffronhuang

10 Jun 2025

Newest @reboot_hq 🎙️ post: @jessicadai_ and I discuss forecasting, and how people present unhelpful narratives about the future (mostly by picking on AI 2027, sorry guys). Why we should view the future as constructed, not predicted.

AS retweeted

Josh Wolfe

@wolfejosh

7 Jun 2025

Apple just GaryMarcus'd LLM reasoning ability

AS retweeted

AI Security Institute

@AISecurityInst

12 May 2025

Advanced AI systems require complex evaluations to measure abilities, but conventional analysis techniques often fall short. Introducing HiBayES: a flexible, robust statistical modelling framework that accounts for the nuances & hierarchical structure of advanced evaluations.

ALT An example of hierarchically nested evaluation data.

AS retweeted

Sayash Kapoor @sayashk

15 Apr 2025

How will AI impact the economy? Can we defend against misuse? What policies would mitigate the risks of AI? Thrilled to share that @random_walker and I are writing another book to tackle these questions! Today, we release a paper laying out our argument: AI as Normal Technology.

Image of the first page of the paper, available at https://kfai-documents.s3.amazonaws.com/documents/c3cac5a2a7/AI-as-Normal-Technology---Narayanan---Kapoor.pdf

AS @agstrait

7 Apr 2025

I too find this really weird, mainly in that it shows the frontier of AI research is at risk of moving further away from producing useful, safe, reliable products. These seem like features, not bugs.

Billy Perrigo @billyperrigo

7 Apr 2025

nice analogy from @jackclarkSF newsletter this week

AS retweeted

AI Security Institute

@AISecurityInst

3 Apr 2025

We've funded 20 new research projects to enhance AI security in critical infrastructure ⚡ Our Systemic AI Safety Grants Programme, announced at the Seoul AI Summit, has awarded up to £200,000 seed grants to projects tackling AI risks 🧵👇

AS @agstrait

25 Mar 2025

A great thread re: problematic extrapolations on claims about AI being superhuman at tasks.
1. Coding =/= all computer-related tasks, let alone all tasks
2. Generating code to complete a task =/= the most efficient, secure way to complete a task.

Natália 🔍

@natalia__coelho

24 Mar 2025

This tweet is misleading. State-of-the-art AI models struggle at some tasks that take humans <10 minutes, while *simultaneously* excelling at some tasks that would take humans several hours or days to solve.

AS @agstrait

20 Mar 2025

I'll do a longer post about the new role, but in short, we're building a research team, funding programme, and partnerships to tackle crucial questions about advanced AI's societal impacts. We'll track how AI is being used across critical sectors, and study societal-level risks.

AS @agstrait

20 Mar 2025

These include undesirable automation, over-reliance on AI systems, mental health impacts, mass generation of unreliable content, power concentration, and social destabilisation...and so much more.

AS @agstrait

20 Mar 2025

I'm so proud of Ada's accomplishments - from pioneering work on COVID-tech, deep dives into AI auditing and research ethics, supporting emerging AI policy and regulation, exploring risks of AI & genomic systems, foundation models and personal AI assistants...it's a very long list.

AS @agstrait

20 Mar 2025

But I'm even more proud of the people we've worked with, the community we've fostered, and the culture we've built. I remember the first meeting I had with @carlykind_ about her vision for the org - I was all in. Looking back, we've accomplished so much more than we'd thought.