Tweets

Magda Dubois retweeted

Cozmin Ududec

@CUdudec

Feb 26

New from the Science of Evaluation Team at @AISafetyInst: a pipeline for rigorous transcript analysis. I think transcript analysis is still underrated, especially as model horizons are getting longer and task environments more complex.

1,297

Magda Dubois retweeted

Arvindh Arun

@arvindh__a

12 Sep 2025

Why does horizon length grow exponentially as shown in the METR plot? Our new paper investigates this by isolating the execution capabilities of LLMs. Here's why you shouldn't be fooled by slowing progress on typical short-task benchmarks... 🧵

268

56,498

Magda Dubois retweeted

Konrad Rieck 🌈 @mlsec

1 Jul 2025

We're excited to announce the Call for Papers for SaTML 2026, the premier conference on secure and trustworthy machine learning @satml_conf. We seek papers on secure, private, and fair learning algorithms and systems. 👉 satml.org/call-for-papers/ ⏰ Deadline: Sept 24

5,737

Magda Dubois retweeted

Sahar Abdelnabi 🕊

@sahar_abdelnabi

1 Jun 2025

The Hawthorne effect describes how study participants modify their behavior if they know they are being observed. In our paper 📢, we study if LLMs exhibit analogous patterns 🧠 Spoiler: they do ⚠️ 🧵1/n

125

24,739

Magda Dubois retweeted

summerfieldlab @summerfieldlab.bsky.social @summerfieldlab

9 Jul 2025

In a new paper, we examine recent claims that AI systems have been observed ‘scheming’, or making strategic attempts to mislead humans. We argue that to test these claims properly, more rigorous methods are needed.

17,201

Magda Dubois retweeted

AI Security Institute

@AISecurityInst

9 Jul 2025

Evaluating AI models is essential for improving their performance and understanding their risks. Increasingly, researchers are using “autograders” – having Large Language Models (LLMs) grade model outputs. But how do we know if these autograders are reliable? 🧵

5,378

Magda Dubois @DubMagda

13 May 2025

New paper introducing a framework to better quantify uncertainty in LLM evaluations (led by @LLuettgau🙌). A beta Python package (developed by @HarryCoppock🚀) is available if you want to try it out. ➡️Get in touch if you have any Qs/feedback! Paper: arxiv.org/abs/2505.05602

AI Security Institute

@AISecurityInst

12 May 2025

Advanced AI systems require complex evaluations to measure abilities, but conventional analysis techniques often fall short. Introducing HiBayES: a flexible, robust statistical modelling framework that accounts for the nuances & hierarchical structure of advanced evaluations.

[Image: An example of hierarchically nested evaluation data.]

149

Magda Dubois retweeted

AI Security Institute

@AISecurityInst

6 May 2025

🧵 Today we’re publishing our first Research Agenda – a detailed outline of the most urgent questions we’re working to answer as AI capabilities grow. It’s our roadmap for tackling the hardest technical challenges in AI security.

AISI Research Agenda

aisi.gov.uk

121

29,345

Magda Dubois retweeted

Lennart Luettgau @LLuettgau

23 Sep 2024

Excited to share our brand-new work shedding some light on the neural mechanisms behind one of humans' coolest cognitive feats: compositional generalization of structural knowledge! A Tweeprint-Thread 🧵 1/n

4,046

Magda Dubois retweeted

Alexandr Wang

@alexandr_wang

25 Jul 2024

1/ New paper in Nature shows model collapse as successive model generations are recursively trained on synthetic data. This is an important result. While many researchers today view synthetic data as AI's philosopher's stone, there is no free lunch. Read more 👇

661

272,240

Magda Dubois retweeted

Felix Busch @Fel_Busch

13 Aug 2024

I am excited to share that our article *Navigating the European Union Artificial Intelligence Act for Healthcare* has just been published in @npjDigitalMed🚀 #AIRegulation #DigitalHealth #EUAIAct #MedicalDevices #Innovation #npjDigitalMedicine #AIinHealthcare

https://www.nature.com/articles/s41746-024-01213-6

1,968

Magda Dubois retweeted

Matthew Nour @Matt_Nour

10 Oct 2023

Paper out in @PNASNews! A 'cognitive mapping' lens on language in psychosis, using word embedding models, computational modelling, and MEG. A hint of what's to come at @OxPsychiatry and @UCLBrainScience... With @mcneural_, @YunzheNeuro, Ray Dolan. pnas.org/doi/abs/10.1073/pna…

Trajectories through semantic spaces in schizophrenia and the relationship to ripple bursts | PNAS

Human cognition is underpinned by structured internal representations that encode relationships between entities in the world (cognitive maps). Cli...

pnas.org

122

19,968

Magda Dubois retweeted

Lennart Luettgau @LLuettgau

1 Sep 2023

Preprint alert🚨! In this new paper we study how humans decompose dynamical subprocesses and leverage the abstracted subprocesses for compositional reuse of experience in new situations. psyarxiv.com/sxn4a/ Tweeprint to follow soon!

10,546

Magda Dubois retweeted

Marcelo Mattar @marcelomattar

4 May 2023

In our lab's latest paper, we introduce a novel modeling approach using RNNs to reveal the cognitive algorithms behind animal decision-making. Check out our preprint, led by UCSD PhD student @Ji_An_Li and co-authored by Marcus Benna: biorxiv.org/content/10.1101/…

Automatic Discovery of Cognitive Strategies with Tiny Recurrent Neural Networks

Normative modeling frameworks such as Bayesian inference and reward-based learning provide valuable insights into the fundamental principles of adaptive behavior. However, their ability to describe...

biorxiv.org

19,945

Magda Dubois @DubMagda

6 Apr 2023

Congratulations to my academic sibling @AlisaLoosen for those (very) well-deserved three shiny balloons

2,456

Magda Dubois @DubMagda

14 Mar 2023

Wanna try out a (cool🦙) alternative to GPT?

Yann Dubois

@yanndubs

13 Mar 2023

🦙 Excited to share this demo of Alpaca 🔥 Highlights: ~GPT3.5 performance for < 600$ 🔥 The goal was to have a simple model/training procedure that academics could study and improve with limited resources. We achieved that by finetuning a 7B LLaMA on 52K generated instructions

287

Magda Dubois @DubMagda

21 Dec 2022

Postdoc position in Boston ⭐️ Great place and amazing person to work with!

This tweet is unavailable

667

Magda Dubois retweeted

Tobias Hauser @TobiasUHauser

2 Dec 2022

A while ago we published this #RegisteredReport in @NatureComms - but was this format of pre-registration really useful? Find some answers in this Q&A with us and one of the reviewers: nature.com/articles/s41467-0…

A conversation on bringing the Registered Report format to Nature Communications

Nature Communications - We recently published our first Registered Report entitled ‘Value-free random exploration is linked to impulsivity’. We believe the format offers many benefits...

nature.com

Magda Dubois @DubMagda

5 Aug 2022

Our #RegisteredReport with @TobiasUHauser is now out in @NatureComms 🤓 We asked how people differ in their exploration and found that impulsive and anxious subjects use different exploration strategies! 1/ nature.com/articles/s41467-0…

Magda Dubois retweeted

Alex Hopkins @alexKhopkins

6 Oct 2022

New preprint! 📢📢 Very happy to share some recent work looking at the holy trinity of transdiagnostic symptom dimensions (anxious-depression, compulsivity and social withdrawal) and how we can optimise their measurement. psyarxiv.com/q83sh Some key points below… 1/n

Magda Dubois retweeted

In Silico (a documentary film) @In_Silico_Film

15 Sep 2022

IN SILICO is now available for streaming internationally via Vimeo On Demand: vimeo.com/ondemand/insilico2 It's a story told over 10 years about an attempt to simulate a brain on supercomputers. Now for more about the movie's plot and its subjects, via GIFs... 🧵👇