next gen computer-use agents @salesforce | ex @convergence_ai_ @UnitaryAI @CompSciOxford @imperialcollege | interested in steering ai to do more good than bad

Joined August 2018
60 Photos and videos
Pinned Tweet
11 Nov 2020
Excited to release code and models for detoxify 🙊, a simple python library built @UnitaryAI that aims to detect hate speech and toxic comments online! Includes training scripts built with @PyTorchLightnin ⚡ and transformers checkpoints @huggingface 🤗! github.com/unitaryai/detoxif…
13
69
438
Laura Hanu retweeted
Watch me control my computer with just my voice. This is the future of operating systems. No hands. GPT-Realtime 2.0 is very, very underrated. Demo:
932
840
14,126
3,709,975
pretty cool room of people at the codex event tonight @OpenAI
1
5
311
say what you will about ai slop but it’s so cool that you can just generate a 4 page comic about the history of epistemology these are with @OpenAI images 2.0
2
69
this was a balanced, fun and surprisingly touching doc, @DanielRoher did well capturing the ai zeitgeist and its defining promise/peril tension in a very human way, kudos to @Fremond_ @ry_paddy for getting their hands on it & organising the watch party
3
118
Laura Hanu retweeted
claude cowork is making me think maybe we’ll look back and it’ll be obvious that humans were never meant to spend their lives working behind a screen. we’ll see it as inevitable that computers do everything for us on computers and the future of work is cooler than we can imagine
331
578
9,287
341,991
30 Oct 2025
main takeaways from the @dwarkeshpodcast @karpathy interview: *RL limitations*: hard to get past reward sparsity problem in RL when it comes to real world tasks, one promising direction could be more sample efficient learning by reflecting on mistakes *reduced model size by getting rid of redundant memorisation*: soon the cognitive engine might look like a 1 billion parameter model that just knows how to think without the need to memorise lots of data - it should be able to recognise what it doesn't know and look it up *gradual loss of control and understanding*: even if we still have humans delegating tasks to autonomous entities, it will get increasingly harder to fully control them, let alone understand what they're doing; similar thoughts in this great albeit on the pessimistic side paper arxiv.org/abs/2501.16946 *role of AI education post-AGI*: pre-AGI AI education is useful, post-AGI education is fun, could be seen as a way to train mentally just how you do physically
124
30 Aug 2025
pantheon is now up there with some of my favourite sci fi of all time like asimov’s the end of eternity, the last question or story of your life with some elements of snow crash or dark the tv show but upping the scale even more, dare i say something akin to what @DavidDeutschOxf was describing in the beginning of infinity
224
29 Aug 2025
a nice straight forward summary on some of the grpo limitations beyond just not being great for multi-turn e.g. * if you have multiple reward signals -> the model won't know which one it is being rewarded for since they're usually all collapsed into one * only the scalar reward is used for policy update when a more detailed textual feedback could be used (what gepa kinda does with their reflective prompt evolution)
28 Aug 2025
wrote a short blogpost on what I think are some limitations of GRPO: I’ve been playing around with RL finetuning for reasoning tasks and came across a few limitations that i wanted to document here feedback/corrections are welcome!
198
29 Aug 2025
nice watch for a reading group on verifiers, like a jigsaw falling into place🤭
Video summary for "Prover-Verifier Games improve legibility of LLM outputs" youtu.be/EMDa4urzz-M 1/2
1
198
Why does RL work so well to learn preferred completions in llm post training and why can't we just use supervised finetuning? This paper has an interesting information-theoretic explanation: arxiv.org/pdf/2503.01067 TLDR: The literature shows that the 2 stage RL approach (1. learn a good reward model that can score llm completions similar to how a human would 2. use that to learn good policies with RL) used today in the sota models outperforms using only supervised finetuning. The authors posit that this happens when the verification of an output is simpler than generating it. This is because, in the RL case, the search space is constrained to the subset of policies that are optimal for the learnt reward model. When they reduce the verification-generation gap empirically the difference in results diminishes too.

1
175
It's been interesting to see RL having a comeback lately. Guess in retrospect it's not surprising that RLHF is not enough since it naturally hits a ceiling limited by the quality of human feedback. What's particularly exciting though is that this shift is happening just as AI is becoming more agentic and able to interact with the digital world. Combine this with some grounded reward signals (e.g. profit, likes, citations) and we should see AI not just replicate what humans do but create new knowledge through their own experiences. The next few years should be a fun and wild ride 🎢 For another interesting paper on the topic: incompleteideas.net/papers/T… or this podcast on it: youtu.be/zzXyPGEtseI?si=5VL-…

87
30 Apr 2025
the fact that this could be the prequel to Adolescence 😳
30 Apr 2025
Well that's fucking terrifying...
122
note to self, don’t start addictive puzzles late at night, but also level 4 lets go?🕵️‍♀️ fairly consistent strategy so far 👀
3 Feb 2025
We challenge you to break our new jailbreaking defense! There are 8 levels. Can you find a single jailbreak to beat them all? claude.ai/constitutional-cla…
2
254
28 Dec 2024
great discussion! we need more public discourse between top ai leaders and economists 🤝 would’ve liked to see them engage with hinton’s concern that the benefits ai brings will deepen the economic divide in a capitalist system further creating fertile ground for fascism 🌶️
Watch our 2024 Nobel Prize laureates talk about their research and careers in a unique roundtable discussion, 'Nobel Minds', moderated by BBC's Zeinab Badawi. youtu.be/1tELlYbO_U8
2
284
11 Dec 2024
at @NeurIPSConf this week 👋
2
146
Laura Hanu retweeted
26 Jun 2024
Introducing Cambrian-1, a fully open project from our group at NYU. The world doesn't need another MLLM to rival GPT-4V. Cambrian is unique as a vision-centric exploration & here's why I think it's time to shift focus from scaling LLMs to enhancing visual representations.🧵[1/n]
17
246
1,110
333,194
The 2nd part of the Intro to multi-node machine learning is out! 🥳 This all about how to use Slurm to scale up your ML applications! 🚀 unitary.ai/articles/intro-to…
1
4
9
1,074
24 Oct 2023
Excited to kick off a new deep-dive blog series on how to build and set up the infrastructure for distributed training from scratch. Spoiler alert – setting up a cloud-based cluster that can scale to hundreds of nodes isn't as daunting as it sounds! 👀🚀 unitary.ai/articles/intro-to…
2
6
1,157