Joined July 2018
We looked into the weeds of hindsight experience replay and came up with an efficient way of learning from all goals at once in off-policy goal-conditioned RL! It works well for reasonable numbers (100s--1000s) of non-exclusive sparse reward goals
Hindsight Experience Replay has become the ubiquitous method for goal-conditioned reinforcement learning, but leaves open the question of which goal to relabel with. In this work, accepted at ICML, we propose instead simply Learning Everything All at Once (LEO). 1/
and can possibly be adapted to bigger / continuous goal spaces.
It was great collaborating with the @FLAIR_Ox lab and @mitrma!
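A minimal sketch of what "learning from all goals at once" can look like in the simplest tabular setting. This is not the LEO algorithm from the paper; it only illustrates the idea that, with off-policy learning and sparse non-exclusive goals, a single transition can update the value estimate of every goal instead of one relabeled goal. The function name `update_all_goals`, the `reached` mask, and the choice to stop bootstrapping once a goal is achieved are all assumptions made for this example.

```python
import numpy as np

def update_all_goals(Q, s, a, s_next, reached, alpha=0.5, gamma=0.9):
    """One off-policy Q-learning step applied to every goal at once.

    Q       : float array, shape (n_goals, n_states, n_actions)
    reached : bool array, shape (n_goals,); reached[g] is True if this
              transition achieves goal g (several goals may be True at once)
    """
    reward = reached.astype(float)  # sparse reward: 1 on success, else 0
    # Assumption: achieving a goal is terminal for that goal only,
    # so we do not bootstrap past it.
    bootstrap = np.where(reached, 0.0, Q[:, s_next, :].max(axis=1))
    target = reward + gamma * bootstrap
    Q[:, s, a] += alpha * (target - Q[:, s, a])
    return Q

# Toy usage (hypothetical sizes): 3 goals, 4 states, 2 actions.
Q = np.zeros((3, 4, 2))
# Transition 0 -> 2 under action 1 achieves goals 0 and 2 simultaneously.
Q = update_all_goals(Q, s=0, a=1, s_next=2,
                     reached=np.array([True, False, True]))
# Transition 3 -> 0 under action 0 achieves nothing but bootstraps from state 0.
Q = update_all_goals(Q, s=3, a=0, s_next=0,
                     reached=np.array([False, False, False]))
```

The update is vectorized over the goal axis, so its cost grows linearly with the number of goals. That is workable for hundreds or thousands of goals, while larger or continuous goal spaces would need something other than a table.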
was not expecting this collab :o makes me wanna do another postdoc..
Replying to @mmuthukrishna
@mmuthukrishna and I are hiring a postdoc to join our labs at NYU! We're looking for someone excited to work on one of society's newly emerging and potentially generation-shaping challenges: the multi-agent alignment problem.
can Claude self-report an injected emotion with neutral context?
can it detect the mismatch between the emotion and the context (why am i feeling like this?)
seems related to the introspection study: anthropic.com/research/intro…
what other mental states could we do this with?
New Anthropic research: Emotion concepts and their function in a large language model. All LLMs sometimes act like they have emotions. But why? We found internal representations of emotion concepts that can drive Claude’s behavior, sometimes in surprising ways.
New paper at ICLR 2026! 🎉 "Language and Experience: A Computational Model of Social Learning in Complex Tasks" We model how humans combine advice from others with direct experience to learn new tasks, and show this enables bidirectional human-AI knowledge transfer. 🧵⤵️
Can knowledge accumulate across generations? We run iterated learning chains: each agent gets only 2 lives, then passes advice to the next.
Performance increases across generations: partial knowledge compounds through language, mirroring cultural evolution in human populations.
w/ Tracey Mills, Ben Prytawski, @mhtessler, @noahdgoodman, @jacobandreas, Josh Tenenbaum
Paper: arxiv.org/pdf/2509.00074
Play the games here: cedriccolas.com/demos/langua…

agreed! we will only care about open-ended systems that co-evolve goals and behaviors **within hybrid human-ai populations** this means ai goals will be influenced by ours (so ai math will remain math), but also means our goals will be influenced by ai’s too
Replying to @stephen_wolfram
We can automate the proving of theorems, or the discovery of conjectures, or even the invention of new axiom systems, but we can't automate *mathematics*. Because "mathematics" is the name we give to the *human* cultural story, not to the formal methods themselves. (14/15)
it’s not about replacing humans, but about adding new individuals in the cultural evolution process (of goals and behaviors), to catalyze it: more agents, more diversity of interests and skills -> more innovation, more creativity
this probably requires some kind of balancing, so that having more AIs than humans doesn’t bias goal evolution towards goals humans don’t care about.
maybe some kind of asymmetric social learning strategy: e.g. ai goals are rewarded when the resulting behaviors are helpful to humans
Cédric retweeted
Will the influx of synthetic data lead to uniform #ModelCollapse across the internet? Our recent #EMNLP2025 (Oral) paper suggests a nuanced picture: different collapse dynamics might emerge in different internet domains based on the properties of human data in those domains! 🧵
16 Dec 2025
this seems like important work! can we figure out mechanisms of collective intelligence? can we recreate these conditions everywhere?
for the past year i’ve been studying coordination. at @analoguegroup we’re building a living archive of scenius: how collective genius emerges
secondrenaissance.now/
from bell labs to pixar to black mountain college, we trace the hidden architectures (labs, funding, norms, governance) that made extraordinary work possible
u can even filter by vibe!
9 Dec 2025
Our self-improving genetic algorithm received the 2nd place paper award for the @arcprize! Congrats in particular to @PourcelJulien the experiments wizard! We proposed a simple, general algorithm ⬇️
8 Dec 2025
ARC Prize 2025 Winners Interviews: Paper Award, 2nd Place
@PourcelJulien, @cedcolas, @pyoudeyer discuss SOAR, a self-improving evolutionary program synthesis framework that fine-tunes an LLM on its own search traces, without human-engineered DSLs or solution datasets.
9 Dec 2025
We are hiring a summer intern to extend this work to more complex search algorithms and more domains, and to make the algorithm broadly accessible.
apply here: flowers.inria.fr/jobs/
other internships available too

9 Dec 2025
Finally, @PourcelJulien is looking for a summer internship! Topics of interest include open-endedness, test-time scaling, diversity generation, and problem generation.
Hire him before someone else does! :)
might be of interest to @Dahoas1 @jaseweston @robertarail