Joined February 2008
We just posted a blog paper on a simple but effective approach to model honesty called "Confessions". TL;DR: normal RL training rewards high performance on a task. Confession training is a separate phase that rewards only honesty. Tests look promising! More:
3 Dec 2025
In a new proof-of-concept study, we’ve trained a GPT-5 Thinking variant to admit whether the model followed instructions. This “confessions” method surfaces hidden failures—guessing, shortcuts, rule-breaking—even when the final answer looks correct. openai.com/index/how-confess…
This was a fun project (that I jumped into halfway through) with @ManasJoglekar, Jeremy Chen, @GabrielDWu1, @j_asminewang, @boazbaraktcs, and @mia_glaese. Boaz wrote a good casual summary: x.com/boazbaraktcs/status/19…

1/5 Excited to announce our paper on confessions! We train models to honestly report whether they “hacked”, “cut corners”, “sandbagged” or otherwise deviated from the letter or spirit of their instructions. @ManasJoglekar Jeremy Chen @GabrielDWu1 @jasonyo @j_asminewang @mia_glaese

OpenAI has trained its LLM to confess to bad behavior trib.al/ur62e6F
Jason Yosinski retweeted
Today, OpenAI is launching a new Alignment Research blog: a space for publishing more of our work on alignment and safety more frequently, and for a technical audience. alignment.openai.com

Jason Yosinski retweeted
A little while ago, many of you gave generously to support a number of MLC-Nigeria researchers in attending Deep Learning Indaba #DLI2025. Here's the crew 👇 that attended; from what we hear it was a bustle of talks, posters, mentorship, and sparks of collaboration!
22 Aug 2025
Today we say goodbye to @DeepIndaba after six inspiring days in Kigali rich with keynotes, tutorials, workshops, mentorship circles, and insightful posters that kept us learning non-stop. Some of us were only able to make it down to #DLI2025 because of your generous support.
10 Jul 2025
The opportunity gap in AI is more striking than ever. We talk way too much about those receiving $100M or whatever for their jobs, but not enough about those asking for <$1k to present their work. For the 3rd year in a row, @ml_collective is raising funds to support @DeepIndaba attendees.
Help send a bunch of researchers to DL Indaba this year! For less than one H100 we can send 25 people!
Jason Yosinski retweeted
Starting in 1 hour: @thebasepoint presents Anthropic's "Biology of a Large Language Model" work at the DLCT reading group. Paper: transformer-circuits.pub/202… Come for the chain of thought, stay for the rabbits and habbits. Zoom info below 👇
Starting in 30 min!
Next Research Jam is in 14 hours, tomorrow morning at 8am PT. Stop by this virtual lab meeting to hear research ideas and updates on projects in progress! Zoom info at mlcollective.org/events/rese…
Next MLC Research Jam is tomorrow; sharing two ideas myself to mix things up :)
Starting in 15 min!
This week at Deep Learning: Classics and Trends we're kicking off a new five part mini-series on LLM Interpretability. Up first: @thesubhashk shows how LLMs represent numbers on a helix and use it to add! Join Friday at 10am PT, zoom here: mlcollective.org/dlct/
I am sitting here watching my HF smolagent slowly reason about and click on Captcha squares one at a time 🙈. Is this general AI?
Jason Yosinski retweeted
Tomorrow at 10am PT we'll have our next MLC OpenClubHouse, our 25th 🎉! Stop by to hang out, catch up with friends, and chat about ML or anything else. We'll meet in the MLC Discord #openclubhouse channel: discord.gg/6Za9MBr4?event=13…
Jason Yosinski retweeted
If you're at #ICLR2025, stop by the ML Collective Picnic Lunch on Monday at 12:30, graciously hosted by Alex Bezzubov! All welcome, bring your own lunch and meet new research friends and collaborators. 🥰 lu.ma/ioghynfh
Jason Yosinski retweeted
Rajat Modi presenting his work right now on getting Glom to work, poster #6304.
Asynchronous Perception Machine for Efficient Test Time Training
Rajat Modi · Yogesh Rawat
West Ballroom A-D #6304
Jason Yosinski retweeted
✨Our new @unireps paper tries to answer why the Lottery Ticket Hypothesis (LTH) fails to work for different random inits through the lens of weight-space symmetry. We improve the transferability of LTH masks to new random inits leveraging weight symmetries. 🧵(1/6)
Jason Yosinski retweeted
And...there we go! TL;DR: we are launching a new event series called "Industry Round Tables" with its first instance on Thursday, August 22! Register here if interested: lu.ma/7xfygmxg
Get ready for our first LinkedIn post!
Jason Yosinski retweeted
24 Jul 2024
We speed up renewable energy site selection & due diligence, reducing months of work to minutes. With $11M in new funding led by @navitascapital, we'll improve our software tools to allow quick, informed decisions that accelerate the energy transition. paces.com/news/paces-raises-…
I had a pretty fun conversation with @JonKrohnLearns the other day on startups, wind energy, the electrical grid in the US, ML, and (of course) how neural networks really work :)
One of my all-time favorite A.I. researchers, Dr. Jason Yosinski (@jasonyo), is my guest today! He details how his startup is using ML to collect wind energy more efficiently and digs into visualizing/understanding deep neural networks. Watch here: superdatascience.com/789
Jason:
• Is Co-Founder and CEO of @Windscape_AI, a startup using ML to increase the efficiency of energy generation via wind turbines.
• Is Co-Founder and President of the ML Collective, a research group that’s open to ML researchers anywhere.
• Was a Co-Founder of the A.I. Lab at the ride-share company Uber.
• Holds a PhD in Computer Science from Cornell, during which he worked at the NASA Jet Propulsion Laboratory, Google DeepMind and with the eminent Yoshua Bengio in Montreal.
• His work has been featured in The Economist, on the BBC and, coolest of all, in an XKCD comic!
Today’s episode gets fairly technical in parts so may be of greatest interest to hands-on practitioners like data scientists and ML engineers, although there are also parts that will appeal to anyone keen to hear how ML is being used to produce more clean energy.
In today’s episode, Jason details:
• How ML can make wind direction more predictable, thereby making wind turbines and power grids in general more efficient.
• How to infer what individual neurons in a deep learning model are doing by using visualizations.
• Why freezing a particular layer of a neural net prior to doing any training at all can lead to better results.
• How you can get involved in a cutting-edge research community no matter where you are in the world.
• What traits make for successful A.I. entrepreneurs.
Many thanks to @Crawlbase for supporting this episode of Super Data Science, enabling the show to be freely available on all major podcasting platforms as well as the video version we publish on YouTube. This is Episode #789!
#superdatascience #machinelearning #ai #climatechange #windenergy
Jason Yosinski retweeted
30 Apr 2024
Canadians, if you’re considering switching to a heat pump to heat and cool your home, and you’re curious about utility bill costs, payback period, and reduction in greenhouse gas emissions, I made a thing for you. Link 👇