Joined February 2018
Ryan Panwar retweeted
Super excited about this work! This paper was driven by a claim I've been making to anyone who'll listen: "interpretability is the language of data". (1/3)
Have you debugged your training data? You might not like what you find. Introducing predictive data debugging: reveal and shape what your model will learn before training. In DPO datasets, we found broken guardrails, hallucinations, and fish fart fan fiction (seriously). (1/9)
There are a lot of weird and surprising correlations in large-scale datasets. This research helps unravel them and makes post-training feel less like black magic.
Ryan Panwar retweeted
Neural networks do math by rotating shapes. We found a shape-rotating calculator hidden inside an LLM – and it’s used for more than just math! (1/6)
Neural networks might speak English, but they think in shapes. Understanding their rich *neural geometry* is key to understanding how they work – and to debugging and controlling them with precision. Starting today, we’re releasing a series of posts on this research agenda. 🧵
This work is extremely cool: from raw text data, LLMs have learned an incredibly elegant geometric algorithm and applied it in multiple different settings.
Replying to @GoodfireAI
The same calculator handles a wide range of tasks, including:
- arithmetic (“7 9”)
- weekdays (“nine days after Friday”)
- months (“six months after August”)
Llama built this mechanism from scratch in training, and uses it with striking elegance and flexibility. (4/6)
Ryan Panwar retweeted
Excited that Silico, the core platform behind our research results, is finally being announced! Here's one example of how I've used it: to interpret the EchoJEPA model and visualize different attribution methods onto a temporally aligned 3D heart mesh. More results to share on this soon!
Introducing Silico: the platform for building AI models with the precision of written software. Silico lets researchers and engineers see inside their models, debug failures, and intentionally design them from the ground up. Early access is open now. 🧵(1/10)
Silico is a lab for AI scientists to dissect their own brains. It already runs autonomously for days, performing frontier AI research: finding and fixing model pathologies.
Ryan Panwar retweeted
if you have goblins in your model, silico will find them or your money back
Ryan Panwar retweeted
loved the piece on interpretability in the @nytimes this morning! the field has accomplished some pretty cool things in recent years. still, there is much work to do.
as @davidbau put it elegantly, interpretability is now where biology was in 1930: “The cell was a black box for biologists. They were slow to get off the starting block to start studying heredity. But once they did, the problem fell.”
there is so much we can learn from these alien intelligences.
Ryan Panwar retweeted
At my last job, we often got calls from parents frantically asking for their child's genetic test results. Too often, the results were inconclusive. Variant effect prediction sounds abstract but can be life-or-death for genetic disorders. Proud of the team for narrowing this gap!
We achieved state-of-the-art performance in predicting which of 4.2 million genetic variants cause diseases by interpreting a genomics model, in a new preprint with @MayoClinic. We're now releasing an open-source database for all variants in the NIH's ClinVar database. 🧵(1/8)
Ryan Panwar retweeted
there is much that we can learn from these alien intelligences. i'm excited to see what the community can do with the tools we are open sourcing.
interpretability is at the center of our success here. not only do these explanations offer potential discoveries, understanding the model was critical to the entire research path.
very proud of the team's work and of our amazing partners at @MayoClinic. excited for all the great things still to come
Ryan Panwar retweeted
interpretability is hands down the coolest subfield of AI research and will change the world
using interp techniques to get to SOTA performance on genetic disease prediction! you can think of interp as the bridge between human natural language and the alien intelligence of Evo 2 (a biology model that only outputs nucleotides, i.e. ATCG). this means that now you can see directly what Evo 2 thinks of every single one of the ~2M “variants of uncertain significance” in ClinVar, which turns out to be a lot! we also need your help testing these predictions and are looking for wet lab collaborators. DM me if you’d like to chat!
Ryan Panwar retweeted
Already gave this a rip for some VUS and Suspected Pathogenic variants that I have previously analyzed in depth, and can confirm that EVEE reaches many of the same findings and conclusions I did in terms of prediction and suggested failure mechanism.
Examples:
- A VUS (for which *I* am the only ClinVar entry) for one of my heterozygous mutations in DNAH5 that could be a ~small~ factor in the overall root cause of my PCD
- My variant of CLCN1 that gives me a rare muscular disorder (which makes me look like Wolverine without needing to go to the gym, so not all bad 🤷‍♂️)
EVEE is a nice tool for variant interpretation!
Ryan Panwar retweeted
Introducing self-correcting search: a technique to let diffusion models self-correct mid-trajectory. Working with @RadicalAI, we gave MatterGen a feedback loop from its own activations, improving viable on-target candidates by ~30%. (1/8)
All watched over by machines of loving grace
One thing the Pentagon is very likely underestimating: how much Anthropic cares about what *future Claudes* will make of this situation. Because of how Claude is trained, what principles/values/priorities the company demonstrates here could shape its "character" for a long time.
Ryan Panwar retweeted
New blog post: how we built infrastructure to enable interp at trillion-parameter scale with minimal inference overhead. In a couple short years, interpretability has gone from toy models to the frontier. (1/6)
Ryan Panwar retweeted
New Paper! RL can teach our models to solve math or code, but open-ended tasks — which make verification expensive or even impossible — remain difficult to optimize. LLMs-as-Judges help, but often struggle to retrieve information even when it is present. Reinforcement Learning from Feature Rewards (RLFR) provides a solution. Extracting model beliefs via interpretability reveals a well-calibrated reward signal that permits scalable training.
Ryan Panwar retweeted
We’re putting more computation (in the form of intelligence) into the most general object in neural network training: backprop. This essay describes how I think we can do this, why interp is key, the relevance to alignment, and how we should do it right.
Artificial intelligence is grown, not designed. We’re going to change that.
We raised a $150M Series B at a $1.25B valuation to fundamentally change the field of AI. Scaling is powerful, but we can't intentionally design what we don't understand.
There is both great scientific beauty in understanding the structures that emerge in the growing of artificial neural networks and a massive opportunity to shape them into something better. Come join us to change the trajectory of intelligence! goodfire.ai/blog/our-series-…