Unleashing machine learning to solve medicinal chemistry

Joined August 2021
Leash Bio retweeted
BoltzGen, Protenix, ProteinMPNN: all trained on PDB. When your reward is also PDB-derived, you're learning "structure a crystallographer would deposit," not what a "good binder" is. @leashbio showed the same failure in ChEMBL models: they learn what chemists make, not what binds. 1/6
Leash Bio retweeted
Nature encodes function in many languages, from protein sequences shaped by evolution to animal communication and the chemistry of smell. Each is expressed through high-dimensional patterns that are experimentally accessible yet resist neat, human-legible interpretation. In our upcoming livestream from the annual Flagship AI Summit, panelists @seemaychou (@ArcadiaScience), @JCoolScience (@AnthropicAI), @gibsmk (Flagship Pioneering), and @allmeasures (@leashbio) will explore how modern AI can decode these patterns and re-encode them into new designs. Stay tuned for details on how to watch on March 25th.
Leash Bio retweeted
Current protein models seem to only memorize their wild-type training sets and lack physics understanding. One nice way forward is physically engineering mutations and then taking more ground-truth measurements. It works. We've already shown this! (1/n)
Adversarial Sequence Mutations in AlphaFold and ESMFold Reveal Nonphysical Structural Invariance, Confidence Failures, and Concerns for Protein Design
1. A new adversarial study systematically evaluates AlphaFold 3's robustness by introducing point mutations (up to 70%) and deletions (up to 10%) across 200 proteins, revealing striking structural invariance that raises fundamental questions about the model's biophysical reasoning capabilities.
2. The most concerning finding: AlphaFold 3 maintains virtually identical predicted structures even when 40% of residues are mutated with deliberately destabilizing substitutions, or when 10% of residues are deleted, perturbations that would catastrophically destabilize real proteins.
3. This structural invariance persists even for experimentally validated fold-switching proteins, where specific mutations are known to induce alternative conformations. AlphaFold 3 fails to capture these biologically critical transitions, suggesting limited sequence-structure coupling.
4. Confidence metrics prove unreliable: AlphaFold 3's ranking score selects the most accurate structure only ~25% of the time, and these scores correlate more strongly with template availability in the training set than with actual prediction quality.
5. Comparative analysis with ESMFold reveals that the protein language model-based approach shows significantly greater sensitivity to mutations, with structures diverging more rapidly as sequence perturbations increase, suggesting superior learned sequence-structure relationships despite lower absolute accuracy.
6. The study's template analysis provides quantitative evidence that AlphaFold 3's confidence reflects structural similarity to training-set exemplars (Pearson r=0.39) rather than genuine biophysical assessment, indicating heavy reliance on memorized patterns over learned principles.
7. These findings have profound implications for the entire AlphaFold ecosystem: protein design tools like RFdiffusion, binder design methods like BoltzGen and BindCraft, and drug discovery pipelines may inherit these fundamental limitations, potentially generating non-physical sequences or missing viable candidates.
8. The work identifies critical gaps in current structure prediction: models trained primarily on stable, wild-type proteins lack exposure to destabilized mutants and misfolded states, limiting their ability to generalize beyond the training distribution.
📜Paper: biorxiv.org/content/10.64898…
#AlphaFold #AlphaFold3 #ProteinStructurePrediction #StructuralBiology #ProteinDesign #MachineLearning #Bioinformatics #ComputationalBiology #AIforScience #ProteinEngineering #DeepLearning #Biophysics
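For readers who want to picture the perturbations described in the summary above, here is a minimal sketch of generating point mutations and deletions at a chosen fraction of residues. It is an illustration only, not the study's code: the function names `mutate` and `delete` are hypothetical, and substitutions are drawn uniformly at random, whereas the study reportedly used deliberately destabilizing substitutions.

```python
import random

# The 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def mutate(seq, fraction, rng):
    """Replace a fraction of positions with a different, randomly chosen residue.

    Assumption: uniform random substitution, not a destabilizing-substitution scheme.
    """
    n = round(len(seq) * fraction)
    positions = rng.sample(range(len(seq)), n)
    out = list(seq)
    for i in positions:
        # Exclude the current residue so every chosen position really changes.
        choices = [a for a in AMINO_ACIDS if a != out[i]]
        out[i] = rng.choice(choices)
    return "".join(out)


def delete(seq, fraction, rng):
    """Remove a fraction of positions, chosen at random, from the sequence."""
    n = round(len(seq) * fraction)
    drop = set(rng.sample(range(len(seq)), n))
    return "".join(c for i, c in enumerate(seq) if i not in drop)


rng = random.Random(0)           # fixed seed so the example is repeatable
wild_type = "MKTAYIAKQRQISFVKSHFS"  # made-up 20-residue sequence for illustration
mutant = mutate(wild_type, 0.40, rng)
truncated = delete(wild_type, 0.10, rng)
```

Feeding `mutant` and `truncated` to a structure predictor and comparing the outputs against the wild-type prediction is the kind of robustness check the summary describes.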
Leash Bio retweeted
This week we showed near protein-ligand binding prediction SOTA on our lightweight model, Hermes. Unlike other models, Hermes has no way to think about protein or molecule structure. Never seen *any* public data. How did we do it? endpoints.news/leash-bio-war…
Leash Bio retweeted
Is your Drug Discovery AI actually learning chemistry, or is it just “profiling” the chemist? 🕵️

New research from @leashbio highlights a "Clever Hans" failure mode in AI: models can predict molecular activity with high accuracy simply by identifying who synthesized the molecule. The paper shows a classifier can guess the author of a molecule (among 1,815 options) with 60% top-5 accuracy. Just like human chemists, if the AI knows it’s a “Schreiber molecule”, it already knows the likely target! It’s memorizing human bias, not physics. 🧪

Why does this matter? If you’re developing a "Me-too" drug in a well-trodden space, these "shortcut" models work fine. But for the future of biotech (new targets, new scaffolds), we need models that can’t "cheat". They must learn the essence of binding.

The @leashbio Path: By building massive, consistent internal datasets (DEL-based), Leash is removing "human intent" from the equation. The goal? True Zero-shot drug design that generalizes to the unknown by learning from billions of data points, not just the "favorite" molecules of famous PIs. 🚀

Is "cheating" acceptable for Me-too drug development? What's your take? Let's discuss! 👇

@owl_posting @leashbio @Andrewdblevins @IsomorphicLabs @BoWang87 @BiologyAIDaily #AI #AI4S #AIDD #DrugDiscovery #Leashbio #Chemistry #CleverHans #Benchmarking #DataLeakage

[Original Text: leashbio.substack.com/p/ai-f…]
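As a rough illustration of what the "top-5 accuracy" figure quoted above measures: a prediction counts as correct when the true author is among the five highest-scoring candidates. The sketch below is not taken from the paper; `top_k_accuracy` is a hypothetical helper, and the scores are toy numbers over three classes rather than 1,815 authors.

```python
def top_k_accuracy(scores, labels, k=5):
    """Fraction of examples whose true label is among the k highest-scoring classes.

    scores: one list of per-class scores per example
    labels: the true class index for each example
    """
    hits = 0
    for row, label in zip(scores, labels):
        # Indices of the k classes with the largest scores for this example.
        top = sorted(range(len(row)), key=lambda j: row[j], reverse=True)[:k]
        hits += label in top
    return hits / len(labels)


# Toy data (made up): two molecules, three candidate authors.
toy_scores = [
    [0.10, 0.90, 0.00],  # true author (index 1) is ranked first
    [0.80, 0.15, 0.05],  # true author (index 1) is ranked second
]
toy_labels = [1, 1]

top1 = top_k_accuracy(toy_scores, toy_labels, k=1)
top2 = top_k_accuracy(toy_scores, toy_labels, k=2)
```

With many candidate authors, random guessing gives a top-5 accuracy of only 5 divided by the number of candidates, which is why a high top-5 score signals that the molecules carry a strong author signature.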
Leash Bio retweeted
25 Dec 2025
Had the pleasure of interning at @leashbio - genuinely great team with a serious commitment to rigor. Cool to see their work getting attention!
23 Dec 2025
An ML drug discovery startup trying really, really hard to not cheat owlposting.com/p/an-ml-drug-… on the 12-person, Utah-based startup @leashbio, their culture of rigor, and the many ways small molecule models accidentally learn the wrong thing
17 Jul 2025
We’re happy to share some of Leash’s latest ML results. Our model, Hermes, is competitive with Boltz-2 on binding prediction. Importantly, Hermes is trained exclusively on Leash DEL data. In other words, the plan is working. leashbio.substack.com/p/good…

Leash Bio retweeted
20 Jun 2025
Drug discovery: 🥇 Aikium 🥈 @serna_bio 🥉 @leashbio
Biotech: 🥇 @Ataraxis_AI 🥈 @Feeltherapeutic 🥉 Nostics
Genomics & multiomics: 🥇 @transcriptabio 🥈 @Converge_Bio 🥉 @infinimmune
Healthtech: 🥇 MetaSight Diagnostics 🥈 @slingshotai_inc 🥉 @isonohealth
Leash Bio retweeted
Love the momentum behind @Polaris_HQ right now and super excited to see @leashbio's massive BELKA dataset up on the platform.
1 Nov 2024
Curious to learn more about the work that went behind making BELKA accessible through Polaris? Learn more about Dataset v2.0 and how we’re helping scientists focus on research, not on data wrangling. 👇🧵 Read the blog: polarishub.io/blog/dataset-v… Access BELKA: polarishub.io/datasets/leash…
31 Oct 2024

We just released 4.25B small molecule-protein measurements - real, physical ones from our lab - on @Polaris_HQ. It's the biggest such public collection ever, please use it! leashbio.substack.com/p/belk…
Leash Bio retweeted
We invite everyone interested in this problem to check out our ligand prediction Kaggle competition, which includes more physical measurements than all of Pubchem Bioassays. Open source FTW! kaggle.com/competitions/leas…
AlphaFold3 is out with improvements on structural models that include DNA/RNA and small molecules. Unfortunately, there is no code, no binary to run at scale and only a limited webserver. Why even publish? nature.com/articles/s41586-0…
Leash Bio retweeted
Started in the basement of Ian Quigley’s Utah home, Leash Bio has launched with an open challenge to the AI-bio world. The startup is betting there's a need for more data to fuel the next AI breakthroughs in biology. My latest: endpts.com/exclusive-ex-recu…
5 Apr 2024
We are also launching our inaugural machine learning Kaggle competition, Big Encoded Library for Chemical Assessment (BELKA), leveraging a dataset of unprecedented scale: kaggle.com/competitions/leas… #MachineLearning #AI #ML #TechBio #kaggle
5 Apr 2024
Today we announced a $9.3 million seed financing round to advance our mission of revolutionizing medicinal chemistry through modern computational methods and massive biological data collection. Read the press release: globenewswire.com/news-relea… #MachineLearning #AI #ML #TechBio