Joined July 2022
Pinned Tweet
RL finetuning for protein binder design is being hailed as the next obvious step. But I ran the numbers, and something uncomfortable is happening under the hood. 🧵
multimodali retweeted
introducing tubestack: bringing substack’s zen to youtube. a quiet, local desktop app to explore youtube without visual pollution. download here for free on mac & windows: anishlk.com/tubestack (typography taken from the pretty blogs of @AnthropicAI, @OpenAI, @X, @Substack)
We set out to audit the RL results for IDiom, a protein language model for designing intrinsically disordered regions. The question was simple: Did RL learn specific localization, or did it learn sequences that score broadly well for related compartments? The results 🧵
Protein design has been dominated by diffusions due to a "structure-first" perspective. What about intrinsically disordered proteins? We scale language-based design using the modern RL stack and our model IDiom. Paper: biorxiv.org/content/10.64898… Try it: idiom-designer.vercel.app/
6/7 This is where things got interesting. Stress-granule RL sequences scored highly for stress granule, but also scored highly for P-body. So the target reward improved, but ProtGPS itself didn’t cleanly certify fine-grained specificity.
7/ Main takeaway: reward gains are useful, but they shouldn’t be read alone. For RL protein design, composition controls, off-target scores, specificity margins, and independent checks make the biological claim much clearer. Full writeup: auralie.substack.com/p/a-fri…
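One way to make "specificity margin" concrete is the target score minus the best off-target score. The sketch below is an illustration only: the thread does not say how the margin was actually computed, and the function name, compartment labels, and scores are all hypothetical.

```python
from typing import Dict


def specificity_margin(scores: Dict[str, float], target: str) -> float:
    """Target score minus the highest off-target score.

    A large positive margin suggests the sequence is specific to the target;
    a margin near zero means an off-target compartment scores about as well.
    """
    off_target = [v for k, v in scores.items() if k != target]
    if not off_target:
        raise ValueError("need at least one off-target compartment")
    return scores[target] - max(off_target)


# Hypothetical classifier outputs for one designed sequence (made-up numbers).
example = {"stress_granule": 0.91, "p_body": 0.88, "nucleolus": 0.12}
margin = specificity_margin(example, "stress_granule")
print(f"margin = {margin:.2f}")  # prints "margin = 0.03"
```

Here the target score is high, but the margin is only 0.03, which is the kind of pattern tweet 6/7 describes for stress granule versus P-body.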

Exactly - biology is a fundamentally different domain from text, and scaling laws do not apply cleanly. ~All the tasks you want an LLM to do are contained in the text data itself. For biology, NONE of the tasks you want the model to do are contained in the sequence data itself.
multimodali retweeted
"The Bitter Lesson has fully arrived in sequence biology and protein structure. Evo 2, AlphaFold 2 and 3, ProGen3, RFdiffusion". This sentence has some issues IMO. 1/
multimodali retweeted
Excited to share our new preprint: “Computational design of membrane fusion proteins” Huge thanks to all collaborators, co-authors, @KingLabIPD, and everyone at @UWproteindesign who contributed to this work. Preprint: biorxiv.org/content/10.64898…

multimodali retweeted
Hiring one intern in the next 48 hours.
Requirements:
- Use agents
- Currently in university
- Have coded with GPUs
- Can start now with free time over the next week
Science, computing infra, or cloud is a plus. Will involve biology. Apply by replying with your GitHub username
multimodali retweeted
i think we are all wondering what goes through the minds of the admissions committee period
I know this topic’s been discussed before but I really wonder what goes through the minds of the admissions committee when they see kids like this
multimodali retweeted
Our paper STRIDE was accepted to ICML 2026! We post-train LLMs to optimize proteins & molecules by emitting a chain-of-thought of atomic edits (INSERT / DELETE / REPLACE). Levenshtein-shortest-path SFT plus GRPO-style RL. Boosts protein optimization success from 42% → 89%.
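For readers unfamiliar with a Levenshtein shortest path, the sketch below recovers a minimal INSERT / DELETE / REPLACE script between two sequences using the textbook dynamic program with a backtrace. It is not STRIDE's implementation, which the tweet does not describe; `edit_script`, `apply_edits`, and the example sequences are hypothetical.

```python
from typing import List, Tuple

Edit = Tuple[str, int, str]  # (operation, position in source, residue)


def edit_script(src: str, dst: str) -> List[Edit]:
    """Return a minimal edit list turning src into dst, ordered right to left."""
    n, m = len(src), len(dst)
    # dp[i][j] = Levenshtein distance between src[:i] and dst[:j]
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        dp[i][0] = i
    for j in range(m + 1):
        dp[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if src[i - 1] == dst[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1, dp[i][j - 1] + 1, dp[i - 1][j - 1] + cost)

    # Walk back from (n, m) to (0, 0), recording one edit per unit of distance.
    ops: List[Edit] = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and src[i - 1] == dst[j - 1] and dp[i][j] == dp[i - 1][j - 1]:
            i, j = i - 1, j - 1  # residues match, no edit needed
        elif i > 0 and j > 0 and dp[i][j] == dp[i - 1][j - 1] + 1:
            ops.append(("REPLACE", i - 1, dst[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and dp[i][j] == dp[i - 1][j] + 1:
            ops.append(("DELETE", i - 1, src[i - 1]))
            i -= 1
        else:
            ops.append(("INSERT", i, dst[j - 1]))
            j -= 1
    return ops


def apply_edits(src: str, ops: List[Edit]) -> str:
    """Apply edits in the right-to-left order produced by edit_script."""
    seq = list(src)
    for op, pos, residue in ops:
        if op == "REPLACE":
            seq[pos] = residue
        elif op == "DELETE":
            del seq[pos]
        else:  # INSERT
            seq.insert(pos, residue)
    return "".join(seq)


ops = edit_script("MKTAYIAK", "MKSAYIK")
assert apply_edits("MKTAYIAK", ops) == "MKSAYIK"
print(len(ops), ops)
```

Because the edits are emitted from the right end of the sequence toward the left, applying them in order never shifts the position of an edit that has yet to be applied.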
multimodali retweeted
new preprint alert! tl;dr we made a global tokenizer for proteins
multimodali retweeted
I am happy to share a review I recently wrote on the design of peptide binders. It gives an overview of experimentally validated tools and discusses why peptide design is more difficult than the design of classical protein binders. chimia.ch/chimia/article/vie…
late to the party, but let me know if you'll be here!
Always a good time when there's something Codex can't figure out and you get to be useful ;))
multimodali retweeted
A couple of months ago, I announced that I was partway through implementing a simple, readable AlphaFold2 in pure PyTorch, inspired by @karpathy's minGPT. Today, I'm happy to share minAlphaFold2 - the completion of that project. Repo link: github.com/ChrisHayduk/minAl…
RL finetuning for protein binder design is being hailed as the next obvious step. But I ran the numbers, and something uncomfortable is happening under the hood. 🧵
The pattern: collapse severity tracks PDB representation of the target. Heavily represented → already collapsed, RL changes little. Underrepresented → base model has real diversity, RL destroys it. 6/6
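A simple way to check for this kind of collapse is to compare sequence diversity before and after RL. The sketch below uses mean pairwise mismatch fraction on equal-length sequences. This is an assumption for illustration: the thread does not state which diversity metric was used, real binder designs usually vary in length (which would call for an alignment-based or edit-distance metric instead), and the toy sequences are made up.

```python
from itertools import combinations
from typing import Sequence


def mean_pairwise_distance(seqs: Sequence[str]) -> float:
    """Mean fraction of mismatched positions over all pairs of sequences.

    Assumes all sequences have the same length. Returns 0.0 when every
    sequence is identical and 1.0 when every pair differs at every position.
    """
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if length == 0 or any(len(s) != length for s in seqs):
        raise ValueError("sequences must be non-empty and of equal length")
    pairs = list(combinations(seqs, 2))
    total = sum(sum(a != b for a, b in zip(x, y)) / length for x, y in pairs)
    return total / len(pairs)


# Toy samples (hypothetical): a diverse base model vs. a collapsed tuned model.
base_samples = ["ACDE", "FGHI", "KLMN"]
tuned_samples = ["ACDE", "ACDE", "ACDF"]

base_div = mean_pairwise_distance(base_samples)
tuned_div = mean_pairwise_distance(tuned_samples)
print(f"base = {base_div:.3f}, tuned = {tuned_div:.3f}")
```

Computing this per target, for the base and RL-tuned models, would show whether the drop in diversity is larger for targets that are underrepresented in the PDB.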
link to the full article - auralie.substack.com/p/takin…
