What could Alphafold 4 look like? (Sergey Ovchinnikov, Ep #3)
2 hours listening time
(links below)
To those in the (machine-learning for protein design) space, Dr. Sergey Ovchinnikov (@sokrypton) is a very, very well-recognized name.
A recent MIT professor (circa early 2024), he has played a part in a staggering number of recent great papers in the field: ColabFold, RFDiffusion, Bindcraft, automated design of soluble proxies of membrane proteins, elucidating what protein language models are learning, conformational sampling via Alphafold2, and many more. Of course, all these papers were group efforts, but Sergey's name comes up astonishingly frequently!
And even beyond the research that has come from his lab in the last few years, the co-evolution work he did during his PhD/fellowship also laid some of the groundwork for the original Alphafold paper, being cited twice in it.
This is a two-hour conversation with him, in which I ask every question I could think of. We talk about his own journey into biology research, an issue he has with Alphafold3, what Alphafold4-and-beyond models may look like, what research he’d want to spend a hundred million dollars on, and lots more.
Topics/institutions we discuss:
@arcinstitute's Evo models,
@HWaymentSteele's work,
@IsomorphicLabs's AF2/AF3, and
@EvoscaleAI's ESM models
Also, extremely grateful to Asimov Press (@asimovpress) for helping fund the travel and studio time required for this episode! They are a non-profit publisher dedicated to thoughtful writing on biology and metascience, such as articles on synthetic blood and interviews with plant geneticists. I myself have published with them twice! I highly recommend checking out their essays at asimov.press, or reaching out to editors@asimov.com if you’re interested in contributing.
Timestamps:
[00:00:00] Highlight clips
[00:01:10] Introduction: Sergey's background and how he got into the field
[00:18:14] Is conservation all you need?
[00:23:26] Ambiguous vs non-ambiguous regions in proteins
[00:24:59] What will AlphaFold 4/5/6 look like?
[00:36:19] Diffusion vs. inversion for protein design
[00:44:52] A problem with Alphafold3
[00:53:41] MSA vs. single sequence models
[01:06:52] How Sergey picks research problems
[01:21:06] What are DNA models like Evo learning?
[01:29:11] The problem with train/test splits in biology
[01:49:07] What Sergey would do with $100 million