deep learning research bio applications

Joined December 2021
Adding to @owlposting's remark on the capabilities of Evo 2: There's a great paper by @gagneurlab showing that DNA LMs implicitly learn Watson-Crick base pairing from tRNA DNA. biorxiv.org/content/10.1101/…
20 Feb 2025
A socratic dialogue over the utility of DNA language models (Part 1 of 2)
here's the link: owlposting.com/p/a-socratic-…
and here's a longpost of why i wrote this:
i think the effort that went into Evo 2 is very cool and it's clearly a very comprehensive paper, but the excitement over it made me realize that i didn't understand a more basic concept: what's the point of a DNA language model? it felt like all the instinctive 𝕏 takes i read about them were just...wrong at worst, and overly optimistic at best.
im sure a Real Genomics person would instinctively understand the utility of such a type of model. but i do not! this is made worse by everyone i know irl agreeing that they too dont really get the point of models like these.
this essay is an attempt to rectify my own understanding and hopefully help others too. i interleave my own instinctive questions with the answers i stumbled across as i researched more. unfortunately, i have many dumb questions, but hopefully some smart ones too.
part 1 is specifically focused on variant pathogenicity prediction using these models.
i should note that this essay is not about Evo 2 specifically. Evo 2 is referred to heavily, specifically their pathogenic variant discovery results, but i do not spend much time on the data/model/etc results. it is intended to be more broad than that.
When one of two base-paired nucleotides is mutated, the DNA LM adjusts its predicted probability for the partner nucleotide according to Watson-Crick pairing. Of course, MSA co-variation analysis detects the same signal, but it requires explicit analysis and more effort, so the DNA LM has some benefit.
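A minimal sketch of that mutate-and-compare check, under loud assumptions: `toy_masked_probs` is a hypothetical hard-coded stand-in for a DNA LM (not the paper's model or any library API), and the hairpin sequence and `PAIRS` map are made up for illustration. It only shows the shape of the analysis: mutate position i, then see how much the predicted distribution at position j moves.

```python
# Toy illustration only: the "model" is a hypothetical stand-in that has
# Watson-Crick pairing hard-coded; a real DNA LM would be queried instead.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}
BASES = "ACGT"

# Assumed (made-up) hairpin: the stem positions pair 0-10, 1-9, 2-8.
SEQ = "GGCATTTTGCC"
PAIRS = {0: 10, 10: 0, 1: 9, 9: 1, 2: 8, 8: 2}


def toy_masked_probs(seq, pos):
    """Return a probability for each base at the masked position `pos`.

    Paired positions favour the complement of their partner;
    unpaired positions get a uniform distribution.
    """
    partner = PAIRS.get(pos)
    if partner is None:
        return {b: 0.25 for b in BASES}
    favoured = COMPLEMENT[seq[partner]]
    return {b: (0.85 if b == favoured else 0.05) for b in BASES}


def dependency(seq, i, j, new_base, predict=toy_masked_probs):
    """Mutate position i to `new_base` and measure the largest change
    in the predicted base probabilities at position j."""
    before = predict(seq, j)
    mutated = seq[:i] + new_base + seq[i + 1:]
    after = predict(mutated, j)
    shift = max(abs(after[b] - before[b]) for b in BASES)
    return shift, before, after


if __name__ == "__main__":
    shift, before, after = dependency(SEQ, 2, 8, "A")
    print("paired position 8:", round(shift, 2), before, after)
    shift_unpaired, _, _ = dependency(SEQ, 2, 5, "A")
    print("unpaired position 5:", round(shift_unpaired, 2))
```

With a real model, `predict` would be replaced by a call that masks position j and returns the model's output distribution; a large shift at the partner position and a near-zero shift elsewhere is the kind of signal the post describes.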

Really awesome paper from @gagneurlab on a new, clever interpretation approach showing that local DNA language models (trained on pre-defined functional elements across species) are capable of learning regulatory & structural syntax. biorxiv.org/content/10.1101/…
PLMs are great at mutation pathogenicity prediction, but how about functional effects like enzyme activity? --> PLMs don't work for functional effects, but can be fine-tuned with an inductive bias regression head to perform better biorxiv.org/content/10.1101/… Our findings - 1/n:
-> New pre-training objectives with deeper biological insights. Sequence & structure were the first frontier. But biology is alive and interacts dynamically, and modelling that is the next frontier imo. Paper relating to that, by @francescazfl & @KevinKaichuang: biorxiv.org/content/10.1101/…
And a big thank you to @braegelmannlab for the thoughtful discussions & review of this work! Also thanks to @GoogleColab for providing the free (yet old) GPUs I used for this project😅
#CTNNB1 mutations are an intriguing prognostic & predictive factor in #NSCLC! (detailed paper in progress) In the meantime, I'm thankful for the Young Investigator Award from @dgho_eV recognising this important research:)
To what extent can we integrate o1-like CoT "reasoning" and test-time compute into PLMs/ESM3? @amelie_iska @alexrives I'd guess it would likewise improve performance, e.g. for conditioned generation.