Scientist, ML at Tessera | Algorithms for Gene Writing™ | PhD from Salis Lab at Penn State | threads/X/bsky @bioalgorithmist | Opinions are Mine Only

Joined April 2017
Pinned Tweet
Very happy to release our latest paper from @hsalis Lab in collaboration with @klavins Lab at UW on "Automated design of thousands of nonrepetitive parts for engineering stable genetic systems", now published in Nature Biotechnology! 1/18 nature.com/articles/s41587-0…

Ayaan Hossain retweeted
Feb 12
Human coders know they lost, but they keep fighting with windmills because they aren't ready to accept it.
Ayaan Hossain retweeted
Our work developing a parts list of promoters and gRNA scaffolds for mammalian genome engineering and molecular recording is now out @NatureBiotech
A parts list of promoters and gRNA scaffolds for mammalian genome engineering and molecular recording go.nature.com/49eTPCu
Ayaan Hossain retweeted
🧠 Why do smart scientists feel stupid when reading papers? Because nobody teaches you HOW to read them efficiently. This 3-pass system will change how you approach every paper: 🧵
Ayaan Hossain retweeted
26 May 2025
🚨 New preprint 🚨 We introduce Generative Distribution Embeddings (GDEs) — a framework for learning representations of distributions, not just datapoints. GDEs enable multiscale modeling and come with elegant statistical theory and some miraculous geometric results! 🧵
Ayaan Hossain retweeted
New lab preprint! 🚀 Modeling complex data distributions is tough. We designed GDEs, a new framework that tackles this head-on! GDEs generalize across text, images & MANY bio apps (think virtual cells, spatial bio, viral genome tracking). Thread 👇
Ayaan Hossain retweeted
If you have a solid strategy and a small amount of compute, you can go pretty far. If you have huge clusters of GPUs and no strategy, your only achievement will be burning capital.
Ayaan Hossain retweeted
13 Nov 2024
This is a great paper from the @hsalis lab. - Measure the decay rates of 50,000 mRNAs in bacteria. - Use biophysical models and ML to build models of mRNA stability. - Profit. And a good reminder of what's possible when one turns a biological problem into a sequencing problem!
Ayaan Hossain retweeted
I am pleased to announce our latest publication ‘Predicting synthetic mRNA stability using massively parallel kinetic measurements, biophysical modeling, and machine learning’ in @NatureComms
We applied rational learn-by-design 🔢 methods to create a maximally informative library 📚, coupled that with high throughput, barcoded, massively parallel reporter assays to decrypt major design rules using Gradient Boosted Trees 🌳!
Congratulations to co-authors @grace_vezeau and massive thanks to Prof. @hsalis for making us part of this exciting project! 🙏
Ayaan Hossain retweeted
Thanks for the highlight @doescience! Read more about our research: nature.com/articles/s41586-0… biorxiv.org/content/10.1101/…
Engineered bacteria are great for making biofuels and other bioproducts, but you don’t want those changes escaping to the wild. Scientists @Harvard & @WUSTLmed engineered the E. coli genome to keep useful edits from spreading in an uncontrolled way: energy.gov/science/ber/artic…
Ayaan Hossain retweeted
In our newest preprint, • we explore the effects of synonymous genome recoding, and • construct & troubleshoot a synthetic 57-codon E. coli genome using multi-omics, editing, and laboratory evolution. 1/n
Ayaan Hossain retweeted
After almost two years of optimization, SynOMICS's plasmid system is a fusion of @dbikard @MarraffiniLab's Cas9 & the pORTMAGE plasmids & ELSA/nonrepetitive parts by @hsalis @bioalgorithmist Alex Reis & co. I highly recommend using multiplexed sgRNAs from nature.com/articles/s41587-0…

Please RT: If an old, unmaintained piece of software is prohibitively slow and is the only one of its kind, do you skip benchmarking your new tool against it? How should one properly address reviewers and frame this issue in the paper?
Ayaan Hossain retweeted
If you are using Language Models to predict gene expression, you had better compare with the latest state-of-the-art biophysical or ML models
28 Feb 2024
Replying to @anshulkundaje
Our bacterial Promoter Calculator model has 348 parameters and predicts cont-val TX rates with R^2=0.80 (other factors constant). Our bacterial RBS Calculator has 12 free parameters and predicts cont-val TL rates with R^2=0.75. The bar is pretty high, but they don’t compare.
Ayaan Hossain retweeted
23 Feb 2024
Do you need a billion parameters and millions of sequences for #proteindesign? Maybe not! See our paper: shorturl.at/novMO We show that interpretable, non-autoregressive structure-based protein design can work! Original thread: shorturl.at/cdgCP

Ayaan Hossain retweeted
Today we announced FDA clearance of our IND application for SENTI-202, a potential first-in-class, logic-gated treatment for acute myeloid leukemia. Learn more: bit.ly/3v4IMJY $SNTI #AML #CARNK