Synthetic Biology & Synthetic Genomics @ Imperial College London and the Sanger Institute. Bilingual in English and DNA. D-/L-

Joined January 2011
1,903 Photos and videos
Pinned Tweet
I asked Gemini to make an advert to hire a new postdoc for my lab. I gave it our recent group photo at Darwin's House and all the job details. It made some terrible ones, but I like this one best.
5
11
52
4,872
Didn’t expect Fable to be the starting gun for the enshittification of AI, but here it is.
1
1
12
1,679
Tom Ellis retweeted
Biologists win again
4
37
468
16,573
Tom Ellis retweeted
The paper has many more insights that I hope will help make plant transformation easier and more predictable. biorxiv.org/content/10.64898… And we’ll make the plasmids available through Addgene ASAP! Huge thanks to everyone involved in the @MoKhalilLab and Mary Gehring labs!! 16/16
1
3
8
1,053
Tom Ellis retweeted
New preprint out 🌱 We present a new T-DNA vector system for Arabidopsis that supports clean, genomically mapped, single-copy T-DNA insertion with predictable cell-type/conditional gene expression. @MoKhalilLab Gehring labs biorxiv.org/content/10.64898… 🧵1/16
2
29
69
17,568
Tom Ellis retweeted
I’ve come to the view that AI in biology will actually boost measurement, e.g. sequencing. AI in biology means more constructs, more edits, more synthetic genomes, more single cells, more hypothetical proteins to express and characterize, more validation, more perturbation of cells, best measured by deep sequencing of transcripts/proteins etc.
2
4
16
2,017
5 days left now to apply for the postdoc opportunity in my lab at Imperial in London 🇬🇧 - there’s a chance that we can hire 2 people into the team on this synthetic biology and materials theme. Application link is here - imperial.ac.uk/jobs/search-j…
22
41
4,662
Tom Ellis retweeted
Evaluating DNA Function Understanding in Genomic Language Models Using Evolutionarily Implausible Sequences
1. The paper introduces Nullsettes, a benchmark designed to test whether genomic language models (gLMs) truly understand gene-expression function, rather than just matching evolutionary patterns seen in natural genomes.
2. Nullsettes creates in silico loss-of-function (LOF) mutations in synthetic expression cassettes by translocating key elements (e.g., promoter, start codon, CDS, stop codon, terminator; plus RBS in prokaryotes) so the canonical 5′→3′ regulatory architecture is broken while keeping all parts present.
3. The benchmark is intentionally “evolutionarily implausible”: nonmutant cassettes are functional yet low-likelihood under gLMs because they combine elements from distant species (e.g., GFP CDS with heterologous regulatory parts) and often use random but functional promoters from MPRA libraries.
4. Evaluation is zero-shot: for each cassette, a model scores the nonmutant and its Nullsettes mutants using sequence log-likelihood (or a model-specific pseudo-likelihood proxy), and a mutation type is considered detected if mutants show a statistically significant likelihood drop vs the nonmutant (paired permutation test with multiple-testing correction).
5. Across 14 state-of-the-art gLMs (35 variants), most models fail to reliably detect strong LOF mutations; 11/14 models drop below 50% success rate on at least one dataset, indicating poor generalization to engineered constructs that deviate from natural sequence statistics.
6. A key diagnostic finding is that prediction accuracy depends strongly on the model likelihood of the original (nonmutant) cassette: as nonmutant log-likelihood decreases, LOF detection collapses across nearly all models, consistent with reliance on evolutionary priors/pattern matching rather than mechanistic reasoning about transcription/translation.
7. Performance also worsens when promoters come from random-sequence libraries vs naturally derived promoters (significant paired drop across models), reinforcing that many gLMs behave sensibly mainly when evolutionary plausibility is a usable proxy for function.
8. Counterintuitively, more disruptive mutations (those breaking more steps of gene expression) are not easier for models; instead, success declines as disruption severity increases, suggesting models are not robustly tracking regulatory logic even when the functional failure should be obvious mechanistically.
9. Model comparisons suggest scaling is not the main lever: GENERanno-0.5B matches Evo2-7B despite far fewer parameters and less pretraining data, implying curated, function-relevant pretraining data can matter more than sheer size for functional generalization; AlphaGenome (supervised sequence-to-expression) is competitive on transcription-disrupting subsets but cannot be evaluated on translation-only disruptions.
💻Code: github.com/cellethology/GLM-…
📜Paper: doi.org/10.1021/acssynbio.6c…
#Genomics #SyntheticBiology #Bioinformatics #MachineLearning #FoundationModels #DNA #MPRA #RegulatoryGenomics #Benchmarking #AI4Science
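For readers who want to see what the detection rule in point 4 of the thread above might look like in practice, here is a minimal Python sketch. It is not the paper's code. The sign-flip form of the paired permutation test, the Benjamini-Hochberg correction, the function names, the mutation-type labels, and all log-likelihood values are assumptions made for illustration; the thread only says "paired permutation test with multiple-testing correction".

```python
import random
from statistics import mean


def paired_sign_flip_pvalue(nonmutant_ll, mutant_ll, n_perm=10000, seed=0):
    """One-sided paired permutation test for a drop in mutant log-likelihood."""
    # Positive difference = the mutant scored lower than its nonmutant.
    diffs = [n - m for n, m in zip(nonmutant_ll, mutant_ll)]
    observed = mean(diffs)
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_perm):
        # Under the null the two labels in a pair are exchangeable,
        # so each difference keeps or flips its sign with probability 0.5.
        permuted = mean(d if rng.random() < 0.5 else -d for d in diffs)
        if permuted >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (hits + 1) / (n_perm + 1)


def benjamini_hochberg(pvalues, alpha=0.05):
    """Return one boolean per p-value: True where the null is rejected."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= alpha * rank / m:
            cutoff = rank  # largest rank that passes its threshold
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff:
            rejected[idx] = True
    return rejected


# Made-up log-likelihoods for ten cassettes (illustrative only).
nonmutant = [-100.0, -120.0, -95.0, -110.0, -130.0,
             -105.0, -98.0, -115.0, -125.0, -102.0]
# Hypothetical mutation type A: every mutant scores 5 units lower.
mutant_a = [x - 5.0 for x in nonmutant]
# Hypothetical mutation type B: small changes with no consistent direction.
shifts = [0.5, -0.5, 0.3, -0.3, 0.2, -0.2, 0.1, -0.1, 0.4, -0.4]
mutant_b = [x + s for x, s in zip(nonmutant, shifts)]

p_values = [
    paired_sign_flip_pvalue(nonmutant, mutant_a),
    paired_sign_flip_pvalue(nonmutant, mutant_b),
]
detected = benjamini_hochberg(p_values)
print(dict(zip(["type_A", "type_B"], detected)))
```

In this toy setup only the mutation type with a consistent likelihood drop counts as detected. A real evaluation would take the log-likelihoods from a gLM rather than from hard-coded lists.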
1
9
27
3,454
Tom Ellis retweeted
The new understanding of childhood allergies is a really positive way forward towards reducing the number of children who suffer from allergic reactions. I was wondering how this could be optimised to make it easier for parents to cover all potentially allergenic foods.
Jun 8
After the drastic change in guidance to no longer keep allergenic foods away from babies until 1 to 3 years of age and instead introduce them by 6 months of age, the prevalence of egg allergy among children fell by more than 17% in a new study published Monday in the journal JAMA Pediatrics. cnn.it/4v50rM0
4
7
11
3,182
Tom Ellis retweeted
Incredible headline. Hats off to everyone who approved it.
Version of AI tool too powerful for public released to public bbc.in/4xfGSlq
12
396
10,453
799,584
Tom Ellis retweeted
The superior approach is to name them after Pokémon
Hundreds of scientists who study cancer and aging have made an easily avoidable but significant mistake, deploying the wrong antibody to test for a key protein, according to a researcher who exposes errors in the biomedical literature. Instead of antibodies that recognize p16INK4a, a tumor suppressing protein that may also promote aging, these researchers used antibodies that tag the similarly named protein p16-ARC, which helps shape the cell’s molecular skeleton. Learn more: scim.ag/4uZGvdl
4
31
3,205
Replying to @mgdurrant
How does one become a trusted partner, and what are Anthropic's current views on disproportionate access to such models by only the most entrenched biopharmas, which could lead to a monopoly of pace and trigger antitrust concerns?
1
3
71
3,459
Tom Ellis retweeted
What a great day and place to have a conference! Thanks to the >227 attendees for participating in the #SynBYSS Barcelona conference! A snapshot taken while people were still discussing research, not looking at the camera, showing the complexity and diversity of engineering biology. @marcguellc
1
4
28
2,393
Tom Ellis retweeted
alternative protein/acc 🔥
We're launching a $10 million RFP for R&D on one of the biggest barriers to alternative protein adoption: taste. 🧵
1
2
27
2,330
Is this an $8.5B valuation for a 3-year-old buzzword company, or does Lila actually have an exciting product?
Replying to @peterottsjo
Lila Sciences may raise about $2 billion at an $8.5 billion valuation. That number looks incredible if you treat Lila as a normal AI biotech. It makes a little more sense (maybe!) if you treat it as a bet on a new R&D operating system. Lila’s pitch is AI Science Factories: AI systems connected to automated labs that generate hypotheses, design experiments, run them, learn from the results, and repeat. In other words: model, robot lab, experiment, data, repeat. The whole shebang. Or “scientific superintelligence” and “operating system for science," as Lila calls it. (3/7)
7
3
74
25,798
Cool new ELMs work from @obermeyergroup >>> Catalytic Bacterial Nanocellulose Composite That Captures and Degrades PET Microplastics biorxiv.org/content/10.64898…

1
1
7
1,283
“business casual eugenics”
What are we afraid of? Some possibilities (1) genetic determinism (2) mutant babies (3) business casual eugenics (i.e. genetic enhancements marketed by people in puffer vests.) (4) unrealistic parental expectations
1
2
10
3,007
Tom Ellis retweeted
Excited to share our new preprint "Genetic code expansion enables plant-directed control of bacterial activity", a fantastic collaboration with @kunjapur lab! biorxiv.org/content/10.64898…

2
19
51
4,817
Tom Ellis retweeted
good time to mention that the system paper for SecureDNA that i worked on was recently published! it uses distributed OPRFs for cheap (it's free to use), secure, attack-resistant, cryptographically-blinded DNA synthesis screening. it's available right now! securedna.org
Jun 4
SITUATION DETECTED: Sam Altman, Dario Amodei, and Demis Hassabis have signed a joint open letter calling on Congress to mandate screening of synthetic nucleic acid orders, citing AI’s rapidly improving ability to assist with biological research as an urgent biosecurity risk.
4
10
87
12,119
Huge new work from Japan - radically engineered E. coli genome now half the size of the usual genome. 🧬 🦠 Impressive work… minimising genomes is on the march!
Generating E. coli 0.5 controlled by a half-sized genome biorxiv.org/content/10.64898…
3
18
128
13,056