Bashor Lab in Houston, Texas | Mammalian Synthetic Biology et al.

Joined December 2022
Pinned Tweet
We are pleased to share a paper from our lab out in this week’s issue of @Nature, where we show that HT ML can dramatically speed up the synbio DBTL cycles, profiling gene circuit design spaces at unprecedented scale: nature.com/articles/s41586-0… (1/16)
Bashor Lab retweeted
No scaling laws for single-cell foundation models: when bigger atlases stop teaching the model anything
In language and vision, the recipe has been simple: more data, bigger models, better performance. Single-cell biology borrowed that playbook. Foundation models for transcriptomics jumped from 1 million cells to atlases of over 100 million, on the assumption that scale would unlock the same gains.
Alan DenAdel and coauthors put that assumption to the test, and the result is sobering. Working from a 22.2-million-cell corpus, they pretrained 400 models across five architectures (from PCA and a variational autoencoder up to the Geneformer transformer) and ran 6,400 evaluation experiments. They varied not just dataset size (1% to 75%) but also diversity, using cell-type re-weighting and geometric sketching to deliberately enrich rare cell types and transcriptional states.
The finding: performance saturates almost immediately. On cell-type classification, batch integration, and perturbation prediction, most models hit their ceiling at roughly 1% of the corpus, about 200,000 cells. Beyond that, adding millions more cells changed essentially nothing. More diversity didn't help. Even spiking in genome-scale Perturb-seq data, to give the models perturbed phenotypes rather than just healthy ones, failed to move the needle. Larger models did score better overall, but they too plateaued early on data.
Two points stood out. Simple baselines (PCA, logistic regression) often matched or beat the transformers. And the strongest model, SCimilarity, won not because of size but because its contrastive training objective is aligned with the downstream task. For single-cell data, what you train on and how you frame the objective matters far more than how much you collect.
This reframes a quiet but expensive habit. In drug discovery, biotech, and any pipeline leaning on cell atlases, the instinct to keep scaling pretraining corpora may be burning compute for no return. The real leverage sits elsewhere: curating high-quality, task-relevant data and matching the training objective to the actual question you're trying to answer.
Paper: DenAdel et al., journal license | doi.org/10.1038/s41592-026-0…
Bashor Lab retweeted
Replying to @arjunrajlab
Nothing to see here.
Bashor Lab retweeted
Excited to share the first pre-print from our lab!! Check it out here! biorxiv.org/content/10.64898… We found that many RNA-binding proteins understood to regulate RNA processing can also function like transcription factors and cofactors to directly regulate transcription.
Bashor Lab retweeted
Exciting breakthrough technology from the lab, now live in @CellCellPress ! Instead of cutting the genome where proteins bind (e.g., Cut&Tag), D&D-seq scars the DNA with a deaminase, allowing single cell genome mapping of TFs and chromatin remodellers!
Bashor Lab retweeted
The preprint from my work @MoKhalilLab and @DunlopLab at BU is out on bioRxiv! As new tools come online to engineer multicellularity, we asked: how does sticking cells together into larger groups affect their fitness and function?
Bashor Lab retweeted
Lovely study by the team, showing how TF specificity emerges from weak and multivalent interactions! (a property we've been obsessed with modeling, designing and engineering in cells 😍)
Out now in Science! Our new study challenges long-standing assumptions about transcription factor specificity in eukaryotes. Novel single-molecule measurements of TF behavior in living cells reveal an independence of locus-specific binding from DNA sequence recognition.🧵 science.org/doi/10.1126/scie…
Bashor Lab retweeted
Characterizing AI-designed proteins requires quantitative biochemistry at massive scale. Enter Amplicon/Protein Bead Display (APB-Display), a fully in vitro platform that quantifies Kd's for >100,000 variants in <3 days (preprint link below!) @Stanford_ChEMH @czbiohub (1/n)
Programmable DNA integration with New-to-Nature tools using Computational Protein Design biorxiv.org/content/10.64898…

Bashor Lab retweeted
Can someone start a journal called “Cell Atlases” so that the rest of the journals can go back to publishing interesting things?
Bashor Lab retweeted
🚨 1/5 Check out the 2nd preprint from our lab on how IRES-mediated translation of synthetic circRNAs is employed in cells and in cell-free translation extracts, a highly collaborative effort with Immagina & the labs of Anders Lund and CK Chen.
Bashor Lab retweeted
Now available as its peer-reviewed version in Nature: nature.com/articles/s41586-0…
Thrilled to share our new paper where we introduce a multiplexed hydrogen–deuterium exchange MS (mHDX‑MS) method that can measure hundreds of protein domains’ conformational energy landscapes—all in a single experiment! biorxiv.org/content/10.1101/…
Bashor Lab retweeted
In this project with Tyler Dao, Aviv Regev, Alex Shalek, and others (@shaleklab @broadinstitute @MIT @ragoninstitute), we combined genetics with molecular biology, computational protein modeling, and imaging to investigate T cell receptor signal branching. (2/4)
Bashor Lab retweeted
Excited to share our work on ErbB receptors published today @CellCellPress! Using multicolor, photostable UCNPs, we perform long-term (>15 min) single-particle tracking of EGFR, HER2, and HER3, enabling direct visualization of dimerization in live cells. cell.com/cell/fulltext/S0092…
Bashor Lab retweeted
The size of new DNA sequences that can be integrated into the human genome is a foundational constraint for engineering and enhancing human cells. In a new collaborative study, we’ve now almost doubled the maximum size of DNA sequences that can be efficiently inserted into primary human cells. biorxiv.org/content/10.64898…

Bashor Lab retweeted
New preprint alert - 5 years in the making! Using high-throughput microfluidic enzyme kinetics, we profiled 190 clinical variants of SHP2, a phosphatase linked to developmental disorders & cancer (1/8) biorxiv.org/content/10.64898… @Stanford_ChEMH, @czbiohub, @bioe_stanford
Bashor Lab retweeted
An autonomous system for multi-objective continuous evolution at scale biorxiv.org/content/10.64898… #biorxiv_bioeng

Bashor Lab retweeted
Finding new medicines is getting more and more expensive, and AI won't help much unless we can generate physiological data at scale. In our new preprint, @GordianBio extends the progress of the functional genomics community to run pooled in vivo screens at scale, in a way that answers questions about physiology and therapeutic potential. We show screens in mice and horses, fibrotic and degenerative disease, with a framework for physiological predictions validated in human ex vivo tissues. Very proud of @v_sontake, @vkartha88, Neety and the rest of the team. Tweetorial follows: