Self-supervised learning for a gene program-centric view of cell states
1. Tripso is a self-supervised transformer framework that represents each cell with multiple gene program (GP)-specific embeddings (rather than a single entangled cell embedding), enabling program-resolved comparisons across development, disease, and experimental systems.
2. Architecture in brief: a gene encoder learns contextualized gene embeddings within each cell; genes are routed into dedicated GP-specific transformer blocks (each with a CLS token summarizing the program); a global cell block then attends over the GP embeddings and is trained to reconstruct counts with a negative binomial loss.
3. Beyond curated programs, Tripso includes a data-driven GP discovery mode: attention patterns among genes are used to (i) rank genes relevant to specific states and (ii) cluster genes with similar attention profiles into novel, context-specific programs.
4. Interpretability is built in at two levels: gene-to-GP importance, computed as the cosine similarity between gene embeddings and the GP CLS token; and GP-to-cell importance, computed by systematically ablating each GP embedding and measuring the induced change in the cell representation.
5. Benchmarks on a large Perturb-seq resource (623k cells; TNFα/TGFβ stimulation, 98 perturbations) show that Tripso GP embeddings separate pathway stimulations and stronger genetic perturbations better than Spectra (NMF) and Expimap (interpretable VAE), and outperform non-ML baselines (gene-set scoring; concatenated expression of GP genes). Tripso also improves robustness to batch effects compared with raw expression in GP space.
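The two interpretability scores in point 4 can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes that ablation means zeroing one GP embedding, that the change in the cell representation is measured as 1 minus cosine similarity, and it uses a hypothetical `toy_cell_block` (mean pooling) in place of the trained global cell block. All function names are illustrative.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between each row of `a` and the vector `b`."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    denom = np.linalg.norm(a, axis=-1) * np.linalg.norm(b)
    return (a @ b) / np.maximum(denom, 1e-12)

def gene_to_gp_importance(gene_embeddings, gp_cls):
    """Score each gene of a program by its similarity to the program's CLS embedding."""
    return cosine_similarity(gene_embeddings, gp_cls)

def gp_to_cell_importance(gp_embeddings, cell_block):
    """Ablate each GP embedding in turn and measure the shift in the cell representation.

    Assumptions (not from the source): ablation = zeroing the embedding,
    shift = 1 - cosine similarity to the un-ablated cell representation.
    """
    gp_embeddings = np.asarray(gp_embeddings, dtype=float)
    reference = cell_block(gp_embeddings)
    scores = np.empty(len(gp_embeddings))
    for k in range(len(gp_embeddings)):
        ablated = gp_embeddings.copy()
        ablated[k] = 0.0  # remove program k from the cell block's input
        shifted = cell_block(ablated)
        scores[k] = 1.0 - cosine_similarity(shifted[None, :], reference)[0]
    return scores

def toy_cell_block(gp_embeddings):
    """Hypothetical stand-in for the trained global cell block: mean pooling."""
    return gp_embeddings.mean(axis=0)
```

With a trained model, `cell_block` would be the model's own forward pass over the GP embeddings; the ablation loop itself stays the same.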
6. In human hematopoiesis across the lifespan (~499k in vivo cells from 98 donors, plus in vitro corpora), Tripso recovers expected lineage programs (e.g., GATA1 in erythropoiesis; RUNX1 in myeloid/megakaryocyte differentiation) and exposes age-specific GP shifts, including elevated pediatric JAK-STAT importance in HSC/MPP populations, with gene-level signals enriched for type I interferon response.
7. Tripso resolves developmental changes in early B-lineage states specifically within the IKZF1 GP embedding: Milo differential abundance in IKZF1 space separates prenatal from postnatal pro-B neighborhoods, while an unrelated control GP (WNT) does not. Gene-level importance suggests a shift from prenatal proliferative/IL7R-linked programs toward postnatal pre-BCR diversification (e.g., IGLL1/VPREB1, DNTT).
8. For in vivo vs in vitro mapping, Tripso supports GP-anchored alignment using unbalanced optimal transport (e.g., in GATA1 GP space it recapitulates a truncated in vitro erythroid trajectory missing terminal erythroblasts, without using prior knowledge of the protocol).
9. Tripso enables actionable perturbation prioritization for HSC culture: focusing on PI3K as an in vivo HSC-distinctive GP, distributional comparisons in PI3K GP space (Sinkhorn divergence) indicate that 3a culture is closest to adult BM LT-HSCs. Gene-level importance within PI3K nominates ER translocon components (SSR1; and SEC61G in a related GP) as higher in less stem-like states.
10. Experimental validation: inhibiting the SEC61 translocon (SEC61-IN-1) increases the frequency of immunophenotypic HSCs (CD34+ CD45RA− CD90+ EPCR+) in UM171 and SR-1 cultures (but not in 3a, consistent with the prioritization setup), illustrating how GP-resolved signals can identify candidates that would be hard to rank by small-effect differential expression alone.
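The distributional comparison in point 9 can be sketched as a debiased Sinkhorn divergence between two sets of cells embedded in one GP space. This is a minimal NumPy illustration, not the authors' pipeline. It assumes uniform cell weights, a squared Euclidean cost, balanced (not unbalanced) transport, and arbitrary values for `eps` and `n_iter`. The helper names, including `rank_conditions`, are hypothetical.

```python
import numpy as np

def _logsumexp(m, axis):
    """Numerically stable log(sum(exp(m))) along one axis."""
    m_max = m.max(axis=axis, keepdims=True)
    return np.squeeze(m_max, axis=axis) + np.log(np.exp(m - m_max).sum(axis=axis))

def entropic_ot(x, y, eps=1.0, n_iter=500):
    """Entropy-regularized OT cost between uniform point clouds (dual value)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Squared Euclidean cost between every pair of cells.
    cost = ((x[:, None, :] - y[None, :, :]) ** 2).sum(axis=-1)
    log_a = np.full(len(x), -np.log(len(x)))  # uniform weights, log scale
    log_b = np.full(len(y), -np.log(len(y)))
    f = np.zeros(len(x))
    g = np.zeros(len(y))
    for _ in range(n_iter):
        # Alternating log-domain Sinkhorn updates of the dual potentials.
        f = -eps * _logsumexp(log_b[None, :] + (g[None, :] - cost) / eps, axis=1)
        g = -eps * _logsumexp(log_a[:, None] + (f[:, None] - cost) / eps, axis=0)
    # With uniform weights the dual objective is the mean of each potential.
    return f.mean() + g.mean()

def sinkhorn_divergence(x, y, eps=1.0, n_iter=500):
    """Debiased divergence: OT(x, y) - 0.5 * OT(x, x) - 0.5 * OT(y, y)."""
    return (entropic_ot(x, y, eps, n_iter)
            - 0.5 * entropic_ot(x, x, eps, n_iter)
            - 0.5 * entropic_ot(y, y, eps, n_iter))

def rank_conditions(reference, conditions, eps=1.0):
    """Sort condition names by increasing divergence from the reference cloud."""
    scores = {name: sinkhorn_divergence(reference, emb, eps)
              for name, emb in conditions.items()}
    return sorted(scores, key=scores.get), scores
```

Here `reference` would hold the GP embeddings of the in vivo population of interest and `conditions` would map each culture condition to the GP embeddings of its cells; the condition listed first is the closest in that GP space. For real data, `eps` should be chosen relative to the scale of the embedding distances.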
11. In inflammatory skin (1.7M cells across 338 biopsies, 14 diseases), Tripso GP discovery (restricted to a spatial panel for direct validation) yields programs with limited one-to-one overlap with PROGENy/MSigDB, capturing novel gene combinations. A lymphoid program (GP23) shows an atopic-dermatitis-selective profile, is elevated in IL13+ TRM cells, and is enriched for inflammatory signaling, metabolic adaptation, and trafficking/turnover genes not well covered by canonical immune annotations.
12. Spatial validation with matched Xenium transcriptomics and spatial proteomics links GP23-high regions to discrete immune-dense niches adjacent to sebaceous glands and inflamed epidermis, frequently co-localizing with high CD45 protein signal and lying close to T cell aggregates; GP23 remains elevated in relapsed AD after treatment withdrawal, consistent with niche-associated TRM persistence.
📜Paper: biorxiv.org/content/10.64898…
#SingleCell #scRNAseq #Transformers #SelfSupervisedLearning #GenePrograms #Interpretability #Hematopoiesis #StemCells #SpatialTranscriptomics #AtopicDermatitis