Single-cell technologies now let us profile entire transcriptomes in individual cells. But how do we make sense of this complexity in a biologically meaningful way? Many methods summarise cells into a single embedding, but this often comes at the cost of interpretability, especially when multiple gene programs are active at once.
We developed Tripso, a self-supervised transformer model that represents cells through multiple gene program-specific embeddings, while also uncovering new programs directly from the data. Instead of collapsing biology into a single vector, Tripso decomposes cell state into multiple representations, each reflecting a different gene program.
We applied Tripso across multiple biological systems.
In human hematopoiesis, spanning development to aging, Tripso identified distinct age-associated program activity, including stronger JAK-STAT signalling in early life and dynamic IKZF1-related changes during B cell maturation.
By comparing in vitro culture conditions with in vivo hematopoietic stem cell states, Tripso suggested that targeting the SEC61 translocon could enhance stem cell maintenance ex vivo, a prediction that we subsequently validated experimentally. In parallel, we identified a previously uncharacterised tissue-resident memory T-cell program associated with atopic dermatitis and mapped it to distinct spatial immune niches.
Together, these results show how modelling cells through gene programs can lead to interpretable and experimentally testable insights. More broadly, this work points toward more interpretable and biologically grounded models of cell state. As single-cell datasets continue to grow, we hope approaches like Tripso will help bridge the gap between data-driven representations and biological insight.
This work wouldn’t have been possible without the contributions of an amazing team. Thank you to co-first authors
@mariemoullet,
@Tomo_Isobe,
@AmirhVahidi,
@CarloLeonardi7, and everyone from
@roserventotormo's Lab,
@HaniffaLab, Nicola Wilson and
@BertieGottgens's Lab, bringing together expertise across
@SCICambridge,
@OpenTargets,
@sangerinstitute and
@Cambridge_Uni.
@mariemoullet is one of the very best PhD students I have ever supervised. She is truly a force of nature, exceptionally resourceful, deeply innovative, and one of the most impressive scientists I have worked with. I am immensely proud of her and all that she has accomplished. As she begins her internship at
@genentech, I have no doubt she will do amazing work there and continue to make her mark.
paper:
biorxiv.org/content/10.64898…
code:
github.com/Lotfollahi-lab/tr…