Bioinformatics, immunogenetics, high-throughput immune repertoire sequencing. Decoding adaptive immunity.

Joined May 2014
Pinned Tweet
N.B. VDJdb is up and running after a planned server upgrade. It can now be accessed at vdjdb.com; the old vdjdb.cdr3.net URL redirects there and should be considered obsolete. If you are using the VDJdb web API, please update accordingly.

Mikhail Shugay retweeted
This is an insane paper and I love it arxiv.org/abs/2605.31514
Mikhail Shugay retweeted
Capturing and Tracking Clonal T-cell Response to Cancer Neoantigens brnw.ch/21x36aN @ChudakovM @antigenomics
Mikhail Shugay retweeted
My first co-corresponding author paper! Excited to contribute to this work on tracking ultra-rare lymphoma cells using TRUST4, which assembled BCRs from SMART-seq2 data with a read length of only 35 bp.
Our lab paper by Ran Xu @Ran47800515 now in print @BloodAdvances. New technology for ultra-rare MRD cell capture for mechanistic discovery: Live-cell Pick-Seq (LiP-Seq): Interrogating ultra-rare mantle cell lymphoma persistent cells after CART19 therapy ashpublications.org/bloodadv…
Mikhail Shugay retweeted
Big progress vs cancer, folks. The kind of event curves from randomized trials that we've not seen before for a couple of the most deadly cancers. Congrats to the oncology research community for getting these trials done. #ASCO26, @ASCO
Mikhail Shugay retweeted
Today we're announcing ESMFold2, an open scientific engine to power prediction, design, and discovery across protein biology. The new model delivers state of the art performance on protein interactions, especially antibodies, a critical modality for therapeutics.
We have designed and validated miniprotein binders and single chain antibodies across five therapeutic targets that are important in cancer and immunology. We are seeing very high success rates, and affinities at levels consistent with therapeutic activity.
We’re also releasing an atlas of 6.8 billion proteins, and 1.1 billion predicted structures.
ESMFold2 is built on a state of the art language model that has been trained on billions of protein sequences. A world model of protein biology emerges through language modeling.
We’ve used the techniques of mechanistic interpretability developed to understand large language models to understand the concepts ESM uses to represent proteins. The model’s representation space has a compositional organization of features across scales, levels of complexity, and abstraction, that reflects and mirrors the understanding of protein biology developed through a century of empirical science. This understanding emerges without prior knowledge, just from language modeling of protein sequences.
Language models are becoming a powerful substrate to understand and program biology. The design of protein interactions is one of the most fundamental problems in biophysics, and has critical implications for the discovery of new medicines. A simple gradient based search with the model was able to discover high-affinity protein binders.
I'm excited by the potential this has to accelerate basic science and the understanding of proteins. And especially for the new avenues it opens up for therapeutic design and medicine.
Mikhail Shugay retweeted
In a Perspective article, scientists summarize the results of recent studies on how immune systems learn how to target previously unseen variants of pathogens and discuss how statistical physics could help address key unanswered research questions. 📃 go.aps.org/4uPmZzO
Mikhail Shugay retweeted
Our comment on the urgent need for decentralized scientific databases, particularly in today’s unstable geopolitics, is now online in @NatureGenet. Overdependence on a single authority can jeopardize global scientific access, resilience & continuity. Link: nature.com/articles/s41588-0…
Mikhail Shugay retweeted
Our consortium paper on deep profiling of individuals of different ethnicities located across continents just came out. I HPP initiative involving many different labs from around the world and an amazing open access resource. Full of cool results. cell.com/cell/fulltext/S0092…
We are excited to announce that our paper "Cancer epitope prediction tools and analysis pipelines in CEDAR" is now published in the Nucleic Acids Research (NAR) Web Server issue! doi.org/10.1093/nar/gkag457 #Immunology #ImmunoOncology #CancerResearch #Bioinformatics
Mikhail Shugay retweeted
Deep peptide recognition profiling decodes TCR specificity and enables disease-associated antigen discovery go.nature.com/3QWMRet
Mikhail Shugay retweeted
A novel metric reveals previously unrecognized distortion in dimensionality reduction of scRNA-Seq data biorxiv.org/content/10.1101/…
Any TCR-T must be both safe and effective, and while rushing for potent TCRs that recognise neoantigens, care should be taken to avoid recognising healthy "self". So, how to properly design a tumour-reactive TCR? McCarthy et al. "Reverse engineering the fatally cross-reactive A3A TCR to decouple potency and specificity" doi.org/10.64898/2026.05.03.…
Mikhail Shugay retweeted
Whole-protein screening and multi-modal profiling of antigen-specific CD4 T cells at single-cell resolution @NatureComms @LabHeath @UW nature.com/articles/s41467-0…
Mikhail Shugay retweeted
Exciting new work from Aaron Bodansky and Joe DeRisi and colleagues on finding antigen-specific B cells and T cells in ROHHAD, a rare pediatric neuroendocrine disorder biorxiv.org/content/10.64898…

Mikhail Shugay retweeted
Machine learning model decodes immune recognition; KIRLinguist predicts KIR-HLA interactions, guiding personalized infection and cancer therapy science.org/doi/10.1126/scia… @ScienceAdvances @OhioState @adesoton @Abdallah_A_OSU
Mikhail Shugay retweeted
@biorxivpreprint High-resolution single-cell atlas of the human B cell compartment and immune microenvironment across tissues biorxiv.org/content/10.64898… @panhammarstrom @karolinskainst 🇸🇪🇨🇳
Mikhail Shugay retweeted
Stress-testing meta-learning for T cell receptor binding
T cell receptors (TCRs) are the molecular fingerprints our immune system uses to recognize peptides on infected or tumor cells. Predicting which TCR binds which peptide matters for neoantigen vaccines, cancer immunotherapy, and diagnostics, but the data are brutally long-tailed: a few peptides have thousands of known binders, most have a handful or none.
PanPep tackled this in 2023 with an elegant idea. Instead of one classifier, train a meta-learner across peptide-specific tasks, then adapt it via majority, few-shot, or zero-shot regimes, with a neural Turing machine for the no-data case. It was the first model in the field to attack the long tail with meta-learning rather than supervised training.
Fei He and coauthors now deliver a careful reusability audit, more nuanced than the original benchmarks. They reproduce the numbers and push further with unseen peptides and TCRs, virtual screening over a 57-million TCR repertoire, two negative sampling strategies, and extensions to TCRα and TCRαβ chains.
PanPep generalizes better than competing tools to unseen peptides with few or no binders, but only when negatives are random draws from the background. With reshuffled negatives built by permuting real binders, performance collapses to near random. The model leans on memorized TCR patterns rather than true peptide-TCR compatibility. On strict unseen-peptide, unseen-TCR pairs every method drops to chance, top 0.1% enrichment stays low, and zero-shot distillation underperforms the meta-learner, a clear case of catastrophic forgetting.
Suggested fixes are interesting: virtual screening as primary evaluation, hybrid negative sampling, and protein foundation models with parameter-efficient fine-tuning instead of task-by-task meta-training.
For immuno-oncology and vaccine design, the message is direct. Meta-learning helps prioritize candidates when only a few binders are known, but balanced-set scores can give a false sense of readiness. Pipelines should run full-repertoire screening with reshuffled negatives and treat unseen-peptide unseen-TCR performance as a hard requirement.
Paper: He et al., Nature Machine Intelligence (2026) | journal license nature.com/articles/s42256-0…
Mikhail Shugay retweeted
Review @AnnualReviews @teichlab @Cambridge_Uni Decoding Human T Cell Immunity with Artificial Intelligence and Single-Cell Genomics annualreviews.org/content/jo…
Mikhail Shugay retweeted
⚠️ If you’re reading this, you’ve been infected* ⚠️ *~95% of the human population has been infected by the Epstein-Barr Virus (EBV). Today in @Nature with @nyeo_sherry, @EMC22381830, @RyanDhindsa @SlavePetrovski, we shed some light on what happens next. nature.com/articles/s41586-0…
Mikhail Shugay retweeted
Data show that ellagic acid (EA) is a novel dietary MR1 ligand capable of inhibiting mucosal-associated invariant T cell activation, indicating orally administered EA as a potential therapeutic for inflammatory diseases of the gastrointestinal tract: ow.ly/aUyb50YNAGF.