director of applied AI @TempusAI; prev: faculty @DanaFarber, group leader @Harvard & phd @eth

Joined January 2016
Pinned Tweet
scRNAseq cell type annotation is notoriously messy. Despite so many algorithms, most researchers still rely on manual annotations using marker genes. In a new preprint accepted at the ICML GenAI Bio Workshop, we ask if reasoning LLMs (DeepSeek-R1) can help with cell type annotation 🧵
Simona Cristea retweeted
Big progress vs cancer, folks. The kind of event curves from randomized trials that we've not seen before for a couple of the most deadly cancers. Congrats to the oncology research community for getting these trials done. #ASCO26, @ASCO
hard to internalize this now because we are so attuned to the present, but he is right
Replying to @t_blom
This problem will naturally tend to go away as companies are grown from the start using AI. Then you don't need to extract any domain knowledge from people's heads; it will never have been in people's heads.
there aren't many times in oncology when nobody cares about statistics, but today is one of them. there has never been such a successful trial in pancreatic cancer & these survival curves are the result of 40 years of persistence. KRAS inhibitors will forever transform oncology
🌟This is history ⭐️The most awaited abstract 👏 Standing ovation at Hall B1 💊 Daraxonrasib becomes the new standard of care for patients with previously treated metastatic #pancreatic #cancer #ASCO26
Simona Cristea retweeted
Incredible #ASCO26 moment. Dr. Brian Wolpin, presenter of the daraxonrasib study, received a standing ovation DURING his talk after he stated the survival benefit for PDAC patients. It was sustained. Cheering. I have never seen anything like it in the middle of a talk. $RVMD
alphaevolve has the highest potential to transform science as a whole, across fields: bio, materials, psychology etc
Algorithms are part of nearly every aspect of life, from the physics of the natural world to planning shipping routes. Our Gemini-powered coding agent AlphaEvolve has been accelerating progress over the last year - from quantum and biotechnology to logistics and @Google’s AI infrastructure. ↓ goo.gle/4uzfe0C
wow as of may 2025, there are 513 cancer vaccines in development, with 33 in phase 3 @NatRevDrugDisc
this is the first truly impressive comp bio AI-only analysis that I’ve seen. this is truly useful
As I mentioned before, I am now sharing an example from GPT-5.5 Pro, also featured by OpenAI, that really left me stunned by what it is capable of in biomedical science. (Full report on the website I created with Codex; link in the thread.) To push GPT-5.5 Pro hard, I uploaded a real immune-subset (T cell) gene-expression spreadsheet: 62 sorted T cell samples, 27,906 gene columns, and millions of underlying data points across different T cell subsets. Importantly, this public dataset also had a paired structure, making it possible to separate true cell-state biology from donor-to-donor variation. I asked GPT-5.5 Pro not merely to summarize the spreadsheet, but to analyze it deeply: What can we learn from this dataset? What are the mechanistic insights? What are the most important biological questions that emerge? What follow-up experiments should we do next? It thought for about 100 minutes and produced a roughly 40-page report! What amazed me was not just the length or even the initial analysis, since previous models are also capable of doing this. What amazed me was the quality of the reasoning and insights it provided! The report recognized that this was not just a table of genes, but two overlapping experimental designs. It identified the major biological axis, which in plain language was that the cells were not just “different categories.” They formed a coherent differentiation landscape, moving from future potential toward immediate function. It also understood the caveats. It did not overclaim from bulk gene-expression data. It clearly explained that bulk transcriptomics cannot distinguish whether every cell in a sorted population has shifted or whether a smaller subpopulation is dominating the signal. It recommended the right next-step experiments and integration with donor metadata. This is what made the report feel so special to me. It was not just doing statistics. It was reasoning like an expert systems immunologist.
It saw the structure of the experiment, interpreted the patterns, built a mechanistic model, identified limitations, proposed causal hypotheses, and laid out a translational roadmap. Other advanced models have been able to generate excellent biomedical reports before, including previous GPT-5 models. So I don't want to claim this is an entirely new type of capability. But this one felt different in an important way. It had more scientific elegance, more restraint, more biological intuition, and more of the nuanced judgment that usually comes only from years of hands-on experience in the field. It felt like this AI model had crossed another threshold. This is the kind of analysis that could easily take a research team months to perform, refine, interpret, and write up. Even then, many teams might not produce something this integrated, this mechanistically coherent, and this useful as a launchpad for future experiments. I know a 40-page T-cell gene-expression analysis may not be exciting to everyone. To illustrate how good it is, I also had Codex build a website with it that anyone can explore; link below. 😊 Those interested can go deeper into the report. I also wanted this example on the record because, to me, it is evidence that we are entering a new stage in AI-assisted biomedical science. The important point is no longer that AI can “analyze data and write a report.” The important point is that AI can now help transform complex biological data into mechanistic understanding, experimental priorities, and testable hypotheses at a speed and depth that would have been almost unimaginable a short time ago. For biomedical science, this is a very big deal! Of course, this may vary across domains, and every analysis still needs expert review, validation, and experimental follow-up. But in my own field, with data I understand deeply, this felt like another inflection point.
I feel strongly that we have crossed another milestone in the age of AI with the release of GPT-5.5.
Simona Cristea retweeted
There's a fourth possibility: humans only appear sample efficient because they've effectively seen a massive amount of data through evolution. Remember, there is a fluidity between the model and the data. The model is a representation of our understanding of data.
There's a quadrillion-dollar question at the heart of AI: Why are humans so much more sample efficient compared to LLMs? There are three possible answers: 1. Architecture and hyperparameters (aka transformer vs whatever ‘algo’ cortical columns are implementing) 2. Learning rule (backprop vs whatever the brain is doing) 3. Reward function. @AdamMarblestone believes the answer is the reward function. ML likes to use pretty simple loss functions, like cross-entropy. These are easy to work with. But they might be too simple for sample-efficient learning. Adam thinks that, in humans, the large number of highly specialised cells in the ‘lizard brain’ might actually be encoding information for sophisticated loss functions, used for ‘training’ in the more sophisticated areas like the cortex and amygdala. Like: the human genome is barely 3 gigabytes (compare that to the TBs of parameters that encode frontier LLM weights). So how can it include all the information necessary to build highly intelligent learners? Well, if the key to sample-efficient learning resides in the loss function, even very complicated loss functions can still be expressed in a couple hundred lines of Python code.
I learned so much from Brian, he is such an empathetic & knowledgeable oncologist, and a great leader. Brian’s research spans the whole spectrum of pancreatic cancer efforts, from prevention and diagnosis up to late-stage treatment; this work here is only one piece of the puzzle 👏
Pancreatic cancer research at #AACR26: Dr. Brian Wolpin of @DanaFarber_Hale presents encouraging data on safety and efficacy from a small study combining the RAS inhibitor daraxonrasib with chemotherapy in patients with advanced #PancreaticCancer. @danafarber ➡️bit.ly/4cAoXMU
interesting word cloud then & now, makes me reflect on how there’s less focus now on subclonal reconstruction than 5y ago. while it's unfortunate that tumor progression is not researched as much anymore, maybe that’s progress: we accepted subclonality & now need to focus more on therapies
✈️ back from #AACR26 inspired and grateful. I haven’t missed an AACR Annual Meeting since 2012 and it remains my favorite. Grateful for the chance to speak and for the highly engaged audience at our team's talks and posters. Joy to reconnect with friends, colleagues and new faces
Simona Cristea retweeted
Feel free to throw your hardest structural biology problems at Opus 4.7, and please share feedback so we can keep improving!
KRAS inhibition is forever changing treatment options for one of the worst diseases of all time. i feel incredibly honored to have worked on the spatial genomics of KRAS inhibition in patients with former colleagues at @DanaFarber_Hale & together with @RevMedicines 👏👏 RevMed
Unprecedented overall survival benefit from Phase 3 RASolute 302 in previously treated metastatic pancreatic cancer. Read the press release: ir.revmed.com/news-releases/… #Oncology #PancreaticCancer
“It means the locus of value starts moving upward, from manually executing every step to defining the problem, choosing the right objective, recognizing failure modes, and deciding what should be built and validated in the real world.”
Replying to @samsinai
Blog post: open.substack.com/pub/dynotx… Claude Mythos Preview System Card: www-cdn.anthropic.com/08ab91…
Simona Cristea retweeted
People often ask how breakthroughs occur in cancer biology. Often the story is more complex. The survival plot for myeloma outcomes is extraordinary, and the improvements came about in incremental steps. In my lifetime, treatment of myeloma has transformed it into an almost curable disease
Simona Cristea retweeted
In fact, my current default is to work through a problem with an LLM-code agent to identify these failure points. Then wipe everything and start from scratch with a new agent, where the prompt now explains all of the pitfalls I saw the first agent making.
before ChatGPT, people sounded smart if they used complicated words. now, you sound smart when you can communicate with simple words.
Simona Cristea retweeted
people freak out about how kids today cheat with AI in college exams. but there’s a solution to it, used by Math Universities for centuries: oral exams. prepare to answer questions about a topic & explain what about it is obvious & what is tricky. guaranteed to teach kids how to think
Simona Cristea retweeted
The time of day for cancer immunotherapy is associated with major outcomes. Early is better. Results from a randomized trial in lung cancer back up the importance of our circadian rhythm and immune system. nature.com/articles/s41591-0…