Prof @UofT | Building first Virtual Cell @Xaira_Thera | Chief AI Scientist @UHN | AI & Bio & Healthcare | Inventor of scGPT, MedSAM, BioReason | Opinions my own

Joined September 2016
Pinned Tweet
Let’s discuss the scaling law of virtual cells. A Nature Methods paper (nature.com/articles/s41592-0…) published yesterday is being interpreted by some as evidence that scaling laws do not hold for virtual cells. I read it in detail, and here are my 2 cents:

It is a useful benchmark, but not a direct test of scaling laws in causal single-cell foundation models or perturbation-native virtual cells. The paper mainly studies PCA, scVI, Geneformer, and SCimilarity (which are relatively small models) on observational atlas-style pretraining, with perturbation evaluation limited to a narrow Tahoe small-molecule/cancer-cell-line setting. These are important baselines for scFMs that focus on learning cell embeddings, but they are not large-scale causal perturbation models (e.g., diffusion-based virtual cells, or other modern architectures designed natively for causal perturbation biology).

The metrics also matter. Cell-type F1 and batch-integration AvgBIO are reasonable atlas/embedding metrics, but they are also tasks that can saturate quickly. They are not direct measures of causal perturbation prediction, target ranking, rare differential-expression tails, OOD genetic perturbations, or generalization across biological contexts.

The “learning saturation point” in the paper is useful, but it is not really a scaling law. It asks: what is the smallest pretraining size that reaches within 95% of the best observed score on this benchmark? That is a helpful diagnostic, but it can be overinterpreted when the downstream task itself is saturated.

The perturbation result is, IMO, limited: a few selected Tahoe-100M small molecules across several cancer cell lines, evaluated with genewise R²/MSE. The paper itself reports that a “no-change” baseline beats fine-tuned models for most drugs, which says as much about the evaluation regime as about model scaling.

In fact, our scGPT work already showed three years ago that simply scaling the number of observational cells saturates quickly after a few million cells. So I agree with the warning: naive “more atlas cells = better virtual cell” is not enough. But that is not the real scaling question.

In X-Cell, we study scaling across multiple axes: number of perturbation cells, number of biological contexts, perturbation diversity, and model parameters. On our Perturb-seq-scale data, we observe clear and encouraging scaling behavior. Similar trends are emerging from other perturbation-native virtual cell efforts as well.

The important question is not: can more atlas cells improve cell-type F1? It is: with larger Perturb-seq datasets, larger models, better architectures, and harder OOD splits, can we predict causal cellular responses across genes, combinations, doses, cell states, and contexts? For X-Cell and the next generation of virtual-cell models, the goal is not just better embeddings. It is target ranking, rare DE tails, counterfactual biology, and prospective perturbation prediction.

So my reading is: this paper is a useful caution against naive scaling, not evidence that scaling laws do not apply to virtual cells. The exciting regime is still open: scaling the right data, the right models, and the right objectives for causal cellular biology.
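For readers who want the “learning saturation point” spelled out, here is a minimal sketch of the diagnostic as described in the thread above: the smallest pretraining size whose score reaches 95% of the best observed score. The function name, the reading of "within 95%" as `score >= 0.95 * best`, the assumption that higher scores are better, and the example numbers are all hypothetical illustrations, not values or code from the paper.

```python
def learning_saturation_point(scores_by_size, threshold=0.95):
    """Return the smallest pretraining size whose score is at least
    `threshold` times the best observed score.

    Assumes higher scores are better and all scores are positive.
    """
    if not scores_by_size:
        raise ValueError("scores_by_size must not be empty")
    best = max(scores_by_size.values())
    # Walk sizes from smallest to largest and stop at the first one
    # that clears the threshold relative to the best observed score.
    for size in sorted(scores_by_size):
        if scores_by_size[size] >= threshold * best:
            return size


# Hypothetical benchmark scores (e.g. cell-type F1) keyed by number of
# pretraining cells. These numbers are made up for illustration only.
example_scores = {
    100_000: 0.70,
    1_000_000: 0.82,
    10_000_000: 0.85,
    100_000_000: 0.86,
}

saturation_size = learning_saturation_point(example_scores)
print(saturation_size)
```

Note that the result depends only on the best score observed on this benchmark, so a task that is already near its ceiling will report early saturation regardless of how models behave on harder tasks.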
🇨🇦🇨🇦🇨🇦🔥🔥🔥
Beautiful scene. More of this spirit and pride pls 🇨🇦
Bo Wang retweeted
Jeff Bezos and Vik Bajaj give a first peek at Prometheus, an ARCH co-founded company to create an artificial general engineer. ARCH’s largest investment ever, from seed formation.
Jun 11
CNBC's David Faber sits down for an exclusive interview with Prometheus co-founders and co-CEOs Jeff Bezos and Vik Bajaj. Tune in to CNBC to watch live and follow this thread for updates. ⬇️ cnbc.com/2026/06/11/project-…
great thread about the most fundamental steps in single cell data analysis! 🙏
Arguably the most boring step in genomics is the first one: normalization. Settled science. Scale log. Move on. Except that there's been a huge blind spot in the field. And it matters for AIxBio. A 🧵 about what I think may be one of the most important papers I've written. 1/
"I thought that by the end of the century was a stretch. Now I think it's too conservative." 🚀🔥🚀🔥
Mark Zuckerberg wanted to cure, prevent, and manage all diseases by the end of the century. He and Priscilla then had a series of meetings where Nobel Prize-winning scientists laughed at them. Now Zuckerberg says, "I thought that by the end of the century was a stretch. Now I think it's too conservative." Full episode linked in replies.
Congratulations, @AmolAVerma & @DrFahadRazak! This new initiative will provide huge opportunities for AI & Health in Canada! 🇨🇦
Proud moment! 👏👏👏VITAL, a health data platform based at St. Michael’s, is part of the 🇨🇦 AI Strategy shared today by PM @MarkJCarney. It’s receiving $210M in federal, provincial & institutional funding – a landmark investment. unityhealth.to/2026/06/vital…
Bo Wang retweeted
biology will deliver most of the returns in the stock market over the next 20 years
Bo Wang retweeted
This is one of the most important parts of the Mythos 5 announcement: Drug design. Anthropic says its internal protein design experts used Mythos 5 to accelerate parts of the drug design process by around 10x. In one example, Mythos 5 used protein design and bioinformatics tools without (!) human assistance and matched or beat skilled human operators. It means the model could run parts of the early drug discovery loop itself: choosing binding sites, selecting tools, running protein design workflows, recovering from failures, and generating promising candidates. And here is what people still don't understand outside of our community: AI is moving from "assistant that explains science" to "agent that can actually execute parts of scientific work." The future of drug discovery may look less like one scientist manually testing ideas one by one, and more like thousands of AI-driven research loops running in parallel, with humans validating, interpreting, and deciding what actually moves forward. I keep thinking about what Demis Hassabis said: we are entering the golden age of science. There is probably no doubt about that anymore. Medicines are being developed faster and more precisely, and previously incurable diseases are becoming treatable.
It's already June 9th, and Gemini 3.5 Pro and GPT-5.6 are nearing release (Google even announced 3.5 Pro during I/O). Rumor has it that GPT-5.6 will be released as early as next week. So far, it's safe to say that, guardrails aside, Anthropic is truly the frontier lab that's entering a new league with Mythos/Fable. Gemini 3.5 Pro and GPT-5.6 have a lot to deliver and are now under pressure. This release has certainly boosted Anthropic's upcoming IPO. Anthropic has proven that they are still capable of making significant leaps in performance and efficiency. There's no end in sight. But the pressure on the competition is mounting. And remember that Claude Mythos was (and probably still is) the leader in long-horizon software tasks.
As a frequent customer of Air Canada, I have mixed feelings about this news 😂🇨🇦
Jun 10
A former Air Canada pilot faces criminal charges for flying tens of thousands of passengers for nearly 17 years with a fake pilot's license, Canadian police announced. cnn.it/4uoLTFu
If you want to keep up with the latest AI research in medical imaging, @JunMa_AI4Health is a must-follow!
Had a wonderful time at @CVPR 2026, so many thoughtful conversations and inspiring work. Here are six medical vision papers I'd like to share, one per core task: 1. Segmentation: VoxTell (DKFZ). Universal 3D segmentation driven by free text, from a single word to a full clinical sentence. Trained on 62K CT/MRI/PET volumes; the key idea is fusing text into the decoder at every scale, not just at the end. 2. Classification: AnyMC3D (Siemens Healthineers). Instead of new 3D foundation models, it adapts frozen 2D ones with ~1M params/task. Careful study of three recurring pitfalls in the field: data-regime bias, weak adaptation, and too-narrow task coverage. 3. Registration: SGDIR (U Alberta). Diffeomorphic registration using a single semigroup regularizer, with a proof that it makes the network learn an ODE flow, so invertibility and cycle consistency come for free, no scaling-and-squaring. 4. Reconstruction: Efficient Unrolled Networks (CNRS/ENS Lyon). Two ideas (domain partitioning plus a normal-operator approximation via FFT) let you train operator-aware unrolled networks for 3D CBCT and multi-coil MRI on a single GPU. 5. Medical VLM: Medic-AD (SNU/Samsung/NVIDIA). A stage-wise design that turns broad VLM knowledge into three concrete clinical skills: anomaly detection, longitudinal symptom tracking, and grounded heatmap explanations. 6. Agent: OralGPT-Plus (HKUST-GZ/HKU/PKU). An agentic VLM for panoramic dental X-rays that learns to "zoom in" and "mirror in" (compare symmetric teeth), trained with reinspection-driven RL. A short series of notes follows. Corrections welcome, I'm sharing to learn, not to rank😀
Congrats @gelarehzadeh! This is great work on how spatial single-cell omics can help cancer research!
Is biology fundamentally harder than vision or coding? Anthropic ran frontier models on one task: retrieve viral sequences from NCBI. Same query, three runs. Claude Sonnet 4 returned 106, 15, then 5 sequences. Ground truth: 266. One run estimated an Ebola outbreak origin as 1922. The fix wasn't a better model: a thin deterministic wrapper (gget) hit ~100%. Bio databases were built for humans clicking browsers. Filtering logic lives in web UIs, metadata is inconsistent, identifiers drift between sources. No LLM fixes broken pipes. NCBI has 30 databases that need this treatment. That work hasn't started yet. We will soon release more results on how frontier agentic workflows can work with biological databases, including large-scale Perturb-seq data, at @Xaira_Thera. Stay tuned 🙏😁
New Science Blog: Why has AI advanced faster in coding than in biology? To agents, bio databases are like cities built before cars—maddening to drive in because they're designed for different traffic. How do we build infrastructure agents can use? anthropic.com/research/agent…
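To put the retrieval counts quoted above in proportion, here is a small arithmetic sketch. It uses only the numbers from the tweet (106, 15, and 5 returned sequences against a ground truth of 266). Treating each returned count as an upper bound on recall is an assumption, since the tweet does not say how many of the returned sequences were correct; the variable names are illustrative.

```python
# Numbers quoted in the tweet: three runs of the same query.
ground_truth = 266
returned_counts = [106, 15, 5]

# Upper bound on recall per run, assuming every returned sequence
# is a true match (the tweet does not confirm this).
recall_upper_bounds = [n / ground_truth for n in returned_counts]

for run, (n, frac) in enumerate(zip(returned_counts, recall_upper_bounds), start=1):
    print(f"run {run}: {n}/{ground_truth} = {frac:.1%}")

# Ratio between the largest and smallest result for the same query,
# a rough measure of run-to-run inconsistency.
spread = max(returned_counts) / min(returned_counts)
print(f"largest/smallest run: {spread:.1f}x")
```

Even under the generous assumption, the best run covers under 40% of the ground truth, and the three runs differ from each other by a factor of about 21.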
Canada’s AI strategy: AI for all!
We have just launched Canada's new artificial intelligence strategy: AI for All. We are taking control of our future, with AI that is governed by Canadian values, responsible, and in the service of all Canadians.
Bo Wang retweeted
Re-tweeting. We need rapid policies that make USA companies more competitive and create a market mechanism that preserves patient options, while preventing harm to the biotech innovation cycle.
We need an Innovation Review Voucher, where a USA innovator (not others) can apply, based on their innovation in a drug category or molecule, for super fast regulatory approval, AND a transferable and subdividable 3 year commercial exclusivity, which covers the structure and close derivatives. Applications would be 90 day decisions, with a standard that a reasonable person would conclude that the technology was used or copied by the entity. In practice, this should prevent USA innovation from being decimated via less funding flowing in (which hurts patients in the long run), and if the commercial delay is critically important the vouchers are tradeable, and can be granted to more than one party at the discretion of the holder, so patients are protected. Prob have like ARPA H like structure or some new DOC panel to do it.
Our workshop on Foundation Models for Medical Vision at CVPR 2026 is happening on June 3rd! Four keynotes spanning oncology, pathology, virtual patients, and agentic systems, plus challenge winners breaking down their solutions. If you're at CVPR, this is the room to be in 🔥
Please join us on June 3rd in Room 607 for the #CVPR2026 Foundation Models for Medical Vision (FMV) workshop! Four world-leading keynote speakers, @jnkath, @AI4Pathology, @hoifungpoon, and @pranavrajpurkar will share cutting-edge AI models across oncology, pathology, virtual patients, and agentic systems. Plus: winners of our medical image foundation model challenges will give highlights on their solutions. 🔗 fmv-cvpr26workshop.github.io… @yuyinzhou_cs @vishalm_patel @BoWang87 @CVPR
This is just incredible! 🔥
One of the most amazing things I’ve ever seen: a standing ovation for the full Daraxonrasib results. I feel inspired and energised, to put it mildly: we have a targeted therapy for pancreatic cancer now, and nothing is undruggable anymore.
Today at #ASCO26, more results about the newest clinical trial of daraxonrasib: In the Phase III RASolute-302 trial, a once-daily RAS(ON) inhibitor nearly doubled median overall survival (13.2 vs 6.7 months) and reduced the risk of death by ~60% versus chemotherapy in previously treated metastatic pancreatic cancer. KRAS was once considered “undruggable.” This is what persistence in science looks like. A landmark moment for targeted therapy and for patients who desperately need better options. #CancerResearch #PancreaticCancer
Pancreatic cancer has one of the most suppressive tumor microenvironments in oncology. But two pancreatic cancer results dropped today. Both matter. 1. BioNTech mRNA neoantigen vaccine: nearly all responders still alive at 6 years. 98% of induced T cells were de novo — the immune system learned to see a cancer it had always been blind to. 2. Daraxonrasib: 47% ORR, 92% disease control as first-line monotherapy. KRAS G12D, undruggable for 40 years, finally has a drug. Different mechanisms. Same disease. Both working. <13% of patients survive 5 years. That number is about to change. great day for science! 🔥
Bo Wang retweeted
Advanced world knowledge: x.com/BoWang87/status/202707…

One prompt: "generate a high resolution image about a cell." This is what Nano Banana 2 (@GoogleDeepMind) rendered. Nucleus, mitochondria, Golgi apparatus, cytoskeleton — all anatomically correct, all stunning.
This is not looking good for Canada. The irony is that Canada helped pioneer breakthroughs behind AI and GLP-1s, two industries now creating trillions of dollars in value, yet much of that value is captured elsewhere. Canada produces world-class science, but we continue to struggle with commercialization at scale. Too little growth capital, insufficient compute infrastructure, and too few pathways to build globally competitive companies. We need to invest across the full stack: talent, capital, compute, regulation, and company-building, not just research grants. 🇨🇦
JUST IN: 🇨🇦 Canada officially enters a technical recession.
Opus 4.8 is almost as good as Claude Mythos on most biology benchmarks! 🔥🔥