Allen Li MD

Tweets

Pinned Tweet

Allen Li MD

@LiAllenMD

Apr 5

Community oncologist. I use AI to appraise clinical trials, then verify every finding against the primary data. The AI gets graded too. One trial, two readers, one bottom line. The Source Report on YouTube (in bio) and Substack. allenlimd.substack.com #onced #oncology #meded

Allen Li MD retweeted

Allen Li MD

@LiAllenMD

15h

Replying to @EricTopol @EvidenceOpen @UpToDate

Competition here is healthy. Credit first: independent study, public code, and the main benchmark uses real physician questions rated by blinded clinicians. That part is strong. Three issues.

1️⃣ (Others have flagged this.) The two sides were tested differently. Frontier models: API, temperature zero, search on. Clinical tools: queried by hand in their browsers, with hidden prompts and retrieval the authors could not control. They say so. And search gave the frontier models retrieval, the thing clinical tools are built on. Not specialized vs general. Two systems with retrieval, different doors.

2️⃣ The weight rests on one benchmark, not three. MedQA and HealthBench carry contamination risk. HealthBench was built by OpenAI, its model won it, and answers were graded by the same three models being scored. The authors call the clinician-rated benchmark primary, HealthBench supplementary. On MedQA, Claude tied both clinical tools.

3️⃣ Outperform means quality, not safety. No model was safer than another. The highest harmful response rate was a frontier model. The floor the clinical tools missed was a free search feature, not only purpose-built tools.

Allen Li MD

@LiAllenMD

19h

FDA approved Truqap (capivasertib) plus abiraterone for PTEN-deficient mHSPC. CAPItello281 hit its primary endpoint: 7.5 months added rPFS (HR 0.81). A real signal. In it, 74% had high-volume disease, about a quarter with visceral mets, exactly where triplet is NCCN-preferred. This trial had no chemo arm to compare the two. The chemo triplet trials before it, PEACE-1 and ARASENS, both showed an OS benefit. CAPItello281 has not shown one yet (HR 0.90). New Source Report: youtube.com/shorts/-H6isX04e… @FDAOncology @myESMO @ASCO @oncoalert #ProstateCancer #GUOnc #Oncology #PrecisionMedicine #capivasertib #CAPItello281

FDA approved Truqap (capivasertib) for prostate cancer: does it improve survival? #ProstateCancer

On June 12, 2026, FDA approved capivasertib plus abiraterone for PT...

youtube.com

Allen Li MD

@LiAllenMD

Jun 11

The case for AI in oncology is real: it holds every molecular finding, every trial, every approved option, all at once. No human can. But that was never the hard part. The hard part is knowing which of them is right for the patient in front of you. Whether a model ever masters that is the real question. Right now, it’s still ours.

Allen Li MD

@LiAllenMD

Jun 11

open.substack.com/pub/allenl…

Allen Li MD

@LiAllenMD

Jun 7

PROTEUS in NEJM: perioperative apalutamide reports a metastasis-free survival win in high-risk prostate cancer, HR 0.80. The catch: as many have already pointed out, the endpoint was redefined mid-trial to add PSMA-PET. MFS by PSMA-PET is not yet a validated surrogate for survival. The version that is validated, conventional imaging, was not significant (HR 0.84). And OS so far favors placebo (HR 1.08, very immature). youtube.com/shorts/xmTPQogWK… #MedOnc #ProstateCancer #PROTEUS #EvidenceBasedMedicine #ASCO2026 #Oncology

PROTEUS: A Prostate Cancer Win on Metastasis, Not Yet on Survival...

PROTEUS met its metastasis endpoint, but the survival validated ver...

youtube.com

Allen Li MD

@LiAllenMD

Jun 2

In NEJM: daraxonrasib is the first RAS-targeted drug to extend survival in pancreatic cancer. In 2nd-line RAS G12 disease it roughly doubled median OS, 6.6 to 13.2 months (HR 0.40). A real win. Three things the abstract doesn’t foreground ↓ #MedOnc #PancreaticCancer #RASolute302 #EBM youtube.com/shorts/VgM_9sk-0…

Daraxonrasib in Pancreatic Cancer: The Win, The Fine Print, and The...

Daraxonrasib is the first RAS-targeted drug to extend survival in p...

youtube.com

Allen Li MD

@LiAllenMD

May 29

Google’s healthcare AI, AMIE, “beat” medicine trainees and oncology fellows on breast cancer cases. Look closer: the items where AMIE scored highest had rubrics identical to the AI’s own prompt, word for word. It’s like grading a fellow on whether they followed a notecard you handed them. That measures instruction-following, not clinical judgment. (This is the preprint. Now published in NEJM AI with a larger dataset, which I’m checking next.) youtube.com/shorts/3jjBjvPoN… #MedTwitter #AIinMedicine #ClinicalAI #BreastCancer

This AI "Beat" Doctors on Cancer Cases. Here's the Catch.

Google posted a preprint on the performance of its healthcare AI, A...

youtube.com

Allen Li MD

@LiAllenMD

May 22

In Nature Medicine: Google DeepMind’s multimodal AI read complete heart block as normal sinus rhythm. Internal log: no evidence the image was processed. Main paper Fig 2c: hallucination not significant. Supplement says different.👇 youtube.com/shorts/cLB8CktLg… #AIinMedicine #PatientSafety #ClinicalAI #Cardiology #MedTwitter #AILiteracy @VincentRK @HemOncFellows @OncBrothers @DrArturoAI @montypal @operationdanish @Papa_Heme @EricTopol @DrRishabhOnco @OncoAlert @OncoReporte @Larvol @OncologyBGLab @JavierDavidBen2 @csoncol @Timothee_MD @JCOOP_ASCO @TwoOncDocs @FCademartiri @doctorbhargav

The Heart Block Google's AI Missed. Read the Supplement.

The internal reasoning log contained no evidence that the ECG image...

youtube.com

Allen Li MD

@LiAllenMD

May 22

Credit to the authors for including this in the supplement. It is this kind of academic integrity that will move the field of AI in medicine forward.

Allen Li MD

@LiAllenMD

May 22

With today's FDA approval of Dato-DXd for mTNBC based on TROPION-Breast02, the trial is worth revisiting. It’s a good option for mTNBC. One important point is that the OS benefit is region-dependent. In the US/Canada/Europe subgroup the HR is actually reversed!👇 youtube.com/shorts/fqJUehEGc… #OncTwitter #bcsm #datodxd #tropionbreast02

TROPION-Breast02: What the Abstract Does Not Tell You | The Source...

TROPION-Breast02. Dato-DXd vs chemotherapy. First line metastatic T...

youtube.com

Allen Li MD

@LiAllenMD

May 17

I believe AI will transform medicine for the better, and in many ways, it already has. But we won’t get there by cheerleading. We’ll get there by being honest about where these tools fall short today.

Allen Li MD

@LiAllenMD

May 16

Liability issue aside, calling AI in oncology supporting roles “lower risk” may be an underestimate. Even a decision as routine as IV hydration between chemo can be consequential. Is the patient dehydrated or fluid overloaded? Do cardiac or renal comorbidities tip the calculus? A web interface, AI or human, often cannot see what is needed to decide well. Trust gets built the way it always has in medicine: prospective evaluation, prespecified endpoints, honest reporting of where the tool fails. One can simultaneously believe in the power and promise of AI, be skeptical and critical of its limits today, and hope to inform its potential for tomorrow. Agree that early engagement from the actual care team will be key. ascopost.com/issues/may-10-2…

Could AI Be Licensed to Practice Oncology?

Is artificial intelligence (AI) poised to practice medicine? It may be already. Earlier this year, the state of Utah allowed Doctronic, a health technology company using AI to make clinical decisions...

ascopost.com

Allen Li MD

@LiAllenMD

May 16

Here is the paradox. If physicians refuse to give up any control, AI’s clinical role gets defined by everyone except clinicians: management, payers, vendors. If physicians give up control before trust is built, patients bear the risk. Trust needs data. Data alone may not be enough without personal experience. Personal experience requires giving up some control. So how does a clinician earn the experience needed to build the trust, without first giving up the control that experience requires? There is a small precedent here. When fax machines arrived, people did not trust that messages actually landed. The printed receipt bridged the gap. It let users build experience with the new tool in a way they could audit, until the receipt itself became unnecessary. AI may need its own version of that artifact: outputs that surface uncertainty and let clinicians check the reasoning, so the experience needed to earn trust can accumulate without giving up control blindly.

Allen Li MD

@LiAllenMD

May 16

Side note: even today, I do not completely trust the fax machine, especially when it is being sent by the all-in-one printer/copier/scanner that takes up the whole corner of the clinic office. Maybe this says more about me than the fax machine.

Allen Li MD

@LiAllenMD

May 15

The coverage of the Science paper claims AI beats doctors in clinical reasoning. We need to be more critical of what “clinical reasoning” means in this publication. Take a look at how the AI “beats” physicians in this Science paper. AI is getting better every day and will be an important part of medicine. However, what this paper may have shown is that AI is better than physicians at generating a list of things, not at frank clinical reasoning. youtube.com/shorts/Io9aFmZbL… #OncTwitter #AIinMedicine #EvidenceBasedMedicine

How Did the AI "Beat" Doctors at Clinical Reasoning? (Science Paper)

In April 2026, Brodeur et al. published a paper in Science testing ...

youtube.com

Allen Li MD

@LiAllenMD

May 13

The Brodeur Science paper has been the loudest AI-in-medicine story of the week. The eLetter version of my thoughts is now up at Science. Three methodological concerns:

1. Rubric structure rewards listing items. The Grey Matters Q1 rubric is purely additive: 19 points across 22 line items, no penalty for excess or wrong tests. AI lists everything; physicians write focused notes. The 89-vs-34 headline measures rubric enumeration.

2. Information gradient. In the paper's head-to-head ED experiment, AI's edge concentrates at triage with sparse data (67% vs 50–55%). By admission with the full workup, AI 81.6% vs Physician 1 78.9%, no longer statistically significant. Same patients, same model, same physicians; only information level changes.

3. Historical comparators. Five of the six experiments compare AI in 2024–2025 against physicians scored on different cases, by different graders, in earlier publications. The 55-percentage-point gap on Grey Matters cannot be cleanly attributed to model superiority.

TL;DR: AI is better at listing things on a checklist. The Brodeur paper measures that very well. Whether it translates to better patient care is a different question. Headline-only readers are most at risk of being replaced by AI.

Video: youtu.be/Rl2pJUwuTk0

eLetter at Science: science.org/doi/10.1126/scie…

@VincentRK @HemOncFellows @OncBrothers @DrArturoAI @montypal @operationdanish @Papa_Heme @EricTopol @DrRishabhOnco @OncoAlert @OncoReporte @Larvol @OncologyBGLab @JavierDavidBen2 @csoncol @Timothee_MD @JCOOP_ASCO @TwoOncDocs @FCademartiri @doctorbhargav #OncTwitter #AIinMedicine #EvidenceBasedMedicine
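A rough sketch of why a purely additive rubric can reward enumeration. Everything here is illustrative: the item names, the two answer lists, the one-point-per-item mapping, and the `penalized_score` contrast are assumptions for the example, not the actual Grey Matters rubric or the paper's scoring code. Only the 22 line items, the 19-point total, and the absence of a penalty come from the post above.

```python
# Hypothetical illustration: item names and answer lists are placeholders,
# not content from the real rubric.
RUBRIC_ITEMS = {f"item_{i}" for i in range(1, 23)}  # 22 line items
MAX_POINTS = 19  # assumed cap; how 19 points map onto 22 items is not stated


def additive_score(listed, rubric=RUBRIC_ITEMS, max_points=MAX_POINTS):
    """One point per listed item found on the rubric, capped at max_points.

    Off-rubric (excess or wrong) items cost nothing.
    """
    hits = len(set(listed) & rubric)
    return min(hits, max_points)


def penalized_score(listed, rubric=RUBRIC_ITEMS, max_points=MAX_POINTS, penalty=1):
    """Hypothetical alternative: same credit, minus a penalty per off-rubric item."""
    listed = set(listed)
    hits = len(listed & rubric)
    extras = len(listed - rubric)
    return max(0, min(hits, max_points) - penalty * extras)


# A focused note: 6 on-rubric items, nothing extra.
focused_note = [f"item_{i}" for i in range(1, 7)]

# An exhaustive list: 20 on-rubric items plus 15 off-rubric ones.
exhaustive_list = [f"item_{i}" for i in range(1, 21)] + [
    f"off_rubric_{i}" for i in range(1, 16)
]

print(additive_score(focused_note), additive_score(exhaustive_list))    # 6 19
print(penalized_score(focused_note), penalized_score(exhaustive_list))  # 6 4
```

Under the additive scheme the exhaustive list reaches the cap despite 15 off-rubric entries. Under the assumed penalty the ordering flips. Which scheme better reflects clinical reasoning is the open question; the sketch only shows that the choice of rubric structure can drive the gap.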
