Consultant MSK Radiologist @MWLNHS | Fellowship in MSK Radiology @ SMK Netherlands | MSK Radiologist @ Paris Olympics 2024

Joined May 2019
402 Photos and videos
Getting off algorithm-driven social media is the best! Been 2 months since deleting X and LinkedIn from my phone, with no social media use except WhatsApp and Telegram. Logging in through my browser just to post this. That friction of not having the app on your phone is necessary.
1
4
447
Is letting Silicon Valley zombify your brain worth the occasional academic radiology post? The generative AI fiasco proves Silicon Valley is not on your side. They just want people addicted and hooked to their apps, whether chatbots or social media.
1
308
Do yourself a favour in 2026 and get off as many platforms as you can.
246
Dr. Ameya Kawthalkar retweeted
#OnlineFirst: Unraveling the cause of microspurs in spontaneous intracranial hypotension type 1: discogenic origin or calcified Hofmann’s ligament? thejns.org/spine/view/journa…
2
37
153
7,030
Dr. Ameya Kawthalkar retweeted
⚠️ T-junction injuries of the biceps femoris NEW #Editorial bridging the gap between expert opinion and evidence-based practice 📄 ✅ Read ➡️ bit.ly/438lPE3
28
123
14,902
Dr. Ameya Kawthalkar retweeted
Consent ✅
Footballer - ACL recon & MCL tear 12 months ago
Uneventful rehab process & RTP ...... but then developed focal pain over medial joint line & restricted flexion
Sore on side passing & cutting / change of direction drills
OE - Palpation pain at proximal MCL
Stiff & painful into EROM flexion
MRI - no medial pathology reported - but on closer inspection, subtle medium signal focal change in deep MCL on T1 & T2 (always check the scan yourself)
POCUS - obvious large heterotopic calcific deposit in previously injured proximal MCL fibres - Pellegrini-Stieda lesion
Video - needle barbotage / fenestration of deposit, finished with a soupçon of CSI 🫰
Post procedure, complete abolition of pain in gym & typical provocative movements ✅
Ultrasound is much better than MRI in identifying calcific pathology (as is plain x-ray)
9
18
142
20,122
My article "Should I have become a radiologist? The hype versus reality of radiology AI, AI in general and the road ahead". It is based on my recent talk for @REF_INDIA with a few additions. It is intended for medical students considering a career in radiology but facing a barrage of “AI will replace radiologists” fear mongering, and for radiologists and radiology trainees seeking a distilled overview of the state of radiology AI today and what to expect going forward. Link in 🧵
8
8
59
8,846
Came across this AI-generated 'educational' radiology post which is just plain wrong but has thousands of likes across X and Insta. We are living through the death of traditional social media, hastened by AI slop which is now apparently coming for medical educational posts.
Radiology creators like @drvenkimdrd @drdevrad @RadiologyVibes @teachplaygrub @bhavinj @msk_munoz @GSERRANOB_MSK @drmankad and countless others across different social media platforms put in hours to make sure verified, high quality radiology education reaches people.
Do all those who have clicked like on this post know what's correct and what's not here? Are all those clicking like even human, and what percentage are bots? Who is responsible when AI-generated false medical educational material (not necessarily obvious as in this case, but subtle) has a direct adverse impact on patients?
There is a need for something new now, a "post-social media" platform which is not algorithm-driven for maximising engagement and likes, and which is not filled with AI slop. Towards something where only human creators and their creations are valued.
5
8
50
5,748
Dr. Ameya Kawthalkar retweeted
A new code of AI is born: BeResponsibleAI.com Don’t just code! Code responsibly!! AI ideas are launched fast, but few are launched responsibly. #BeResponsibleAI changes that. It’s a #movement. It’s the world’s first Responsible AI chat that evaluates your AI idea across 5 pillars.
1
1
4
1,275
Dr. Ameya Kawthalkar retweeted
🚨 Just published! All frontier AI models have failed “Radiology’s Last Exam” - the toughest benchmark in radiology launched today! ✅ Board-certified radiologists scored 83%, trainees 45%, but the best performing AI from frontier labs, GPT-5, managed only 30%. ❌ These results shatter repeated claims of “doctor-level” AI in medicine and give you a reality check! 🇮🇳 The Centre for Responsible Autonomous Systems in Healthcare (#CRASHLab), @KCDH_A @AshokaUniv, India has launched v1 of one of the hardest benchmarks in medicine and we share our results with the world! 1/n
45
127
661
201,756
For the upcoming REF AI in Radiology course, we decided to give a part of the course learning material as a prompt. This keeps the learning dynamic and personalised for each delegate as per their understanding of AI. Delegates enter a prompt in their favourite large language model / LLM (ChatGPT, Gemini, Claude, Grok, Llama etc). They can customize the prompt as per their needs and AI familiarity (beginner / intermediate / advanced).
"Explain the following topics to me one by one as a beginner. Explain the first bullet point topic in detail then end your answer. I may or may not ask you follow up questions to explain subtopics about it. Once done, I will ask you to move on to the next bullet point topic.
1. Current open source and closed source LLM and VLM ecosystem with list of popular global models (USA, China, Europe and the rest)
2. Comprehensive list of open source medical and radiology foundation models available today, updated as of September 2025
3. Importance of quantized open source vision language models for radiology which can run on CPU only workstations
4. Creating a notebook in Google Colab for chest x-ray foundation model fine tuning
5. The Hugging Face ecosystem"
....And more such topics as a prompt, along with radiology AI quizzes and blogs.
3
1
16
1,832
Link for course registration: rzp.io/rzp/AI2025

1
355
Dr. Ameya Kawthalkar retweeted
Pearl: A buckle fracture involves only one cortex and is usually stable. Pitfall: If fracture line extends through both cortices, even with cortical buckling, it’s a complete fracture. Don’t undercall it as a buckle. Wisdom: One cortex buckled = buckle (incomplete). Both cortices broken = complete. Count cortices, not only curves. One view is NO view. —Pearls, pitfalls and wisdom from my reporting list.
3
23
132
8,783
Interesting how an open source CPU native ASR model would solve a lot of issues in healthcare like excessive workload on doctors and ever increasing waiting lists.
Am currently building a medical transcription solution to help our overworked NHS doctors. A local, free, open source app which records doctor patient conversations and automatically converts them into clinical notes, management plans and referral emails. Plus a reporting feature for radiologists and pathologists which automatically selects the correct report template based on dictation and generates the final report (local version of my Wilhelm app).
Experimented with a whole lot of CPU based ASR models as most NHS Trusts don't have GPU access. Tried out faster-whisper, Vosk and Gemma 3n. Gemma 3n is not going to work on 16 GB RAM CPUs as it natively needs fp32: almost the entire 16 GB RAM will be taken up by the model weights in fp32 (4B params x 4 bytes/param = 16 GB). It's not even running on my 32 GB RAM CPU. The quantized GGUF versions of Gemma 3n do not support audio. Vosk and the GGUF int8 versions of faster-whisper are still not fast enough and not accurate enough to pull off complex medical terminology; I worked with them for a few weeks.
So I guess back to using open source GPU based ASR models like maybe Olmo or Nvidia's Parakeet, which are more accurate than Whisper large. Yet to try them out for diarisation. Possibly using gpt-oss 120B as the LLM for tool calls and text formatting.
The issue is who pays for the GPU costs if I give this out as free and open source? The Trusts would have to run it on a confidential UK based cloud and would have to bear the costs. Had a very interesting discussion with @danfascia about it recently.
2
2
812
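The fp32 arithmetic in the transcription post above (4B params x 4 bytes/param) can be checked with a few lines of Python. This is only a rough sketch: it counts weight storage alone and ignores activations, audio buffers and runtime overhead, so real usage is higher. The fp16 and int8 rows are illustrative assumptions about what other precisions would need and say nothing about whether a given Gemma 3n build supports them with audio. The function name `weight_memory_gb` is hypothetical.

```python
# Rough estimate of the RAM needed just to hold a model's weights.
# Assumption: 1 GB = 10**9 bytes, and nothing but the weights is counted.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1}


def weight_memory_gb(n_params: float, dtype: str) -> float:
    """Return the weight size in GB for n_params parameters stored as dtype."""
    return n_params * BYTES_PER_PARAM[dtype] / 1e9


if __name__ == "__main__":
    # 4 billion parameters, the figure quoted in the post.
    for dtype in BYTES_PER_PARAM:
        print(f"4B params in {dtype}: {weight_memory_gb(4e9, dtype):.0f} GB")
```

On a 16 GB workstation the fp32 row alone leaves no headroom for the operating system or the rest of the app, which matches the experience described in the post.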
It cannot be called a superintelligence / AGI if it forgets about a medical case when you chat with it about multiple cases for half an hour in the same context window and jumbles things up. Or if you have to reset its memory every short while and explain the patient's symptoms and labs again and again to follow up a case. For comparison, one oncology resident in Tata hospital Mumbai sees 150 patients in one day in the outpatient clinic and I have seen those guys maintain the same level of composure, kindness and diagnostic acuity at 5pm as when they started at 8am. And they deal with admissions, treatment strategies and plans for those patients throughout their stay in the hospital. No LLM today can do that! Till these issues are solved they will remain apps and tools for doctors to use. I came across DSPy and GEPA because of your posts @DrDatta_AIIMS and you may be on to something here about using them in a way that doctors may give natural language feedback to AI models about mistakes the models made. This would in turn lead to the model updating its prompt, or to a pipeline where the model's weights could be fine tuned based on doctors' feedback. Till continuous learning and memory are solved for AI, it cannot be called a superintelligence / AGI.
Most people misunderstand what superintelligence really is. Recent papers (including a big announcement from a frontier tech company) have even started claiming “medical superintelligence.” Umm, sorry to break the hype… training a model on the entire internet and then comparing it against doctors denied the same resources is not superintelligence, it’s flawed science!
So what would true medical superintelligence look like? To me it’s this: given the same patient data, the same tools, and the same constraints, the system reliably outperforms trained clinicians across diverse cases, with reproducibility and calibrated uncertainty.
This is hard, because medical reasoning isn’t just pattern matching. If you go deeper into medical reasoning, clinicians flexibly combine hypothetico-deductive reasoning, illness scripts, and dual-process cognition (fast intuitive, slow analytical). Current autoregressive models get stuck in one mode at a time, and that brittleness shows in controlled evaluations where even trainees still outperform them (which we showed in our Radiology’s Last Exam benchmark recently).
Where AI does look “superhuman” is in discovering hidden signals, such as predicting age, sex, or disease risk from X-rays and fundus images, that humans can’t consciously perceive. But correlation ≠ competence; and generalization, bias and safe integration remain unresolved. (Something our lab is actively working on.)
If you really ask me, a real test would be something like a Same Data, Same Tools (SDST) trial: let’s say 10k studies with full metadata, identical resources for both doctors and models, measuring accuracy, calibration, and patient impact. Only if the model still wins should we call it “superintelligent.” Difficult to do in real life though.
Until we achieve true medical superintelligence, the useful path is augmentation… error detection, triage, documentation, workflow automation etc. That’s where AI is already valuable today. Superintelligence is not here yet.
We are actively looking at ways to achieve this. Once we do, I will let you know.
1
1
6
1,454
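The quoted post above proposes scoring doctors and models on accuracy and calibration. As a minimal sketch of what those two measurements could look like in Python: the choice of expected calibration error (ECE) with equal-width bins is an assumption made here for illustration, not something the post specifies, and the function names are hypothetical. Each reader (human or model) is assumed to give one confidence between 0 and 1 per study, plus a flag for whether the answer was correct.

```python
from typing import List, Sequence, Tuple


def accuracy(correct: Sequence[bool]) -> float:
    """Fraction of studies answered correctly."""
    return sum(correct) / len(correct)


def expected_calibration_error(
    confidences: Sequence[float], correct: Sequence[bool], n_bins: int = 10
) -> float:
    """Weighted mean gap between stated confidence and observed accuracy.

    Confidences are grouped into n_bins equal-width bins over [0, 1].
    A well calibrated reader scores close to 0.
    """
    n = len(confidences)
    bins: List[List[Tuple[float, bool]]] = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        # A confidence of exactly 1.0 goes into the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))

    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_conf = sum(c for c, _ in members) / len(members)
        bin_acc = sum(ok for _, ok in members) / len(members)
        ece += len(members) / n * abs(mean_conf - bin_acc)
    return ece


if __name__ == "__main__":
    # Toy data: a reader who says "90% sure" four times and is right three times.
    confs = [0.9, 0.9, 0.9, 0.9]
    right = [True, True, True, False]
    print(accuracy(right), expected_calibration_error(confs, right))
```

Running the same two functions on the doctors' answers and on the model's answers for the same studies would give directly comparable numbers. Patient impact, the third measure the post names, is not something a snippet like this can capture.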
3 projects for radiologists interested in learning AI to get started:
The best way to learn something is by doing it. Rather than reading too much theory at the start, the best way is to start creating your healthcare app or training your model and then read the relevant concepts along the way, either by asking your favourite LLM or online / YouTube. Think about AI projects which will benefit your or your hospital's workflow. The hardest thing with AI is actually integrating it into daily work or building something people will find useful. Otherwise education without action is entertainment.
For all of the below, ask an LLM to write a detailed implementation spec first without code. Then read the implementation spec yourself and discuss modifications with the LLM. Once you have finalized it, save it as a markdown file in whichever coding agent you are using.
Tools to consider:
Frontend: Replit or Bolt
AI model training: Google Colab
Coding IDEs: Cursor or VS Code
Coding CLI (the one I currently use): Claude Code. I use Claude Code inside VS Code.
Features to include in each app, since you are building healthcare apps where privacy and security are crucial:
Encryption
Secure auth
Role based access control where applicable (admin, doctor)
Immutable audit trail
Input sanitisation
Anti CSRF
Session token lifecycle with auto logouts
1. Train your own chest x-ray foundation model using public datasets, and create a web app which generates a report when a user uploads an x-ray. This is the big one, and if you really do this thoroughly end to end the other projects will seem very easy and you will have learnt a significant portion of deep learning in radiology along the way. Open source models to consider: Llama, Gemma, Mistral, Qwen (hosting open source Chinese models on your own server does not send your data to China). Don't do this on MedGemma, Llava Med or other models already fine tuned for x-rays. Fine tune a general one from scratch.
2. A report transcription app which is either local or cloud based, depending on whether your workstation has a powerful enough GPU. Workflow: Upload your report templates > Dictate findings in chat box > ASR model transcribes it > Click generate report button > LLM selects correct report template based on your dictation > LLM extracts text from that template and integrates your dictation correctly in it > generates and displays final report to user which they can copy. Everything from clicking generate report to final report display should happen in less than 3 seconds. Few shot prompting works well with most open source LLMs for this. You could go through the 'How I built it' section of Wilhelm for tips on one possible approach.
3. A voice controlled EHR system with separate logins for admins and doctors. Transcription of clinical notes and patient conversations. Automated generation by LLM of differentials, likely diagnosis and management plan based on local guidelines.
1
3
18
1,879
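For project 2 above, the "LLM selects correct report template" step with few shot prompting could be sketched roughly as below. This is not how Wilhelm does it; it is one possible shape, and every name here (`build_template_prompt`, `parse_template_choice`, the template names, the example dictations) is hypothetical. The code only builds the prompt text and checks the reply, so it runs without any model. Sending the prompt to whichever LLM you host is left out on purpose.

```python
from typing import Optional, Sequence, Tuple


def build_template_prompt(
    template_names: Sequence[str],
    examples: Sequence[Tuple[str, str]],
    dictation: str,
) -> str:
    """Build a few shot prompt asking the LLM to name one report template.

    examples holds (dictation, template name) pairs that show the task.
    """
    lines = [
        "Pick the single best report template for the dictation.",
        "Answer with one template name from this list: " + ", ".join(template_names),
        "",
    ]
    for example_dictation, example_template in examples:
        lines.append(f"Dictation: {example_dictation}")
        lines.append(f"Template: {example_template}")
        lines.append("")
    # The new dictation goes last, with the answer slot left open for the LLM.
    lines.append(f"Dictation: {dictation}")
    lines.append("Template:")
    return "\n".join(lines)


def parse_template_choice(reply: str, template_names: Sequence[str]) -> Optional[str]:
    """Return the template the reply names, or None if it matches none exactly."""
    cleaned = reply.strip().lower()
    for name in template_names:
        if name.lower() == cleaned:
            return name
    return None


if __name__ == "__main__":
    names = ["MRI knee", "Chest x-ray", "Ultrasound shoulder"]
    shots = [
        ("ACL graft intact, small joint effusion", "MRI knee"),
        ("Lungs clear, normal heart size", "Chest x-ray"),
    ]
    print(build_template_prompt(names, shots, "Full thickness supraspinatus tear"))
```

Rejecting any reply that is not an exact template name keeps a wrong or rambling answer from silently picking the wrong template. The app can then ask the user to choose instead.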