Christine Rose

Christine Rose @AIChristineRose

Feb 21

The claim going viral right now: “LLMs get lost in real conversation.” It’s being used to suggest that AI systems fundamentally break down the moment you move beyond a single prompt.

So I went and read the paper. And of course, that’s not what it says.

The study (“LLMs Get Lost in Multi-Turn Conversation”) examines underspecified multi-turn task refinement, meaning: You give a task. Then you add constraints. Then you modify it again. Then you add edge cases. Over and over.

That’s not “normal conversation.” That’s iterative requirements gathering.

And here’s what the researchers actually found:

• Performance drops primarily because reliability collapses across turns (best-case vs worst-case results diverge).
• Models tend to anchor on early interpretations and don’t always fully re-evaluate when constraints change.
• Even reasoning-enhanced models show the same structural reliability issue.

This is a system design challenge. It’s about constraint tracking. State updating. Error accumulation across iterations. It is not evidence that LLMs “can’t handle real conversation.”

If anything, it reinforces something anyone who works in project management, product design, or business analysis already knows: When requirements change midstream, you restate the scope. AI systems need that. Humans need that. So why are we framing structured iteration as a catastrophic failure just because AI is involved?

The practical takeaway is: If you change the task, say what changed. If you add constraints, restate the objective. If you revise direction, signal the revision clearly. That’s structured iteration that’s pretty darn normal. What do people think an LLM is? Mind-reading technology?

I broke down the actual findings (with links to both the paper and the viral take) here: crispyrose.com/no-llms-dont-…

Read the research, then decide whether the headline matches the data.
#AI #LLMs #AIResearch #GenAI #PromptEngineering #AICommunication #CriticalThinking #TechDiscourse #MachineLearning #ArtificialIntelligence

No, LLMs Don’t “Get Lost” in Real Conversation – What the Research Actually Says - Crispy Rose

I came across a tweet (or whatever we call the messages on X-formerly-known-as-Twitter these days) on February 19th with over 8,500 views claiming that new research from Microsoft and Salesforce...

crispyrose.com

B3K'ayne "The GRock" Johnson MAGA= evil 🌈💙🇺🇦

@Brian_3000

17 Oct 2024

ELMO IS OUT OF CONTROL AND TRYING TO SILENCE THE LEFT. LITERALLY. Luckily Cut/Paste still works. #EvilMusk #TechInfluence #InnovationDebate #AIEthics #CorporateResponsibility #SocialMediaImpact #DisruptiveTechnology #PublicPerception #TechGiants #DigitalCulture #LeadershipChallenges #FutureOfTech #VentureCapital #Entrepreneurship #MarketTrends #TechAccountability #InfluencerImpact #SustainabilityInTech #ConsumerAwareness #DigitalEthics #TechCritique #BusinessEthics #InnovationVsEthics #TechDiscourse #SocietalImpact #TechLeadership #CrisisManagement #PublicTrust #TransparencyInBusiness #EthicalTech

@monrodriguez.bsky.social @monrodriguez

8 Oct 2019

Fundamental and very interesting for Media. Thanks! #shumtech #shumedia #shumevo #techmythologies #techdiscourse #techevolution

@monrodriguez.bsky.social @monrodriguez

14 Jan 2019

Don’t believe the hype: the media are unwittingly selling us an AI fantasy | John Naughton #divinefuture #shumtech #ai #mediarepresentations #techrepresentation #techdiscourse #media theguardian.com/commentisfre…

Don’t believe the hype: the media are unwittingly selling us an AI fantasy | John Naughton

Journalists need to stop parroting the industry line when it comes to artificial intelligence

theguardian.com