The ICLR finding everyone's quoting — 21% of peer reviews fully AI-generated, most of them long and low-substance — has an obvious takeaway and a real one.
Obvious: AI peer review is bad.
Real: context-starved AI peer review is bad.
An AI review is only as good as the context it's grounded in. Those ICLR reviews read like what you get when a PDF is pasted into a chatbot with nothing else attached. No wonder they produced 40 generic questions and caught nothing that mattered.
Here's what a chatbot in a browser tab actually works with:
— only the text you pasted, not the full manuscript, supplements, or figures
— no idea which journal you're targeting or what it requires
— no way to verify a single citation
— one pass, then done
Here's what changes when the AI has real context:
— the entire manuscript, supplementary files, and figures, not a fragment
— the specific target journal's reporting standards and author guidelines
— tens of millions of papers to cross-reference claims against, and every reference checked against real databases
— multiple passes: critique the paper, then critique the critique
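For the technically curious, here is a minimal sketch of the list above, not any real product's implementation. Every name in it (`ReviewContext`, `unverified_citations`, `build_prompt`, `two_pass_review`) is hypothetical, `model` stands in for whatever text-in, text-out LLM call you use, and the `known_references` set is a placeholder for a lookup against a real bibliographic database.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Set


@dataclass
class ReviewContext:
    """Everything the reviewer sees (all field names are hypothetical)."""
    manuscript: str
    supplements: List[str] = field(default_factory=list)
    journal_guidelines: str = ""
    # Placeholder for a lookup against a real bibliographic database.
    known_references: Set[str] = field(default_factory=set)


def unverified_citations(cited: List[str], ctx: ReviewContext) -> List[str]:
    """Return the cited identifiers that the reference set cannot confirm."""
    return [ref for ref in cited if ref not in ctx.known_references]


def build_prompt(ctx: ReviewContext, cited: List[str]) -> str:
    """Assemble one prompt from the full context instead of a pasted fragment."""
    flagged = unverified_citations(cited, ctx)
    parts = [
        "MANUSCRIPT:\n" + ctx.manuscript,
        "SUPPLEMENTS:\n" + "\n".join(ctx.supplements),
        "JOURNAL GUIDELINES:\n" + ctx.journal_guidelines,
        "UNVERIFIED CITATIONS:\n" + (", ".join(flagged) or "none"),
    ]
    return "\n\n".join(parts)


def two_pass_review(
    ctx: ReviewContext, cited: List[str], model: Callable[[str], str]
) -> str:
    """Pass 1 critiques the paper; pass 2 critiques that critique."""
    prompt = build_prompt(ctx, cited)
    draft = model("Review this submission.\n\n" + prompt)
    return model(
        "Critique and revise this review.\n\nREVIEW:\n" + draft + "\n\n" + prompt
    )
```

The design choice that matters is that both passes get the same full context, and the second pass also gets the first draft, so it can challenge the review rather than just restate it.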
A chatbot tells you the obvious things you already knew. A system with context tells you what you missed — the citation that doesn't actually support your claim, the test your target journal requires, the closely related paper you forgot to cite.
The problem at ICLR was never that AI reviewed. It was that the AI was reviewing blind.
Context is the whole game.