ICYMI: On April 30th, Bloomberg broke the story that 20 of the country's biggest news outlets (including CNN, NBC, USA Today, and Vox) sent a formal demand letter to a nonprofit web archive whose data has quietly been used to train nearly every major AI model on the planet.
The News/Media Alliance is asking Common Crawl to remove the publishers' scraped content, prohibit its use for AI training, and add enforceable warnings to its opt-out registry.
If you're not familiar with Common Crawl, it's the open repository of web content that Google, Meta, OpenAI, and Anthropic have all used to train their chatbots. Critics call it "data laundering." It’s the back door to gated content, and publishers couldn't close it.
Until now.
WHAT THIS MEANS FOR PR:
1. The cost of AI citations is about to go up.
Anything Common Crawl scraped was effectively free training data. As publishers wall off access, AI companies will need to pay licensing fees, which narrows the pool of citable sources to the outlets they license. Coverage in licensed pubs will still compound in value. Coverage everywhere else may not.
2. The "publish once, train forever" free ride is over.
Per Muck Rack's 2026 research, half of all AI citations come from content published in the last 11 months. Pair that with publishers pulling back their archives, and the half-life of an earned media hit in AI answers is shrinking fast.
3. Mid-market companies are uniquely exposed.
Enterprise brands have massive archives and decade-old PR programs feeding the citation loop. That leaves mid-market companies competing for the remaining citation share. Every placement has to work harder, and a steady cadence of credible coverage matters more than the occasional tier-1 home run.
BOTTOM LINE:
When Anthropic reached a $1.5B settlement with authors last fall, I predicted that media outlets would soon be split into two tiers: those with AI licensing deals, and those without. The Common Crawl letter signals that prediction is no longer theoretical.
The free-data buffet is closing. The companies building a habit of coverage in licensed publications today will be cited in AI answers tomorrow.
To my PR colleagues: how are you factoring these changes into your 2026 strategy?