Everyone got the economic implications of the Bitter Lesson backwards.
A little awkward for the labs: the theory they cite to explain their dominance is the same one that says their core product can't be a defensible prize.
The lesson gets quoted as if it were universal — enough compute, and general methods win, everywhere. The fine print is that this only works where "good" is cheap to verify.
Point compute at a crisp objective with a reasonable feedback loop and it eats the task, which is why math and code fell first. And "the lesson applies here" turns out to be the same as "this domain has no margin left in it" — compute isn't scarce enough to lock anyone out (every hyperscaler will have enough of it), and open source distills the frontier on a short lag regardless.
Capabilities converge, and the pressure shifts to better routing across models to minimize cost.
Winning a verifiable domain comes with a curse: by the time the race ends, the prize has been commoditized.
But here is where things get interesting, and why @satyanadella is playing one move ahead of many: the other half of the economy is less accommodating. Is this the right strategy, the right early-stage investment, the right hire, the campaign that actually captured a new trend? Here "good" is expensive, contested, and slow to learn, so there's no cheap objective to point compute at, and the Bitter Lesson isn't binding.
Hard-to-verify domains stay valuable for an almost embarrassing reason: nobody can tell, cheaply, what counts as a good answer. Compute will hand you a thousand of them and do a sloppy job of telling you which one you should ship.
Taste, judgment, curation based on experience: once you see that verification is the true bottleneck, that's the surface where the Bitter Lesson points to defensible businesses.
Which tells you where value will accrue. The models — mostly compute, data, and increasingly tooling and architecture — are the inputs that commoditize first, so putting your moat there is choosing, deliberately, the race with no margin.
The interesting move is one level down, in the sectors the lesson can't reach, and there the only durable asset is whatever defines "good" where good isn't easily verifiable: your own verification stack. The proprietary record of what actually counted as right — the "this was wrong, here's why" that has to come from outside the loop, or the model will Goodhart itself into confident, sycophantic nonsense.
That moat is two things, and the second is load-bearing: the ground truth your work generates, and the expertise that keeps improving it. The trace is a record — copyable, perishable, only as current as your last correction. The expert is the renewable source: the standing ability to look at the model's next thousand answers and say which one is right, in a domain where that expertise lives in nobody else's head.
Compute can't automate a signal it can't measure, and open source can't distill weights that are still inside your people. What stays scarce is the firm that can keep climbing up the intelligence value chain.
@satyanadella makes the operating version of the case below: your own hill-climbing machine, with the models learning inside it. It's the polite way of saying: bet where there is no cheap verifier yet (and ideally there never will be), and hire and retain the experts who are the top verifiers in their domain.
Which leads us back to the labs and that awkward spot. They can't defend the model — the theory they live by commoditizes it first, and competition is thorough that way — so they have to come for your eval and verification layer instead: a complement worth absorbing, the kind that arrives bundled into the platform as a thoughtful free feature, for your benefit, naturally.
"the model alone is no longer the product"— Greg Brockman
To succeed, they have to quietly fold in one input: the ground truth your own operation throws off every day.
Your moat was always the private traces — your failure cases, your taste, the benchmarks only your own work produces.
Outsource your traces and you've outsourced judgment, which was the job.