Too much groupthink on how to rapidly increase AI spend. Wrong focus. The only metric that matters is productivity.
Anyone can burn billions of tokens and blow the entire budget. Early agent adoption was the perfect example of wasted token usage with negligible outcomes (some even highly negative via funds lost or data deleted).
These past months we have focused on:
* Empowering people to accomplish 10x scope and productivity of what they could before
* Building productivity tools and exploring new AI applications
* Educating each other on model tradeoffs
A lot of my time at Meta on data infrastructure was invested in how to efficiently process data at scale. We were processing exabytes per day on hundreds of thousands of machines. Every meaningful efficiency gain was worth tens of millions and multiples more today with the explosion of compute & data.
AI is an exponential of data infra. Open-source models are a big part of cost / productivity optimization. For every AI app/tool I build, I can dynamically choose the model based on the need - e.g. haiku vs sonnet vs opus. Combining different frontier models with open-source options means an easy order-of-magnitude improvement in spend-to-outcome.
I believe that even more important than model quality will be the challenge of optimizing $/outcome over the next decade for every entity in the world. With AI spending in the trillions during the next few years, this is a multi-trillion-dollar problem to solve.
Your margin is my opportunity: AI versionโฆ
The biggest surprise of 2026 is that the capability gap between the best open-weight/source models and the best closed models has narrowed much faster than the pricing gap. The pricing gap remains enormous while the capability gap is quite narrow.
What does this means in practice?
For a company consuming 1 billion input tokens and 1 billion output tokens per month:
GPT-5.5 Pro: ~$105,000
Claude Opus 4.8: ~$30,000
DeepSeek V4 Pro: ~$5,220
DeepSeek R1: ~$2,740
I asked ChatGPT what it thought about this and it answered as follows:
โIf I were building a company today, the economic frontier would look roughly like:
DeepSeek V4 Pro / R1 for high-volume inference.
Claude Opus for premium agent workflows where reliability matters.
GPT-5.5 Pro only for workloads where its incremental capability demonstrably produces enough business value to justify a 20โ40ร token premium.โ
Most CEOs have no idea that, instead of this nuanced approach, their teams are running amok internally by picking the most expensive models in most cases and burning through massive budgets with zero governance, audit ability and control.
As control planes like our Software Factory become more standard, you can expect the run rate revenue growth of the frontier labs to go down meaningfully and the revenues of the open models to skyrocket.
Why? Because we can implement the nuanced approach above and be agnostic to model - instead focusing on customer intent, model task and cost management among other things.