From Free AI to AI Tokenomics?
- As OpenAI, Google, Microsoft, Anthropic and others move toward usage- and token-based pricing, exploding token usage is good for AI infrastructure companies, but it also means exploding AI costs for enterprise customers. AI is no longer software that feels almost free to use. It is becoming a compute utility where usage translates directly into cost.
- As the underlying cost of compute becomes more transparent and directly tied to output, the AI ROI debate will start getting answered in real time across millions of users and use cases. The conversation may shift from “AI improves productivity / employees work faster / coding speed goes up / customer support costs go down” to “how much did this agent’s task cost / what value did the task create / is this model effective, and is it replaceable?” That shift can create a new industry around AI cost optimization.
- This may also change how companies manage AI internally. The current policy of letting employees use AI freely could move toward setting token budgets by use case. Frontier models may be allowed only for specific tasks, while simpler or repetitive work gets routed to open-source or lower-cost models. Enterprise AI policy could move from “let people use AI” to “which model should be used for which workflow, under what cost limit?”
- Because companies will need to track AI usage and cost, areas like model observability, token usage monitoring, cost attribution, model routing, policy enforcement, AI governance, and ROI tracking by workflow could become important. At that point, AI is not only a productivity enhancer for employees. It also becomes a competitor for budget. This will likely be reinforced by the development of open-source models, local models, and low-cost Chinese models. AI usage will keep growing. But who pays for that cost? Who can pass it through to customers? Who can protect margins through optimization?
- None of this means the AI trade is over. The AI long trade can continue, supported by revenue growth at big tech companies, rising token usage from frontier models, and AI adoption in high-value areas like finance, technology, and customer support. But as AI costs rise, the emphasis may shift toward cost reduction and efficiency. Local and on-device inference, model compression, smart routing, observability, more efficient architectures, and low-cost specialized models built on open-source stacks could become increasingly important themes. AI usage will keep increasing, but not all of that usage necessarily has to flow into hyperscaler data centers.
- In that optimization process, there will also be AI losers. SaaS companies that cannot pass AI costs on to customers, companies that add AI features but see inference costs grow faster than revenue, labor-based services that can be replaced by AI, seat-based software models that get disrupted by usage-based AI models, and data/content providers that bear the cost but fail to capture the value could all fall into the AI loser bucket.
- Related areas worth looking at include memory cost optimization, DRAM efficiency, HBM bottlenecks, and alternative architectures. On-device AI, open-source/low-cost models, and AI cost monitoring/governance also seem worth tracking.
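
To make the budgeting and routing idea in the bullets above concrete, here is a minimal Python sketch of per-workflow token budgets, model routing, and cost attribution. Everything in it is an assumption made for illustration: the model names, the per-million-token prices, and the `WorkflowPolicy`, `call_cost`, `route`, and `record_call` names are hypothetical and do not reflect any vendor's actual pricing or API.

```python
from dataclasses import dataclass

# Hypothetical prices in USD per 1M tokens. Placeholders only, not real vendor pricing.
PRICE_PER_MILLION = {
    "frontier-model": {"input": 10.0, "output": 30.0},
    "low-cost-model": {"input": 0.5, "output": 1.5},
}


@dataclass
class WorkflowPolicy:
    """Hypothetical per-workflow policy: which tier is allowed, under what cost limit."""
    allow_frontier: bool
    monthly_budget_usd: float
    spent_usd: float = 0.0


def call_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost of one call in USD, given token counts and the assumed price table."""
    price = PRICE_PER_MILLION[model]
    return (input_tokens * price["input"] + output_tokens * price["output"]) / 1_000_000


def route(policy: WorkflowPolicy, needs_frontier: bool) -> str:
    """Use the frontier model only when the task needs it and the policy allows it."""
    if needs_frontier and policy.allow_frontier:
        return "frontier-model"
    return "low-cost-model"


def record_call(policy: WorkflowPolicy, model: str,
                input_tokens: int, output_tokens: int) -> float:
    """Attribute the call's cost to the workflow, refusing calls that exceed its budget."""
    cost = call_cost(model, input_tokens, output_tokens)
    if policy.spent_usd + cost > policy.monthly_budget_usd:
        raise RuntimeError("workflow budget exceeded")
    policy.spent_usd += cost
    return cost
```

In this sketch a support-ticket workflow could be given `WorkflowPolicy(allow_frontier=False, monthly_budget_usd=500.0)`, so every call is routed to the low-cost tier and its running spend answers the "how much did this task cost" question directly. A real system would also need to measure the value side of the ROI comparison, which is outside what this example covers.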
Our team’s work really shines when covering a wide range of topics and cutting straight to what is actionable in each.
Our recent State of the Themes is no exception. If you haven’t read it, I highly recommend doing so.
citriniresearch.com/p/state-…