It's insane.
24x token volume against 60-70% annual cost-per-token compression makes hyperscaler unit economics hinge on whether agent loops produce enterprise outcomes that justify procurement.
The 120 quadrillion projection assumes the loops actually ship deliverables. If the NBER survey holds and 80% of firms see no productivity gain at current usage, the demand curve flattens long before chip efficiency does.
Goldman Sachs: "Token use by AI agents is expected to multiply 24 times by 2030."
AI agents are now creating the first serious cost test for the AI boom. As was reported this week, Uber and Microsoft are already rethinking expensive agent usage.
A chatbot may answer once, but an agent plans, calls tools, checks results, edits mistakes, and repeats the loop.
That loop can make one user request consume 10x, 50x, or even far more tokens than a normal answer.
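A rough way to see how the loop multiplies tokens is to assume each step re-reads the whole running context and then appends its own output. The sketch below is illustrative only: the function names and the 500-token figures are hypothetical, not numbers from Goldman or any vendor, and real agents may cache or trim context.

```python
def chat_tokens(prompt: int, answer: int) -> int:
    """Tokens for a single-shot chatbot reply: read the prompt, write the answer."""
    return prompt + answer


def agent_tokens(prompt: int, step_output: int, steps: int) -> int:
    """Tokens for an agent loop, ASSUMING every step re-reads the full context.

    Each step processes the accumulated context, emits `step_output` tokens
    (a plan, tool call, or fix), and that output joins the context for the next step.
    """
    total = 0
    context = prompt
    for _ in range(steps):
        total += context + step_output  # read everything so far, write this step
        context += step_output          # this step's output is carried forward
    return total


single = chat_tokens(prompt=500, answer=500)                 # hypothetical sizes
looped = agent_tokens(prompt=500, step_output=500, steps=10)
print(f"agent uses {looped / single:.1f}x the tokens of a single answer")
```

With these made-up sizes, ten steps already cost 32.5x a single answer, and the multiple grows faster than the step count because the context keeps getting longer.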
Goldman's bullish case is that monthly token use could reach 120 quadrillion by 2030, while inference cost per token keeps falling 60%-70% per year.
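To see what those two numbers imply together, here is a back-of-envelope sketch. It assumes the 24x growth and the annual price decline compound over the same window. The 5-year horizon is an assumption, since the quote only says "by 2030", and the function name is hypothetical.

```python
def net_spend_multiple(volume_multiple: float, annual_cost_decline: float, years: int) -> float:
    """Total inference spend relative to the starting year.

    Spend = tokens x price per token, so the multiple is the volume growth
    times the compounded price decline over the same number of years.
    """
    price_multiple = (1.0 - annual_cost_decline) ** years
    return volume_multiple * price_multiple


YEARS = 5  # ASSUMPTION: the source does not state the base year
for decline in (0.60, 0.70):
    multiple = net_spend_multiple(24, decline, YEARS)
    print(f"{decline:.0%} annual decline: spend is {multiple:.2f}x the starting level")
```

Under those assumptions the spend multiple comes out below 1 in both cases. A shorter window or a slower price decline changes the picture quickly.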
The fight is now between agent productivity and token waste.
Earlier this month, Microsoft began revoking its developers' access to Claude Code, with plans to move them to its in-house Copilot Command Line Interface tool by June 30. The company has framed this as consolidating teams around its own tools, but the timing at the fiscal year's end hints it may also be about lowering costs.