Same agentic workload pattern. Two completely different cost profiles.
Top: undisciplined context, fresh tokens every turn, $1.3M/month. Bottom: governed context via MCP, prompt cache hit rate near 100%, fraction of the cost.
Frontier model providers just told you what they think agents cost. The Anthropic Agent SDK credit cap. The Codex subsidy that survives only because Peter works at OpenAI now. Both are signals.
The pattern that survives when the subsidies end isn't bigger context windows or faster models. It's context engineering. Code-maps over MCP. Semantic retrieval. Cache-aware prompt structure.
Build for cache hits, not token volume. The economics will follow.
For comparison, my own monitor over the same window shows 12.26B cache reads against 692K fresh input tokens. ~17,700:1 cache-to-fresh ratio. Cost profile is two orders of magnitude lower for comparable agentic workload.