Thinking in Public | We cooked!
From recent doom scrolling:
Token costs spiralling out of control: one company dropped $500m in a quarter, Salesforce dropped $300m last year.
Dependency on agent workflows will put companies in a doom spiral of ever-rising token costs and vendor lock-in.
Enterprise AI pilots failing left and right.
Companies will have to pay for token bills, plus FDEs, plus seat licenses to use their enterprise software.
McKinsey shouts No ROI, costs escalating.
Then I heard some public market investor say they were bullish on CPUs.
Then I saw the Intel CEO, possibly just talking their book, say that
infrastructure ratios could move to "4 CPU to 1 GPU" for agentic workloads.
Wait, wut.
I thought we were GPU maxxing.
Wut meanz if we be CPU maxxing.
If it meanz open source models running on local devices, plugged into enterprise software via MCP, then everything is going to be just fine.
(at least from the perspective of being in the agentic workflow business, we’re all toast otherwise)
When agents are doing real work (not generating haiku, but orchestrating workflows across CRMs and productivity tools), they move from GPU land to CPU land.
Instead of just predicting tokens, they're calling tools, retrieving context, and coordinating actions across systems.
Ain't no need for frontier models to do that stuff.
I’ve always encouraged my team to burn tokens and save time. But for repetitive tasks,
do you really need the world's most powerful model to update a CRM record, schedule an interview, approve a PTO request, or route a support ticket?
Agent workflows can consume 1000x the tokens of simpler AI tasks, but if they are performing non-complex tasks, they could just use a small open source model running on a laptop.
Layer in smarter routing of tasks to value-engineer token consumption, more efficient models, more powerful CPUs, and it’s all going to be OK.