The inability to save on costs may be real for teams where every individual is doing AI-assisted dev work from scratch every time they spin up a coding agent.
I think the opportunity here is mostly for teams that have matured towards hardened, auto-triggered cloud agent workflows. The complexity of those is more predictable, and the paths (memorialized as Skills) are better-trodden.
It seems many @cognition customers are near this point: @ido_pesok wrote up recently how such async sessions are eclipsing interactive ones (x.com/ido_pesok/status/20604…). I bet @FactoryAI customers show a similar profile. But most companies are nowhere close to this yet.
The Factory team seems very smart, and I'm eager to see how well their new cost-optimizing model router works. If they can pull it off, that is a huge accomplishment.
For Amp: we may try to build a model router that picks the best model for a task, but we don't intend to build a cost-optimizing model router based on the current state of the models. Here's why.
Every time we've looked into using cheaper models in Amp, we've benchmarked on tasks that reflect how people use agents for coding today. On these real tasks, the expensive frontier model was not only the best (obviously), but also usually the fastest and cheapest, when measuring end-to-end task completion.
Why? Cheaper-per-token models are less capable, which means that on complex real-world tasks they spend more tokens and time fixing mistakes along the way.
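Here's a back-of-the-envelope sketch of that arithmetic. Every price and token count below is made up for illustration (these are not our benchmark numbers), and I'm collapsing input/output pricing into a single blended rate to keep it simple:

```python
from dataclasses import dataclass


@dataclass
class Run:
    """One end-to-end attempt at a task. All values here are hypothetical."""
    price_per_mtok: float  # blended USD per million tokens (assumed, not a real price)
    tokens_used: int       # total tokens burned until the task is actually done

    def cost(self) -> float:
        # End-to-end cost is price times tokens, not price alone.
        return self.price_per_mtok * self.tokens_used / 1_000_000


def break_even_token_ratio(expensive_price: float, cheap_price: float) -> float:
    """How many times more tokens the cheap model can use before it costs more."""
    return expensive_price / cheap_price


# Illustrative scenario: the cheap model is 5x cheaper per token,
# but needs 6x the tokens because it keeps fixing its own mistakes.
frontier = Run(price_per_mtok=15.0, tokens_used=200_000)
cheap = Run(price_per_mtok=3.0, tokens_used=1_200_000)

token_ratio = cheap.tokens_used / frontier.tokens_used
cheap_is_cheaper = cheap.cost() < frontier.cost()

print(f"frontier: ${frontier.cost():.2f}, cheap: ${cheap.cost():.2f}")
print(f"token ratio {token_ratio:.1f}x vs break-even "
      f"{break_even_token_ratio(15.0, 3.0):.1f}x")
```

The point of the sketch: a cheaper model only wins on cost if its token overhead stays under the price ratio. In this made-up scenario it doesn't, so the "cheap" run ends up costing more (and the extra tokens usually mean extra wall-clock time too).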
You can find plenty of cases where cheaper models are indeed faster and cheaper end-to-end. But such cases were rarer than we expected, and the differences were fairly small.
If you can easily detect such cases, then there is an opportunity here. But even then, on the AI hedonic treadmill, once people get a taste of frontier intelligence, they don't want to go back to using those more primitive prompts where cheaper models suffice. (Which is a good part of human behavior! It's how we decided to stop living in caves!)
If your tasks can be handled just as well by non-frontier models, I would strongly advise you to uplevel how you use agents and what you produce to stay competitive against people who are using frontier models.
In a power-law world, with rapid intelligence advances, try to get to the frontier and stay there.