As the industry figures out these new tools we're going to see the emergence of a key new skillset:
Architecting your codebase for agents.
There's an enormous gap between what coding agents are capable of in theory and what they can achieve in practice.
Most of this gap is down to the environment that the agent is operating in. How quickly it can get feedback on its work. How it knows whether it's on the right track. Which guardrails are in place. Whether it can run code in isolation.
Unless you've intentionally designed your codebase around these constraints, you're leaving some of the biggest productivity gains on the table. Spoiler alert: almost no one has.
In an agent-first codebase the guardrails keep the agent on track. If it makes mistakes, your tooling gives it feedback so that it can correct course. You can spin up many agents working in parallel and view their work in an isolated environment before merging it. Your agents write thorough tests because the agent that writes the code and the agent that writes the tests are kept separate. When agents fail on a task, you put measures in place to prevent them from making the same mistake twice.
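To make the feedback part concrete, here is a minimal sketch in Python of a loop that runs your guardrails and turns the failures into a message an agent can act on. It is an illustration under assumptions, not a description of any particular tool: the names `run_check`, `collect_feedback` and `format_feedback` are hypothetical, and the check commands are placeholders for whatever linters, type checkers and test runners your project already has.

```python
import subprocess
import sys
from dataclasses import dataclass


@dataclass
class CheckResult:
    name: str
    passed: bool
    output: str


def run_check(name, command):
    """Run one guardrail command and capture its exit status and output."""
    proc = subprocess.run(command, capture_output=True, text=True)
    output = (proc.stdout + proc.stderr).strip()
    return CheckResult(name, proc.returncode == 0, output)


def collect_feedback(checks):
    """Run every check and return only the failures."""
    results = [run_check(name, command) for name, command in checks.items()]
    return [result for result in results if not result.passed]


def format_feedback(failures):
    """Turn failures into a plain-text message to hand back to the agent."""
    if not failures:
        return "All checks passed."
    lines = [f"{len(failures)} check(s) failed:"]
    for failure in failures:
        lines.append(f"- {failure.name}: {failure.output or 'no output'}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Placeholder commands (assumptions): swap in your real lint and test commands.
    checks = {
        "lint": [sys.executable, "-c", "print('lint ok')"],
        "tests": [sys.executable, "-c", "import sys; print('1 test failed'); sys.exit(1)"],
    }
    print(format_feedback(collect_feedback(checks)))
```

The point of the sketch is the shape, not the details: every check is a command with a pass/fail exit code, and the agent only ever sees a short, specific list of what went wrong. The faster that loop runs, the more often the agent can correct course without a human in the middle.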
Do all that and you end up with a system that lets teams ship much higher quality software, faster and cheaper than ever before.
In a legacy codebase it's a different story. Either you have to carefully review every line of code, or your codebase is going to deteriorate incredibly quickly. If you don't catch the agents' mistakes in time, that code becomes part of their context, and they will reliably copy the same mistakes again and again until you've created a monster: an unreviewable mess that no human can recover and that the agents can't help you fix either. You're stuck.
"I'll just review the code" you say, and sure, but agents produce more code than a 10x engineer, faster than a human can realistically understand it. If humans have to review every line then they become the bottleneck. You become a slave to the agent.
You're reading code, telling the agents where they messed up, and waiting for them to fix it. That is a miserable existence. It's slow, frustrating, and it's hard to feel like the agents are actually empowering you.
And in this case "legacy" doesn't just mean old software. Most new codebases fall into this category too. If you did not intentionally design for AI-powered development, it is simply impossible to unlock that potential. Your project will go fast at first, and then quality and velocity will fall off a cliff, sometimes gradually, sometimes immediately.
"Better models drop every few months, doesn't that change the equation?" - No. This problem is fundamental to LLMs and software engineering in general. Either your codebase is set up for fast iteration and feedback or it isn't. Better models will not save you.
It is really difficult to take an existing codebase and make it agent-ready. It's not impossible, but it requires a lot of work and coordination. It needs a systematic approach.
At codemix we've figured out what works. If you'd like to learn more, my DMs are open.