Spent a while iterating on a version of this process in my Pi harness, and I gotta say he was right.
My old setup was heavily human in the loop, because nothing I did nailed the balance of guardrails & autonomy to prevent a multiple hour run from turning into slop.
Now I spend an hour hashing out requirements in extreme detail, let the agent loop for 8-12 hours, and out pops beautiful, clean new features. Slop minimal.
This feels like a turning point, for me at least. I may be late to the party, but the party isn't over.
nobody wants to hear this but the classical NASA systems engineering is the perfect model for developing code with LLMs. people try to approximate this with planning modes, but if you’re explicit in your docs it’s never been easier to build, test, and verify complex codebases.