This isn’t practical advice. I can see how an approach like this can help you build up a prototype or MVP quickly, but you can’t iterate on user feedback and build a business like this.
You’re shipping too slowly if you’re working to this cadence. The amount of code you can produce in, say, an 8-12 hour “night shift” is too much to review, and it would be reviewed far too infrequently.
The only way it could work would be if it were shipping to live users automatically and could interpret business goals and user feedback to drive requirements independently.
We struggle for bandwidth with 2 devs each working 3-4 parallel streams, with several review points through the day.
Tons of folks are piling in here saying that AFK agents are a myth.
I have been using them to ship these GitHub repos:
mattpocock/evalite
mattpocock/sandcastle
mattpocock/software-factory (might be public by the time you see this)
Here are a few steps to making this work, and some reality checks.
Definitions
Let's split this into the day shift and the night shift. Day shift is planning/review/QA, night shift is AFK implementation.
Day Shift (part 1)
1. Use /grill-me to align with the AI
2. Use /to-prd and /to-issues to create a PRD (the destination) and implementation steps as separate tickets, which can be grabbed in parallel (the journey)
3. The PRD is a ticket, but it's not an actionable step. You just put the user stories there
This is pure requirements gathering shit, same as it ever was.
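To make the PRD/issue split concrete, here's a rough sketch of how the tickets could be modelled. The type names, field names and example tickets are all made up for illustration; this is not the output format of /to-prd or /to-issues.

```typescript
// Hypothetical ticket shapes; every field name here is an assumption.
type Prd = {
  kind: "prd";
  id: string;
  userStories: string[]; // the destination
};

type Issue = {
  kind: "issue";
  id: string;
  prdId: string; // the PRD this step belongs to
  blockedBy: string[]; // ids of issues that have to land first
};

type Ticket = Prd | Issue;

// The PRD is a ticket, but only issues are actionable steps.
function actionable(tickets: Ticket[]): Issue[] {
  return tickets.filter((t): t is Issue => t.kind === "issue");
}

// Made-up example data.
const tickets: Ticket[] = [
  { kind: "prd", id: "prd-1", userStories: ["As a user, I can export a report"] },
  { kind: "issue", id: "issue-1", prdId: "prd-1", blockedBy: [] },
  { kind: "issue", id: "issue-2", prdId: "prd-1", blockedBy: ["issue-1"] },
  { kind: "issue", id: "issue-3", prdId: "prd-1", blockedBy: [] },
];

const steps = actionable(tickets);

// Issues with no blockers can be grabbed in parallel straight away.
const startable = steps.filter((i) => i.blockedBy.length === 0);
```

Here issue-1 and issue-3 can start immediately, while issue-2 waits on issue-1. The PRD never shows up as work to do.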
Night Shift
1. I run a planner agent which looks at all the tickets and sees what can be worked on now, and what's blocked
2. The planner agent then kicks off multiple agents (sandboxed using Sandcastle, my OSS tool) to implement the code
3. I then have an automated reviewer agent look at the commits produced - one agent per implementation. This checks alignment to the original PRD, as well as code quality
4. These commits end up on branches that get PR'd to main
5. The planner agent runs again until all work has been completed
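The loop above can be sketched roughly like this. It's a minimal, synchronous sketch under stated assumptions: implement and review are hypothetical stand-ins for the sandboxed coding agent and the reviewer agent (this is not Sandcastle's API), a real run would execute each wave in parallel, and an issue counts as done as soon as its branch exists.

```typescript
type Issue = { id: string; blockedBy: string[] };
type Result = { issueId: string; branch: string; approved: boolean };

// Planner step: issues that aren't done and whose blockers are all done.
function unblocked(issues: Issue[], done: Set<string>): Issue[] {
  return issues.filter(
    (i) => !done.has(i.id) && i.blockedBy.every((b) => done.has(b))
  );
}

// Hypothetical stand-in for a sandboxed implementation agent.
// Returns the name of the branch it committed to.
function implement(issue: Issue): string {
  return `agent/${issue.id}`;
}

// Hypothetical stand-in for the reviewer agent (PRD alignment + code quality).
function review(branch: string): boolean {
  return branch.startsWith("agent/");
}

// Run the planner until nothing is left to pick up.
function nightShift(issues: Issue[]): Result[][] {
  const done = new Set<string>();
  const waves: Result[][] = [];
  while (true) {
    const ready = unblocked(issues, done);
    if (ready.length === 0) break; // all done, or everything left is blocked
    const wave = ready.map((issue) => {
      const branch = implement(issue); // one agent per issue
      return { issueId: issue.id, branch, approved: review(branch) }; // one reviewer per implementation
    });
    // Simplification: mark as done even if the review failed; QA catches it later.
    wave.forEach((r) => done.add(r.issueId));
    waves.push(wave);
  }
  return waves;
}

// Made-up example: b waits on a; d waits on b and c.
const waves = nightShift([
  { id: "a", blockedBy: [] },
  { id: "b", blockedBy: ["a"] },
  { id: "c", blockedBy: [] },
  { id: "d", blockedBy: ["b", "c"] },
]);
// Wave 1: a, c. Wave 2: b. Wave 3: d.
```

Each result carries a branch name, which is what would get PR'd to main. The loop stops when the planner finds nothing unblocked, so a ticket with a blocker that never lands just gets left for the day shift.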
The review is a crucial step - it's saved me MANY times. I am planning to massively increase the amount of review I do, hopefully with multiple agents.
But guess what - AFK agents sometimes produce bad code. This can happen because of:
a. The original plan was bad because the best solution was something different
b. The original plan was bad because it didn't take into account all the unknown unknowns, and the AI had to make some decisions during the coding session which were bad
c. The plan was good, but the AI just shat the bed (twice, once in the review stage, once during implementation)
d. Your codebase is bad and the feedback loops don't tell the agent if it did a good job or not
So... QA:
Day Shift (part 2)
1. QA all of the branches created
2. Create follow-up issues, potentially editing the original PRD to adjust the destination
This will usually take a long time, often as long as planning. But then you kick off the night shift again.
Once QA is all done, you review the important bits of code manually, usually in PRs. There isn't anything better than the PR UI right now, so that's what we're stuck with.
Wake-up Calls
1. If you let the AI run all night unbounded by planning, it's going to produce shit code
2. Mostly, my loops finish before I go to bed. It's just the night shift catching up to the day shift
3. The only reason I do AFK at all is because it allows me to automate review and totally not give a shit about latency
4. I always run night and day shift in parallel. I can't plan that far ahead (skill issue, probably). I need working code to base my plans on, so I'm aggressively QA-ing stuff that lands