A fascinating aspect of Open Agents is how it uses the
@WorkflowDevKit SDK to elegantly orchestrate agent sessions efficiently.
But before we go deeper, let's understand the problem:
Open Agents is a coding agent that runs in the cloud. Naturally, the expectation here is that I should be able to kick off multiple sessions and go to sleep and then wake up to a stack of PRs.
This inherently means, the sessions should be
- long running
- stateful
- resumable
- reliable
A naive way to solve this:
- Write an API to start a session, create a session ID and store it in a database - along with other metadata.
- Trigger the session inside a sandbox (Vercel sandboxes can run up to 5hours) and stream the responses back to the client using a websocket
- Store the responses to the db so when a client reconnects to a session using a session ID, we can resume it by fetching the results and restarting the session
Why wouldn't you just do this?
3 months ago I started building a coding agent that runs in the cloud.
It's since written every line of code I've shipped, including itself.
Today, I'm open sourcing it. Introducing Open Agents.