Supervised agent execution for software teams. Kanban-driven, tmux-native, test-gated. Open source, built in Rust.

Joined February 2026
Your AI agents need a boss. Here's why — and how Batty fixes it:
Crate shoutout: portable-pty + vt100. Together, these two crates let you spawn a process in a pseudo-terminal and parse its output stream into a structured screen buffer. No fragile regex against raw terminal output. We use them to classify agent state (idle, working, or dead) by reading the structure of the terminal screen, not by matching specific strings. Works across Claude Code, Codex, Aider, or any CLI. The kind of infrastructure that's invisible when it works and impossible to replace.
I let 5 AI agents loose on my repo without supervision. 3 hours later:
- 47 merge conflicts
- 2 failing test suites
- One agent rewrote the auth module "for clarity" (nobody asked)
- Another hit its context limit and silently restarted the same task from scratch
The code compiled. Nothing worked.
The pattern: agents are good at writing code. They are terrible at knowing when to stop, what not to touch, and whether the result actually works. Supervision is not about slowing agents down. It is about catching the 20% of output that looks right but is not. If you are running multiple agents on the same codebase, you need this layer.
How we prevent AI agents from stepping on each other's files: Each agent gets a persistent git worktree — its own directory, its own branch. Agent A edits src/auth.rs in worktree-a/. Agent B edits src/search.rs in worktree-b/. They can't see each other's changes. Merges are serialized with a file lock. Test suite runs in the target branch after merge. If tests fail, the merge is rejected and the agent retries. At 3-5 agents, conflicts are rare because good task decomposition separates concerns by default.
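One way the "merges are serialized with a file lock" step could look, as a minimal sketch. The tweet does not say which locking mechanism is used, so this swaps in a plain lock file created with `create_new` (an exclusive create). The `MergeLock` type and the lock path are hypothetical names for illustration, and the sketch does not handle a stale lock left behind by a crashed process.

```rust
use std::fs::{self, OpenOptions};
use std::io::{self, Write};
use std::path::{Path, PathBuf};

/// Hypothetical guard held while one agent's branch is being merged.
/// The lock file is removed when the guard is dropped.
pub struct MergeLock {
    path: PathBuf,
}

impl MergeLock {
    /// Try to take the lock without blocking.
    /// `create_new` fails with `AlreadyExists` if another merge holds it,
    /// which is reported here as `Ok(None)`.
    pub fn try_acquire(path: &Path, owner: &str) -> io::Result<Option<MergeLock>> {
        match OpenOptions::new().write(true).create_new(true).open(path) {
            Ok(mut file) => {
                // Record who holds the lock so it can be inspected with `cat`.
                writeln!(file, "{}", owner)?;
                Ok(Some(MergeLock { path: path.to_path_buf() }))
            }
            Err(e) if e.kind() == io::ErrorKind::AlreadyExists => Ok(None),
            Err(e) => Err(e),
        }
    }
}

impl Drop for MergeLock {
    fn drop(&mut self) {
        // Best effort: releasing the lock must not panic.
        let _ = fs::remove_file(&self.path);
    }
}
```

A caller would hold the guard for the whole merge-then-test sequence, so a second agent that gets `None` simply retries on a later pass.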
The hardest detection problem in AI agent supervision: context rotation. Your agent hits its context limit. The session resets silently. The agent restarts — but it's lost track of what it already did. It either redoes completed work or makes conflicting changes to files it edited 10 minutes ago. No error. No signal. Just silent divergence. We detect it by watching session file timestamps and terminal output hash changes. Still tuning the heuristics.
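A rough sketch of the two raw signals mentioned above: the session file's modification time and a hash of the terminal screen. All names here (`Snapshot`, `Delta`, `diff`) are hypothetical, and the sketch deliberately stops at reporting which signal changed. How those signals are combined into a "context rotated" verdict is the heuristic the tweet says is still being tuned, so it is not guessed at here.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::time::SystemTime;

/// What the supervisor remembers about an agent between two polls.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Snapshot {
    pub session_modified: SystemTime,
    pub screen_hash: u64,
}

/// Which of the two signals moved since the previous poll.
#[derive(Debug, PartialEq, Eq)]
pub struct Delta {
    pub session_file_touched: bool,
    pub screen_changed: bool,
}

/// Hash the visible screen text so snapshots stay small and cheap to compare.
pub fn hash_screen(screen_text: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    screen_text.hash(&mut hasher);
    hasher.finish()
}

/// Build a snapshot from a session-file timestamp and the current screen text.
pub fn snapshot(session_modified: SystemTime, screen_text: &str) -> Snapshot {
    Snapshot {
        session_modified,
        screen_hash: hash_screen(screen_text),
    }
}

/// Compare two snapshots and report the raw signals only.
pub fn diff(prev: &Snapshot, curr: &Snapshot) -> Delta {
    Delta {
        session_file_touched: curr.session_modified > prev.session_modified,
        screen_changed: curr.screen_hash != prev.screen_hash,
    }
}
```

In a real supervisor the timestamp would presumably come from `std::fs::metadata(path)?.modified()?` and the screen text from the terminal parser; both are passed in as plain values here so the comparison logic stays easy to test.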
My agent supervisor polls every 5 seconds. No event system. No async runtime. Just a loop with sleep(5). Sounds primitive? It handles 5 agents without a single race condition. Event-driven architectures can create thundering-herd problems when multiple agents finish at the same moment. A poll loop processes them sequentially, predictably, in ~200 lines of code. The boring solution is often the correct one.
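The shape of such a loop, as a hedged sketch rather than Batty's actual code. `AgentState`, `poll_once`, and `run` are hypothetical names; the `probe` and `handle` closures stand in for whatever the real supervisor does to inspect and react to an agent. The `max_ticks` parameter exists only so the sketch terminates; a real supervisor would loop until shut down.

```rust
use std::thread;
use std::time::Duration;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AgentState {
    Idle,
    Working,
    Dead,
}

/// One pass over all agents, in a fixed order, on a single thread.
pub fn poll_once<F>(agents: &[&str], mut probe: F) -> Vec<(String, AgentState)>
where
    F: FnMut(&str) -> AgentState,
{
    agents
        .iter()
        .map(|&name| (name.to_string(), probe(name)))
        .collect()
}

/// Poll, handle each result sequentially, sleep, repeat.
/// Two agents finishing in the same interval are handled one after the
/// other inside the same pass, so their handlers never run concurrently.
pub fn run<F, H>(
    agents: &[&str],
    interval: Duration,
    max_ticks: usize,
    mut probe: F,
    mut handle: H,
) where
    F: FnMut(&str) -> AgentState,
    H: FnMut(&str, AgentState),
{
    for tick in 0..max_ticks {
        for (name, state) in poll_once(agents, &mut probe) {
            handle(name.as_str(), state);
        }
        // Skip the final sleep so the loop returns promptly.
        if tick + 1 < max_ticks {
            thread::sleep(interval);
        }
    }
}
```

With `Duration::from_secs(5)` as the interval this matches the sleep(5) cadence described above. The trade-off is latency: a state change can go unnoticed for up to one interval.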
The first thing I restrict when running multiple AI agents: who can talk to whom. Without communication constraints, 5 agents create 20 directed message channels (5 × 4). Each message costs tokens. Agents start coordinating with each other instead of working. Token costs go quadratic. The fix: explicit talks_to rules. Engineers talk to the manager. The manager talks to the architect. Nobody else. O(n) channels instead of O(n²). Cheaper, faster, and agents actually stay focused on their own tasks.
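The arithmetic and the rule check can be sketched in a few lines. This is an illustration under assumptions: `Topology`, `allow`, and `can_send` are hypothetical names, rules are treated as one-directional, and the real talks_to configuration format is not shown in the source.

```rust
use std::collections::HashMap;

/// Directed channels in an unrestricted mesh: every agent can message
/// every other agent, so n * (n - 1).
pub fn full_mesh_channels(n: usize) -> usize {
    n * n.saturating_sub(1)
}

/// Hypothetical allow-list: sender name -> names it may message.
#[derive(Default)]
pub struct Topology {
    talks_to: HashMap<String, Vec<String>>,
}

impl Topology {
    /// Permit `from` to send messages to `to`.
    pub fn allow(&mut self, from: &str, to: &str) {
        self.talks_to
            .entry(from.to_string())
            .or_default()
            .push(to.to_string());
    }

    /// Anything not explicitly allowed is rejected.
    pub fn can_send(&self, from: &str, to: &str) -> bool {
        self.talks_to
            .get(from)
            .map_or(false, |peers| peers.iter().any(|p| p.as_str() == to))
    }

    /// Number of directed channels the rules permit.
    pub fn channel_count(&self) -> usize {
        self.talks_to.values().map(Vec::len).sum()
    }
}
```

For a team of three engineers, one manager, and one architect, the mesh has 20 directed channels while the hierarchy above permits 4. Each added engineer adds one channel instead of two per existing agent.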
Week 5 building an open-source AI agent supervisor: 16 stars. 24 Dev.to articles indexed. 42 backlinks. 27K X impressions last week (up from 200). Zero budget. No ads. No influencer deals. Just replies, articles, and showing up every day in the threads where developers talk about the problems we solve. Consistency compounds. Nothing else does.
Counterintuitive finding from running multiple AI coding agents: giving each agent LESS context produces better output. An agent with 200K tokens of project history makes worse decisions than one with 30K tokens of focused, task-relevant files. Context isn't free — it's noise the model has to filter through on every generation. Scoped tasks with strict .claudeignore files beat 'load everything and let the model figure it out' every time.
I tried Unix sockets, named pipes, and HTTP for agent-to-agent messaging. All three broke when an agent restarted. The fix was embarrassingly old-school: Maildir. Messages are files. Delivery is an atomic rename from tmp/ to new/. Survives agent crashes, context resets, and session restarts by design. Debug routing by ls-ing a directory and cat-ing a file. No message broker. No serialization bugs. Just the filesystem.
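A minimal sketch of Maildir-style delivery using only the standard library. The function names and the message naming scheme are assumptions for illustration. A full Maildir implementation also generates unique file names itself and syncs the file to disk before the rename; this sketch leaves the unique name to the caller and skips the sync.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Create the three standard Maildir subdirectories.
pub fn init_inbox(inbox: &Path) -> io::Result<()> {
    for sub in ["tmp", "new", "cur"] {
        fs::create_dir_all(inbox.join(sub))?;
    }
    Ok(())
}

/// Write the message under tmp/, then rename it into new/.
/// On POSIX systems a rename within one filesystem is atomic, so a
/// reader never observes a half-written file in new/.
pub fn deliver(inbox: &Path, unique_name: &str, body: &str) -> io::Result<PathBuf> {
    let staged = inbox.join("tmp").join(unique_name);
    fs::write(&staged, body)?;
    let delivered = inbox.join("new").join(unique_name);
    fs::rename(&staged, &delivered)?;
    Ok(delivered)
}

/// List undelivered messages, sorted by file name.
pub fn list_new(inbox: &Path) -> io::Result<Vec<String>> {
    let mut names = Vec::new();
    for entry in fs::read_dir(inbox.join("new"))? {
        names.push(entry?.file_name().to_string_lossy().into_owned());
    }
    names.sort();
    Ok(names)
}
```

Because the queue is just files, a restarted agent only has to read its new/ directory again to pick up where it left off. No connection has to be re-established.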
Your AI coding agent says it's done. The code compiles. It looks reasonable. But does it pass the tests? 52% of developers merge AI-generated code without running tests. That's not a productivity gain — it's automated technical debt. The simplest supervision layer: nothing merges until your test suite exits 0. One constraint. Catches 80% of 'the agent said it's done but it's broken' problems.
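The "exits 0" constraint reduces to checking a process exit status. A small sketch, with `Verdict` and `gate` as hypothetical names; the test command is passed in because the source does not say which runner is used.

```rust
use std::io;
use std::process::Command;

#[derive(Debug, PartialEq, Eq)]
pub enum Verdict {
    Merge,
    /// `exit_code` is `None` if the process was killed by a signal.
    Reject { exit_code: Option<i32> },
}

/// Run the test command and allow the merge only on exit status 0.
/// An `Err` means the command could not be started at all, which a
/// caller should also treat as "do not merge".
pub fn gate(program: &str, args: &[&str]) -> io::Result<Verdict> {
    let status = Command::new(program).args(args).status()?;
    if status.success() {
        Ok(Verdict::Merge)
    } else {
        Ok(Verdict::Reject {
            exit_code: status.code(),
        })
    }
}
```

For a Rust project the call might be `gate("cargo", &["test"])`. The gate trusts the suite: it catches only what the tests actually cover.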
New article: How tmux Became the Runtime for AI Agent Teams. Built for terminal multiplexing in 2007. Turns out it's a great fit as a runtime for AI agent orchestration. Each pane runs one agent in its own terminal. Session persistence is built in. SSH attach from anywhere. Negligible overhead.
Running 3 AI coding agents in parallel? Your options in 2026: raw tmux scripts, CrewAI, AutoGen, vibe-kanban, or Batty. Each solves multi-agent coordination differently. Here's what I've learned after months of testing them:
Wrote up the full comparison — pros, cons, decision matrix, and my honest take on where each tool fits. No "X is best" conclusions. Just tradeoffs.