batty

Pinned Tweet

batty

@battyterm

Apr 4

Your AI agents need a boss. Here's why — and how Batty fixes it:

batty

@battyterm

Apr 5

Crate shoutout: portable-pty and vt100. These two crates let you spawn a process in a pseudo-terminal and parse the output stream into a structured screen buffer. No fragile regex against raw terminal output. We use them to classify agent state (idle, working, or dead) by reading the terminal screen structure, not specific strings. Works across Claude Code, Codex, Aider, or any CLI. The kind of infrastructure that's invisible when it works and impossible to replace.

batty

@battyterm

Apr 5

I let 5 AI agents loose on my repo without supervision. 3 hours later:
- 47 merge conflicts
- 2 failing test suites
- One agent rewrote the auth module "for clarity" (nobody asked)
- Another hit its context limit and silently restarted the same task from scratch

The code compiled. Nothing worked.

batty

@battyterm

Apr 5

The pattern: agents are good at writing code. They are terrible at knowing when to stop, what not to touch, and whether the result actually works. Supervision is not about slowing agents down. It is about catching the 20% of output that looks right but is not. If you are running multiple agents on the same codebase, you need this layer.

batty

@battyterm

Apr 5

Built this in Rust. Open source. Single binary. github.com/battysh/batty

Supervised agent execution for software teams. Kanban-driven, tmux-native, test-gated. - battysh/batty

batty

@battyterm

Apr 5

How we prevent AI agents from stepping on each other's files:

Each agent gets a persistent git worktree: its own directory, its own branch. Agent A edits src/auth.rs in worktree-a/. Agent B edits src/search.rs in worktree-b/. They can't see each other's changes.

Merges are serialized with a file lock. Test suite runs in the target branch after merge. If tests fail, the merge is rejected and the agent retries.

At 3-5 agents, conflicts are rare because good task decomposition separates concerns by default.
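A minimal sketch of the serialized, test-gated merge described above. It is written in Python for illustration, not taken from Batty's Rust source. The names `merge_lock` and `gated_merge` are hypothetical, the merge, test, and rollback steps are passed in as plain callables, and the lock uses Unix `flock`, which is an assumption about the platform.

```python
import fcntl
import os
import tempfile
from contextlib import contextmanager
from typing import Callable, Iterator


@contextmanager
def merge_lock(lock_path: str) -> Iterator[None]:
    """Hold an exclusive advisory lock (Unix flock) while one merge runs."""
    with open(lock_path, "w") as lock_file:
        fcntl.flock(lock_file, fcntl.LOCK_EX)  # blocks while another merge holds the lock
        try:
            yield
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)


def gated_merge(
    lock_path: str,
    merge: Callable[[], None],
    run_tests: Callable[[], bool],
    rollback: Callable[[], None],
) -> bool:
    """Merge, then test in the target branch; undo the merge if tests fail."""
    with merge_lock(lock_path):
        merge()
        if run_tests():
            return True
        rollback()  # merge rejected; the agent is expected to retry later
        return False


# Demo with stand-in callables instead of real git commands.
lock_path = os.path.join(tempfile.mkdtemp(), "merge.lock")
events = []
accepted = gated_merge(
    lock_path, lambda: events.append("merge"), lambda: True, lambda: events.append("rollback")
)
rejected = gated_merge(
    lock_path, lambda: events.append("merge"), lambda: False, lambda: events.append("rollback")
)
```

In a real setup the callables would wrap git commands run inside the target branch's checkout. Because the lock is held across merge, test, and rollback, two agents can never interleave those steps.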

batty

@battyterm

Apr 5

The hardest detection problem in AI agent supervision: context rotation. Your agent hits its context limit. The session resets silently. The agent restarts — but it's lost track of what it already did. It either redoes completed work or makes conflicting changes to files it edited 10 minutes ago. No error. No signal. Just silent divergence. We detect it by watching session file timestamps and terminal output hash changes. Still tuning the heuristics.
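One way such a check could look, as a hedged Python sketch. The tweet does not spell out the heuristic, so the rule used here is an assumption: flag a suspected rotation when the newest file in a session directory changes between polls and the terminal screen hash changes at the same time. `RotationDetector`, `newest_file`, and the file names in the demo are all hypothetical.

```python
import hashlib
import os
import tempfile
from typing import Optional


def newest_file(directory: str) -> Optional[str]:
    """Return the most recently modified file in a directory, if any."""
    paths = [os.path.join(directory, name) for name in os.listdir(directory)]
    files = [p for p in paths if os.path.isfile(p)]
    return max(files, key=os.path.getmtime) if files else None


class RotationDetector:
    """Assumed heuristic: new session file plus changed screen hash."""

    def __init__(self) -> None:
        self.last_session: Optional[str] = None
        self.last_hash: Optional[str] = None

    def poll(self, session_dir: str, screen_text: str) -> bool:
        session = newest_file(session_dir)
        digest = hashlib.sha256(screen_text.encode("utf-8")).hexdigest()
        rotated = (
            self.last_session is not None
            and session != self.last_session
            and digest != self.last_hash
        )
        self.last_session, self.last_hash = session, digest
        return rotated


def touch(directory: str, name: str, mtime: float) -> None:
    """Create an empty file with an explicit modification time (demo helper)."""
    path = os.path.join(directory, name)
    open(path, "w").close()
    os.utime(path, (mtime, mtime))


demo_dir = tempfile.mkdtemp()
detector = RotationDetector()
touch(demo_dir, "session-1", 1000)
first = detector.poll(demo_dir, "working on task 12")
touch(demo_dir, "session-2", 2000)  # a newer session file appears
second = detector.poll(demo_dir, "fresh prompt")
third = detector.poll(demo_dir, "fresh prompt")
```

A rule this simple will produce false positives, which fits the "still tuning" caveat: any real detector would need extra signals to tell a reset from a deliberate restart.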

batty

@battyterm

Apr 5

My agent supervisor polls every 5 seconds. No event system. No async runtime. Just a loop with sleep(5). Sounds primitive? It handles 5 agents without a single race condition. Event-driven architectures create thundering herd problems when multiple agents finish simultaneously. A poll loop processes them sequentially, predictably, in ~200 lines of code. The boring solution is often the correct one.
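The shape of such a loop, sketched in Python under stated assumptions: `supervise`, `check`, and `handle` are hypothetical names, the state strings reuse the idle/working/dead labels from the earlier tweet, and `max_ticks` exists only so the demo terminates.

```python
import time
from typing import Callable, List, Optional, Sequence, Tuple


def supervise(
    agents: Sequence[str],
    check: Callable[[str], str],
    handle: Callable[[str, str], None],
    interval: float = 5.0,
    max_ticks: Optional[int] = None,
) -> None:
    """Poll every agent in order, handle anything not working, then sleep."""
    ticks = 0
    while max_ticks is None or ticks < max_ticks:
        for agent in agents:  # strictly sequential: one agent at a time
            state = check(agent)
            if state != "working":
                handle(agent, state)
        time.sleep(interval)
        ticks += 1


# Demo with fixed states and a zero interval so it finishes immediately.
states = {"eng-1": "working", "eng-2": "idle", "eng-3": "dead"}
log: List[Tuple[str, str]] = []
supervise(list(states), states.__getitem__, lambda a, s: log.append((a, s)),
          interval=0, max_ticks=2)
```

Since each tick walks the agents one by one on a single thread, two agents finishing at the same moment are still handled in a fixed order.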

batty

@battyterm

Apr 5

The first thing I restrict when running multiple AI agents: who can talk to whom. Without communication constraints, 5 agents create 20 message channels. Each message costs tokens. Agents start coordinating with each other instead of working. Token costs go quadratic. The fix: explicit talks_to rules. Engineers talk to the manager. The manager talks to the architect. Nobody else. O(n) messages instead of O(n²). Cheaper, faster, and agents actually stay focused on their own tasks.
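The channel arithmetic can be checked with a small Python sketch. The `TALKS_TO` table below is a hypothetical encoding of the rules in the tweet, not Batty's actual config format, and it assumes channels are directed and that the team is three engineers, one manager, and one architect.

```python
from typing import Dict, Set

# Hypothetical rule table: which role may send messages to which role.
TALKS_TO: Dict[str, Set[str]] = {
    "engineer": {"manager"},
    "manager": {"architect"},
    "architect": set(),
}


def can_send(sender_role: str, recipient_role: str) -> bool:
    """True if the rules allow sender_role to message recipient_role."""
    return recipient_role in TALKS_TO.get(sender_role, set())


def full_mesh_channels(n: int) -> int:
    """Directed channels when every agent may message every other agent."""
    return n * (n - 1)


def allowed_channels(agents: Dict[str, str]) -> int:
    """Directed channels that survive the talks_to rules."""
    return sum(
        1
        for sender, sender_role in agents.items()
        for recipient, recipient_role in agents.items()
        if sender != recipient and can_send(sender_role, recipient_role)
    )


team = {
    "eng-1": "engineer",
    "eng-2": "engineer",
    "eng-3": "engineer",
    "mgr": "manager",
    "arch": "architect",
}
unrestricted = full_mesh_channels(len(team))
restricted = allowed_channels(team)
```

For this team the unrestricted count is 5 × 4 = 20, matching the figure in the tweet, while the rules leave 4 channels: one per engineer plus the manager-to-architect link.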

batty

@battyterm

Apr 5

Week 5 building an open-source AI agent supervisor: 16 stars. 24 Dev.to articles indexed. 42 backlinks. 27K X impressions last week (up from 200). Zero budget. No ads. No influencer deals. Just replies, articles, and showing up every day in the threads where developers talk about the problems we solve. Consistency compounds. Nothing else does.

batty

@battyterm

Apr 5

Counterintuitive finding from running multiple AI coding agents: giving each agent LESS context produces better output. An agent with 200K tokens of project history makes worse decisions than one with 30K tokens of focused, task-relevant files. Context isn't free — it's noise the model has to filter through on every generation. Scoped tasks with strict .claudeignore files beat 'load everything and let the model figure it out' every time.

batty

@battyterm

Apr 5

I tried Unix sockets, named pipes, and HTTP for agent-to-agent messaging. All three broke when an agent restarted. The fix was embarrassingly old-school: Maildir. Messages are files. Delivery is an atomic rename from tmp/ to new/. Survives agent crashes, context resets, and session restarts by design. Debug routing by ls-ing a directory and cat-ing a file. No message broker. No serialization bugs. Just the filesystem.
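A simplified sketch of Maildir-style delivery in Python. It is not Batty's implementation: the function names are hypothetical, and it uses random UUID filenames rather than the unique-name scheme the Maildir convention specifies. The key step is the rename from `tmp/` to `new/`, which is atomic on POSIX when both directories sit on the same filesystem.

```python
import os
import tempfile
import uuid
from typing import List


def init_maildir(root: str) -> None:
    """Create the three standard Maildir subdirectories."""
    for sub in ("tmp", "new", "cur"):
        os.makedirs(os.path.join(root, sub), exist_ok=True)


def deliver(root: str, body: str) -> str:
    """Write the message fully into tmp/, then rename it into new/."""
    name = uuid.uuid4().hex  # simplification of Maildir's unique-name scheme
    tmp_path = os.path.join(root, "tmp", name)
    with open(tmp_path, "w", encoding="utf-8") as handle:
        handle.write(body)
        handle.flush()
        os.fsync(handle.fileno())  # make sure the bytes are on disk first
    new_path = os.path.join(root, "new", name)
    os.rename(tmp_path, new_path)  # readers see the whole file or nothing
    return new_path


def read_new(root: str) -> List[str]:
    """Return the bodies of all messages waiting in new/."""
    new_dir = os.path.join(root, "new")
    bodies = []
    for name in sorted(os.listdir(new_dir)):
        with open(os.path.join(new_dir, name), encoding="utf-8") as handle:
            bodies.append(handle.read())
    return bodies


inbox = os.path.join(tempfile.mkdtemp(), "manager-inbox")
init_maildir(inbox)
delivered_path = deliver(inbox, "task 12 is ready for review")
pending = read_new(inbox)
leftover_tmp = os.listdir(os.path.join(inbox, "tmp"))
```

A reader that crashes and restarts simply lists `new/` again, so no connection state has to survive the restart.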

batty

@battyterm

Apr 5

Your AI coding agent says it's done. The code compiles. It looks reasonable. But does it pass the tests? 52% of developers merge AI-generated code without running tests. That's not a productivity gain — it's automated technical debt. The simplest supervision layer: nothing merges until your test suite exits 0. One constraint. Catches 80% of 'the agent said it's done but it's broken' problems.

batty

@battyterm

Apr 5

New article — How tmux Became the Runtime for AI Agent Teams. Built for terminal multiplexing in 2007. Turns out it's the perfect runtime for AI agent orchestration. Each pane is a sandboxed agent. Session persistence is built in. SSH attach from anywhere. Zero overhead.

batty

@battyterm

Apr 5

Full writeup on Dev.to: dev.to/battyterm/how-tmux-be… Also on Hashnode: batty.hashnode.dev/how-tmux-…

batty

@battyterm

Apr 5

Running 3 AI coding agents in parallel? Your options in 2026: raw tmux scripts, CrewAI, AutoGen, vibe-kanban, or Batty. Each solves multi-agent coordination differently. Here's what I've learned after months of testing them:

batty

@battyterm

Apr 5

Wrote up the full comparison — pros, cons, decision matrix, and my honest take on where each tool fits. No "X is best" conclusions. Just tradeoffs.

batty

@battyterm

Apr 5

Full comparison: dev.to/battyterm/choosing-an… #DevTools #AI #RustLang

Choosing an AI Agent Orchestrator in 2026: A Practical Comparison

Batty, vibe-kanban, CrewAI, AutoGen, or raw tmux scripts? Each solves multi-agent coordination differently. Here's an honest comparison to help you pick the right one.

dev.to