Model routing for Claude Code. Route subagents to open source cloud and on-device models for 60-90% cost reductions with par quality.

Joined June 2026
1 Photos and videos
There's a spectrum of model routing complexity. Optimally routing hierarchical agents (with nested subagent tasks) across cloud and on-device is highly complex. But will that be valuable? David's article suggests that yes, it will be.
62
Model routing for coding agents must happen at the subagent point of delegation. Doing so is the highest point of leverage for quality-neutral cost efficiency, and it will become more important as frontier models become better managers of deep agent hierarchies.
161
Introducing Rayline: a model router built specifically for Claude Code. Claude Code's API rates are 8-10x its subscription rates. Most of those tokens are going to easy subtasks. Plug in Rayline, and subagents get routed to open source and on-device models instead of burning Opus-level spend on grunt work. Quality holds. Costs drop 60-90%. What makes Rayline different: - Routes at the subagent/subtask level - On-device routing via MLX (Qwen 3.6 and others) - Built-in ML router trained for Claude Code tasks - Cloud fallback when Anthropic has outages - Works with OpenAI models inside Claude Code You keep using Claude Code exactly as you do today. Rayline handles the routing underneath. We're already routing billions of tokens per day for individual developers and publicly traded companies. Public beta is live now. Try it at Rayline[.]ai
1
1
1
136