Introducing Rayline: a model router built specifically for Claude Code.
Running Claude Code on API billing costs 8-10x what the same usage costs on a subscription.
Most of those tokens go to easy subtasks.
Plug in Rayline, and subagents get routed to open-source and on-device models instead of burning Opus-level spend on grunt work.
Quality holds. Costs drop 60-90%.
What makes Rayline different:
- Routes at the subagent/subtask level
- On-device routing via MLX (Qwen 3.6 and others)
- Built-in ML router trained for Claude Code tasks
- Cloud fallback when Anthropic has outages
- Works with OpenAI models inside Claude Code
You keep using Claude Code exactly as you do today. Rayline handles the routing underneath.
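For a rough picture of what subtask-level routing with fallback can look like, here is a minimal sketch. It is not Rayline's implementation: the tier names, the `Subtask` fields, and the threshold are all hypothetical, and a toy heuristic stands in for the trained ML router.

```python
from dataclasses import dataclass

# Hypothetical tiers, ordered from cheapest to most capable.
# These names are illustrative, not Rayline's actual configuration.
TIERS = ["on_device", "open_source_cloud", "frontier"]


@dataclass
class Subtask:
    description: str
    est_tokens: int        # assumed estimate of prompt + output size
    needs_reasoning: bool  # assumed flag; a real router would predict this


def pick_tier(task: Subtask) -> str:
    """Toy heuristic standing in for a trained router."""
    if task.needs_reasoning:
        return "frontier"
    if task.est_tokens <= 2000:  # arbitrary threshold for illustration
        return "on_device"
    return "open_source_cloud"


def route(task: Subtask, available: set[str]) -> str:
    """Pick a tier for the subtask, falling back if it is unavailable.

    Fallback order: the preferred tier, then more capable tiers,
    then less capable tiers (nearest first).
    """
    start = TIERS.index(pick_tier(task))
    order = TIERS[start:] + TIERS[:start][::-1]
    for tier in order:
        if tier in available:
            return tier
    raise RuntimeError("no model tier available")


if __name__ == "__main__":
    everything_up = set(TIERS)
    frontier_down = {"on_device", "open_source_cloud"}

    print(route(Subtask("rename a variable", 300, False), everything_up))
    print(route(Subtask("plan a schema migration", 5000, True), frontier_down))
```

The second call shows the outage case: the preferred tier is missing from `available`, so the subtask goes to the nearest tier that is still up instead of failing.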
We're already routing billions of tokens per day for individual developers and publicly traded companies.
Public beta is live now. Try it at Rayline[.]ai