Okay, I've been really in a groove with
@smallharness today, so decided to finally cut the feature I felt like I need for a true v1.0 release.
And this is, model routing...but kinda model routing Morgan-style I guess, because I've been testing out different approaches lately, and found something pretty interesting.
At a high level, I've been thinking that it doesn't make sense to have one model to orchestrate, one to write code, and one to review, and I've been playing around with different configurations.
What I've determined, at least for me, lately, is that I actually want a different model to orchestrate simple tasks vs. complex tasks, and I also want different agents to do coding tasks, based on how much thinking depth/tool calling I need, etc.
Also in some cases, I might want the same model but at different effort levels, like I learned with Fable where I could do a lot more with low than I expected, but there were some tasks I wanted medium for, and of course, crazy complex architecture stuff that I wanted high or even max for.
Same for code review. For MVPs and stuff I'm playing with, I just want fast and cheap, simple code review. But for production code, then I want way more in-depth code review, a better, more expensive model that goes much deeper.
I've come up with a series of roles, and this is all now built into Small Harness. Finally got my idea, into code, and into a harness that can help you write code, using this methodology.
Here's the high-level on it.
The Roles
-----------
The config lives under modelSystem in agent.config.json:
👑 Selector: the decision model. This should usually be your strongest/highest-effort model.
🐙 Orchestrators: not just one orchestration model, but three, a different one for each level of task complexity: low, medium, high.
🧑💻 Coders: like the orchestrators, not just one model to execute/write code, but different models based on the complexity of the coding task. Some plans might use something like two low and one medium, and never need a high.
✅ Code reviewers: three types, play, production, and security. You don't need as detailed code review for stuff you're just playing around with, but you do for production, and your security review model might be different from both.
And I made a chart, aptly titled, Morgan's Wacky Model Routing Idea. That you can look at if you want to do a little deeper dive into what I'm thinking here.
Now live on Github, free and open source, link to the rep in first comment below.