Okies, a lot of talk about auto routing of models, so to test which ADE-related auto routing is more capable and efficient, I gave
@cursor_ai @DevinAI @GitHubCopilot
the same 30 terminal tasks, taken from terminalBench 2.1, and asked each of them to solve them.
The main idea here is:
- Can current auto routing efficiently figure out when it needs a more capable model, which in turn can actually cost you more?
- Can current auto routing efficiently figure out when it doesn't need a costly, much more capable model? Switching models could in turn break caching, so is it cost-efficient at the EOD?
and finally.. the most basic one
- Can it actually figure out, based just on the prompts and what happens mid-session, when to route to better models?
TLDR on the capability front:
@cursor_ai beats the other two on pure capability, and is mid on cost per se because of their fixed auto pricing per token.