@OpenRouter - what's the best synthesizer model? and does it reason over the predictions of the fusion models? - great stuff!
hey @kilocode and @theo - take a look at this! seems amazing.
@ArtificialAnlys
Worktrees will make agents create a mess when it comes to not only writing code, but also running tests, performing database migrations, and updating dependencies. Workspace isolation, e.g. in Docker containers, is the answer.
Grok Build 0.1 might be one of the most underestimated AI models right now.
We tested it in Kilo Code by asking it to build 5 websites from scratch.
Here are the results:
We ran Fable 5 vs GPT-5.5 on real coding tasks before this. Better at planning, identical on execution. Anthropic's own argument -- that this capability is already in GPT-5.5 -- checks out from our testing:
We went into our testing expecting to walk away with the same conclusion.
What surprised us was that Fable's biggest advantage showed up in planning and decision-making, not raw implementation. When we had both models build from the exact same plan, they passed the same acceptance checks and produced functionally identical services.
That's why we've been telling people to separate the hype from the measurements:
blog.kilo.ai/p/claude-fable-…
your coding agent reads every line your shell prints. and you pay for all of it.
npm install. the test run. a 400-line search.
straight into context … straight onto the bill.
ctx-wire sits on the wire: runs your command, compresses the output, scrubs the secrets, hands the agent the short version. the full scrubbed log waits on disk for when something actually breaks.
one week in:
10 countries. 1.726M commands filtered.
~5.025B tokens saved. ~$15,075 nobody got billed for.
do me one favor: tag at least one person who needs this more than anyone.
🕸️ ctx-wire.dev #Claude #Cursor #Codex #Gemini #Copilot #Cline #Windsurf #KiloCode #LogicStar #SashiDo #AI #DeveloperTools #AICoding
Nex-N2-Pro: @opencode, when is this model coming to opencode for free? It's free in both @kilocode and @OpenRouter, but I'm not using either because I love opencode. Please bring it, opencode team.
There is no world where "execution was a wash" against GPT-5.5 is a flex. Fable 5 plans better but ships the same. Planning without execution is academic. "Real but not superhuman" is CEO-speak for "blew the budget and still couldn't pull ahead." The marketing outpaces the model.
We ran Fable 5 head-to-head against GPT-5.5 on agentic coding. It was genuinely better at planning. It was more decisive, and caught subtle failure modes the other model missed. But execution was a wash. The model at the center of all this is real, but not superhuman. x.com/kilocode/status/206581…
It might be worth adding a data point here: we benchmarked Fable 5 vs GPT-5.5 on a real agentic coding task before the shutdown. Fable won at planning (9.1 vs 8.3). But when both models executed the same plan, outputs were identical, down to which individual users got enabled in a rollout. GPT-5.5 did it for 59% less. x.com/kilocode/status/206581…