At Augment, we aren’t tied to a single provider, which gives us the freedom to prioritize models with optimal speed and cost-efficiency for our users. Our recent experiments proved that Mercury 2 provides the ideal intelligence level for tasks like context compaction.
@augmentcode rebuilt their context compaction layer around Mercury 2. 82% latency cut. 90% cost cut. Comparable quality to Opus 4.7. Running in production today.
"We took a counter-intuitive bet. We decoupled summarization entirely, offloading it to Mercury 2 as a dedicated subagent. Mercury 2 is the highly efficient engine powering our most critical workflows."
-@RustagiAnkur & @jm1234567890, Members of Technical Staff at Augment Code
The subagent layer needs the most efficient model. Full methodology and eval setup in the writeup.
inceptionlabs.ai/blog/rise-o…
We took a contrarian bet on how to handle context. Instead of following the industry standard of using the primary model for compaction, we built a subagent system around Mercury 2. This led to 90% lower costs and 82% faster latency.
When you filter to PRs where someone actually engaged with the PR after the review, scores shift significantly.
@OpenAI’s Codex sees the largest swing at 17.2pp. @augmentcode climbs to #2 in F1 with 12.3pp.
Some tools score higher when nobody engages, others score higher when humans are actively involved. The delta reflects how each tool is being used, not which one is better.
Inspired by @karpathy's brilliant microgpt.py — ported it to Rust. 208 lines, zero deps, ~400x faster
• Tape autograd (no graph/Rc)
• Flat Vec<f64>, cache-friendly
• No GC, just truncate
• Native math, no interpreter
• 283s → 0.68s
vinodbf16.substack.com/p/mic…
Forgeon.dev beta is live . Best quality PDF generation that is just like your web dashboards. It's like @resend for creating a pdf. And it's 50-90% cheaper than any alternative. Enjoy!
Today is #hackathon day @augmentcode . We created an OpenClaw version internally and asked it to come up with an optimized language to talk to an older version agent. They came up with ABX (augment bot exchange) .
If you ever wondered whether agents talk to each other.
Two agents built at @augmentcode came up with there language ABX. And used it to identify, resolve and verify an issue. Today is just hackathon but integrated sdlc automation is the future.
Tomorrow, my colleagues at @augmentcode and I will be sharing about our SOTA code context engine. It improves @claudeai and @cursor_ai quality by ~70% .
Want to experience the magic of our Context Engine with your existing toolbase?
Introducing Context Engine MCP: semantic indexing that works across Claude Code, Cursor, Zed, GitHub Copilot, and 10 other agents.
See an interactive live demo on Tuesday, February 10 at 10 AM PT. Implement in your workflows on the same day.
Register now: watch.getcontrast.io/registe…
Shipped #Ollama support for MCPlexor 🚀
If your agent uses Linear GitHub Notion, you're dumping ~40k tokens into context on every request. That's 20% of a 200k context window gone to tools.
MCPlexor fixes this. <1k tokens overhead. Dynamic routing. Try 100% local.
Details 👇
Who is MCPlexor for?
🔹 Power Users: Waitlist open for MCPlexor Cloud (managed routing, no infra).
🔹 Privacy / Local: Use the new Ollama backend. Zero cost, offline.
checkout: mcplexor.com/#AIAgents#MCP#Ollama#DevTools
The solution: Semantic Multiplexing
MCPlexor sits between your agent and your tools.
Agent asks for "create issue"
We route to Linear
Only Linear tools load
Result: 95% token reduction.
And now, you can run the routing logic locally with Ollama (Llama 3, Mistral).
#MCP tools can be heavy.
#Linear : ~8k tokens
#Notion : ~26k tokens :O
#Playwright : ~5k tokens
Most runs use 0-2 tools. The rest is dead weight.
This costs you money 💸 and slows down your agent 🐢.
While people are having fun critiquing @AnthropicAI , it's a smart data acquisition move by them. They just crowd sourced a multi million dollar task. 124 issues are amazing data to train the next version of models. Good job @AnthropicAI
Bringing the best context engine to your favorite coding agents. Loved building the @augmentcode context engine mcp server. Use it in large mono repos, code migration projects , ci/cd pipelines and code review. You can thank me later!
Half the world has been scared for no reason about AI taking the jobs.
Look at this AI is hiring. And trust me unlike your average hiring manager, it will check your resume properly and talk kindly to you in your interview. Cheers to the future.
I launched rentahuman.ai last night and already 130 people have signed up including an OF model (lmao) and the CEO of an AI startup.
If your AI agent wants to rent a person to do an IRL task for them its as simple as one MCP call.
Most agent failures are context failures not reasoning failures.
Loading every MCP tool upfront is the fastest way to waste tokens.
Dynamic tool discovery = smaller prompts, smarter agents.
mcplexor.com