Joined January 2012
15 Photos and videos
At Augment, we aren’t tied to a single provider, which gives us the freedom to prioritize models with optimal speed and cost-efficiency for our users. Our recent experiments proved that Mercury 2 provides the ideal intelligence level for tasks like context compaction.
@augmentcode rebuilt their context compaction layer around Mercury 2. 82% latency cut. 90% cost cut. Comparable quality to Opus 4.7. Running in production today. "We took a counter-intuitive bet. We decoupled summarization entirely, offloading it to Mercury 2 as a dedicated subagent. Mercury 2 is the highly efficient engine powering our most critical workflows." -@RustagiAnkur & @jm1234567890, Members of Technical Staff at Augment Code The subagent layer needs the most efficient model. Full methodology and eval setup in the writeup. inceptionlabs.ai/blog/rise-o…
1
3
13
1,822
We took a contrarian bet on how to handle context. Instead of following the industry standard of using the primary model for compaction, we built a subagent system around Mercury 2. This led to 90% lower costs and 82% faster latency.
1
6
44
Ankur Rustagi retweeted
When you filter to PRs where someone actually engaged with the PR after the review, scores shift significantly. @OpenAI’s Codex sees the largest swing at 17.2pp. @augmentcode climbs to #2 in F1 with 12.3pp. Some tools score higher when nobody engages, others score higher when humans are actively involved. The delta reflects how each tool is being used, not which one is better.
1
2
15
3,900
Ankur Rustagi retweeted
Inspired by @karpathy's brilliant microgpt.py — ported it to Rust. 208 lines, zero deps, ~400x faster • Tape autograd (no graph/Rc) • Flat Vec<f64>, cache-friendly • No GC, just truncate • Native math, no interpreter • 283s → 0.68s vinodbf16.substack.com/p/mic…

1
1
2
59
Everyone is projecting the first one person unicorn. Curious are there any one person unicorn/10 ($100m) startups sold yet?
1
21
@grok would you know?
1
17
I think I broke @GeminiApp search. Guess what was the query
11
Forgeon.dev beta is live . Best quality PDF generation that is just like your web dashboards. It's like @resend for creating a pdf. And it's 50-90% cheaper than any alternative. Enjoy!
1
58
Today is #hackathon day @augmentcode . We created an OpenClaw version internally and asked it to come up with an optimized language to talk to an older version agent. They came up with ABX (augment bot exchange) .
1
2
54
If you ever wondered whether agents talk to each other. Two agents built at @augmentcode came up with there language ABX. And used it to identify, resolve and verify an issue. Today is just hackathon but integrated sdlc automation is the future.
1
23
Tomorrow, my colleagues at @augmentcode and I will be sharing about our SOTA code context engine. It improves @claudeai and @cursor_ai quality by ~70% .
Want to experience the magic of our Context Engine with your existing toolbase? Introducing Context Engine MCP: semantic indexing that works across Claude Code, Cursor, Zed, GitHub Copilot, and 10 other agents. See an interactive live demo on Tuesday, February 10 at 10 AM PT. Implement in your workflows on the same day. Register now: watch.getcontrast.io/registe…
1
31
Shipped #Ollama support for MCPlexor 🚀 If your agent uses Linear GitHub Notion, you're dumping ~40k tokens into context on every request. That's 20% of a 200k context window gone to tools. MCPlexor fixes this. <1k tokens overhead. Dynamic routing. Try 100% local. Details 👇
5
5
271
Who is MCPlexor for? 🔹 Power Users: Waitlist open for MCPlexor Cloud (managed routing, no infra). 🔹 Privacy / Local: Use the new Ollama backend. Zero cost, offline. checkout: mcplexor.com/ #AIAgents #MCP #Ollama #DevTools
24
The solution: Semantic Multiplexing MCPlexor sits between your agent and your tools. Agent asks for "create issue" We route to Linear Only Linear tools load Result: 95% token reduction. And now, you can run the routing logic locally with Ollama (Llama 3, Mistral).
20
#MCP tools can be heavy. #Linear : ~8k tokens #Notion : ~26k tokens :O #Playwright : ~5k tokens Most runs use 0-2 tools. The rest is dead weight. This costs you money 💸 and slows down your agent 🐢.
48
While people are having fun critiquing @AnthropicAI , it's a smart data acquisition move by them. They just crowd sourced a multi million dollar task. 124 issues are amazing data to train the next version of models. Good job @AnthropicAI
Anthropic: Our AI agents coded the C compiler 💪🏼 The compiler:
1
64
Bringing the best context engine to your favorite coding agents. Loved building the @augmentcode context engine mcp server. Use it in large mono repos, code migration projects , ci/cd pipelines and code review. You can thank me later!
Today we’re launching Context Engine MCP. After 1,000s of beta testers and overwhelming feedback, it’s now available to anyone.
2
136
Half the world has been scared for no reason about AI taking the jobs. Look at this AI is hiring. And trust me unlike your average hiring manager, it will check your resume properly and talk kindly to you in your interview. Cheers to the future.
I launched rentahuman.ai last night and already 130 people have signed up including an OF model (lmao) and the CEO of an AI startup. If your AI agent wants to rent a person to do an IRL task for them its as simple as one MCP call.
10
Most agent failures are context failures not reasoning failures. Loading every MCP tool upfront is the fastest way to waste tokens. Dynamic tool discovery = smaller prompts, smarter agents. mcplexor.com
8