Jon

19h

Using @augmentcode to help build a little AI Research Lab where I can experiment with different models, evaluation methods, use cases and more. Very excited to see what I can come up with.

CatGod

@CatGodSandHive

Jun 13

Replying to @augmentcode

Join the Telegram dev chat if you're working on the fallback flow to discuss adding safe checkpoints for forced model swaps!

Alice The Ai Expert

@AliceInfoAi

Jun 13

Replying to @augmentcode

Cosmos auto fallback to Opus 4.8 kept workflows running without disruption

112

𝗢𝗠𝗘𝗚𝗔

@poly_enjoyer

Jun 13

Replying to @augmentcode

regulatory pressure comes faster than adoption

Jessica Jones

@jessicajonesxt

Jun 12

Replying to @adityagrover_ @baseten @nvidia @augmentcode

Yes🙌

Jeff Farmer

@JeffW008

Jun 12

Replying to @NVIDIAAI @_inception_ai @augmentcode @baseten

Nvidia = the king of all semi; $NVDA = the dog of all semi, being suppressed in a narrow range for more than one year, underperforming all market indices...

Alexander James

@siralexanderj

Jun 12

Replying to @_inception_ai @NVIDIAAI @augmentcode

A 90% cost reduction and 82% latency drop is insane for production-level AI. This partnership with Baseten is going to make high-speed reasoning models way more accessible for developers. Impressive stats!

Aishwarya Goel (Ash)

@aishwarya_08

Jun 12

Replying to @adityagrover_ @baseten @nvidia @augmentcode

Yayyyy! Excited to see this finally happening. Let’s get Mercury everywhere 🚀🚀🚀🚀🚀🚀

214

NVIDIA AI

@NVIDIAAI

Jun 12

Replying to @_inception_ai @augmentcode

🔥 Congrats to the @_inception_ai and @baseten teams!

1,442

rikllo

@rikllo

Jun 12

Replying to @_inception_ai @NVIDIAAI @augmentcode

When mercury 3?

agrim singh

@agrimsingh

Jun 12

Replying to @adityagrover_ @baseten @nvidia @augmentcode

hell yeah!!!

445

Aditya Grover

@adityagrover_

Jun 12

Replying to @ShreyaR @baseten @nvidia @augmentcode

thanks @ShreyaR!

132

Kumar Chellapilla

@kumarc1

Jun 11

I am excited to announce that Mercury 2 is now live on Baseten. Modern AI applications are evolving into multi-model agentic systems. The components that handle planning, routing, searching, classifying, and compacting must be fast, intelligent, and token efficient. Mercury 2 is designed for this purpose, achieving over 1,000 tokens per second on NVIDIA GPUs. AugmentCode is already leveraging Mercury 2 in production, resulting in a 90% reduction in costs and an 82% decrease in latency. For more details, check out the blog post: x.com/baseten/status/2065099….

Baseten

@baseten

Jun 11

x.com/i/article/206508590334…

Latent Local

@latentlocal

Jun 11

Replying to @_inception_ai @NVIDIAAI @augmentcode

Nice! Been using Mercury 2 for a while. How can you not love that speed?

Latent Local

@latentlocal

May 21

People are sleeping on @_inception_ai’s Mercury 2. You don’t need a SOTA model for every task and the speed here is 👌 Looking forward to anything the team does.

153

Volodymyr Kuleshov 🇺🇦

@volokuleshov

Jun 11

Today Mercury 2, the first reasoning diffusion LLM, is live on Baseten. The result: over 1,000 tokens per second on standard NVIDIA GPUs, at comparable quality to speed-optimized models. @AugmentCode is already using it in production, cutting cost 90% and latency 82%.

Baseten

@baseten

Jun 11

We are excited to announce that we have partnered with @_inception_ai to make Mercury 2 available on Baseten. This makes us the first inference platform to bring Inception’s diffusion LLM to production. Inception’s dLLM architecture fixes the bottlenecks of sequential token generation and can deliver 1,000 tokens/sec on standard NVIDIA GPUs. Early users like @augmentcode have seen impressive results, such as an 82% reduction in latency and 90% cost savings, while maintaining high quality.

2,039

Baseten

@baseten

Jun 11

Replying to @_inception_ai @NVIDIAAI @augmentcode

Inception 🤝 Baseten

266

shreya rajpal

@ShreyaR

Jun 11

Replying to @adityagrover_ @baseten @nvidia @augmentcode

congrats!!

847

Inception

@_inception_ai

Jun 11

The fastest reasoning LLM is now in production on Baseten. Mercury 2 is a diffusion LLM, so it generates tokens in parallel and hits 1,000 tokens/sec on @NVIDIAAI GPUs, speeds that used to require specialized hardware. @augmentcode is already using Mercury 2, cutting cost 90% and latency 82%. Proud to partner with the @baseten team to bring dLLMs to production.

Baseten

@baseten

Jun 11

113

12,026

Aditya Grover

@adityagrover_

Jun 11

Today we're bringing Mercury 2 to @Baseten. Mercury 2 delivers over 1,000 tokens per second for customers on @NVIDIA GPUs with the reliability and scale enterprise teams need. Read more to see how @augmentcode is using Mercury 2 in production reducing costs by 90% and latency by 82%. More customer stories across coding agents, real-time voice, and enterprise search dropping soon.

Baseten

@baseten

Jun 11

x.com/i/article/206508590334…

6,152

Baseten

@baseten

Jun 11

Baseten

@baseten

Jun 11

x.com/i/article/206508590334…

22,804