Cartesia

Cartesia

2 Photos and videos

Tweets

Brandon Yang retweeted

Cartesia

@cartesia

Two new models just dropped 👀 Sonic-3.5 and Ink-2 are the #1 streaming models for text to speech and speech to text

Karan Goel

@krandiash

We released Sonic-3.5 and Ink-2, the #1 streaming models for text to speech and speech to text you can use in your voice agents today. New architectures enable new frontiers for speed and quality. We're now the only provider to have #1 models for both speaking and listening.

2:48

1,900

Brandon Yang

Brandon Yang

@bclyang

44m

We topped the leaderboards for text to speech and speech to text - back to back in a single week! Modeling companies are not about any single model but about building engines for continuous end to end improvement - exciting to see how far we've come!

Karan Goel

@krandiash

2:48

133

Albert Gu

Brandon Yang retweeted

Albert Gu

@_albertgu

Within the span of a week, we launched streaming TTS (text-to-speech) and STT (speech-to-text) models that topped the leaderboards. I'm incredibly proud of the research team for their relentless pursuit of improvement, which have unlocked new state-of-the-art audio models on the Pareto frontier of speed and quality. As a research problem, speech requires fusing both text and audio and is the gateway to general multimodal models. We built Sonic-3.5 and Ink-2 from the ground up, developing multiple innovations along the way in a direction that will scale to general real-time intelligence. I've personally been deeply involved in building these models and more; it's been a blast working with the incredibly talented research team here @cartesia, and I can't wait to show the world what's coming next :)

Karan Goel

@krandiash

2:48

4,797

Aiqi Liu

Brandon Yang retweeted

Aiqi Liu

@MeetAiqi

When @elipughresearch and the team dropped Ink-2, I had to see if this SOTA Speech-to-Text model lived up to the hype. So, I built a dictation app to find out. To my delight, it’s incredibly fast & accurate. You can just "ink it." InkIt is now free and open-source. Try it, fork it, and make it yours! 👇

0:39

104

10,658

Albert Gu

Brandon Yang retweeted

Albert Gu

@_albertgu

May 28

Our new model Ink-2 tops AA's leaderboard for streaming speech-to-text! Ink-2 comes with plenty of features optimized for real-time voice agents. With top-class models for both TTS and STT, the team at @cartesia keeps pushing the frontier of models for interactive intelligence.

Cartesia

@cartesia

May 28

Cartesia Ink-2 debuts as #1 for accuracy on the brand-new streaming speech-to-text leaderboard from @ArtificialAnlys! We designed Ink-2 from the ground up for voice agents - with low latency, eager transcripts, and semantic endpointing.

106

14,600

Eli

Brandon Yang retweeted

Eli @elipughresearch

May 28

🧵 on some fun insider details on ink-2 😼

Cartesia

@cartesia

May 28

6,177

Brandon Yang

Brandon Yang

@bclyang

May 28

Ink-2 is our first streaming ASR model, built specifically for realtime voice agents - and it's #1 on @ArtificialAnlys on day 1! It's rare for models to be #1 on the first try on a new benchmark, since model development is iterative and there's so much that goes into understanding quality. We've seen great results internally and can't wait for everyone to try it!

Cartesia

@cartesia

May 28

885

Cartesia

Brandon Yang retweeted

Cartesia

@cartesia

May 22

Sonic 3.5 is now the #1 text to speech model on the @ArtificialAnlys leaderboard! You no longer have to trade off quality and latency - Sonic 3.5 also has the fastest time to first audio at 82ms end to end. See full benchmark results 👇

Artificial Analysis

@ArtificialAnlys

May 22

Cartesia’s Sonic-3.5 takes the #1 spot on the Artificial Analysis Speech Arena Leaderboard, surpassing Inworld Realtime TTS 1.5 Max and Google’s Gemini 3.1 Flash TTS Sonic-3.5 is the latest TTS model from @cartesia . It supports 42 languages, including 9 Indian languages, with 500 voices available out of the box. The model has been highly preferred among voters in the TTS Arena, with its demonstrated naturalness and accurate transcript following. Key takeaways: ➤ Quality: Sonic-3.5 has an Elo score of 1,218 ( 16/-16) based on 1,144 arena appearances, placing it ahead of Inworld Realtime TTS 1.5 Max at 1,194 and Gemini 3.1 Flash TTS at 1,209 ➤ Pricing: Sonic-3.5 is priced at $39/1M characters, a premium compared to Gemini 3.1 Flash TTS at $18.3/1M characters, and Inworld Realtime TTS 1.5 Max at $35/1M characters ➤ Speed: 105.5 characters per second, compared to 205 characters per second for Inworld Realtime TTS 1.5 Max and 26.3 characters per second for Gemini 3.1 Flash TTS See more details and listen to samples below 🧵

10,955

Brandon Yang

Brandon Yang

@bclyang

May 22

Sonic 3.5 is now the #1 TTS model on @ArtificialAnlys, an independent benchmark of TTS quality! It's also the fastest model with 82ms end to end latency - it's always been our dream to build realtime voice with no trade-offs. Building great models comes from getting the fundamental right - infrastructure, architecture, data, and evals - and I'm proud to see the hard work for the team recognized!

Artificial Analysis

@ArtificialAnlys

May 22

5,472

Cartesia

Brandon Yang retweeted

Cartesia

@cartesia

May 21

NYC happy hour with @awscloud Thurs, June 4 · 5:30pm · NY tech week RSVP in comments 👇

2,238

Cartesia

Brandon Yang retweeted

Cartesia

@cartesia

Mar 18

Mamba-3 is out! 🐍 SSMs marked a major advance for the efficiency of modern LLMs. Mamba-3 takes the next step, shaping SSMs for a world where AI workloads are increasingly dominated by inference. Read about it on the Cartesia blog: blog.cartesia.ai/p/mamba-3

Mamba-3: An Inference-First State Space Model | Cartesia Blog

Mamba-3 reorients state space models around modern inference workloads, improving quality while preserving the low-latency profile that makes linear models compelling.

blog.cartesia.ai

175

73,185

Tri Dao

Brandon Yang retweeted

Tri Dao

@tri_dao

Mar 17

The frontier has increasingly shifted to hybrid models - from Qwen to Kimi-Linear and now with NVIDIA's Nemotron-3 Super - that rely on a strong linear sequence model. Today we release Mamba-3, the most powerful linear model to date. x.com/_albertgu/status/20339…

Albert Gu

@_albertgu

Mar 17

The newest model in the Mamba series is finally here 🐍 Hybrid models have become increasingly popular, raising the importance of designing the next generation of linear models. We've introduced several SSM-centric ideas to significantly increase Mamba-2's modeling capabilities without compromising on speed. The resulting Mamba-3 model has noticeable performance gains over the most popular previous linear models (such as Mamba-2 and Gated DeltaNet) at all sizes. This is the first Mamba that was student led: all credit to @aakash_lahoti @kevinyli_ @_berlinchen @caitWW9, and of course @tri_dao!

113

842

78,327

Albert Gu

Brandon Yang retweeted

Albert Gu

@_albertgu

Mar 17

310

1,597

447,207

Brian Hie

Brandon Yang retweeted

Brian Hie @BrianHie

Mar 4

Evo 2, our genome language model that generalizes: - across biological prediction and design tasks, - across all modalities of the central dogma, - across molecular to genome scale, and - across all domains of life, is published today in @Nature.

385

60,185

Cartesia

Brandon Yang retweeted

Cartesia

@cartesia

Feb 3

Our hackathon kicks off this weekend → $20k in prizes from @AnthropicAI & @cartesia We built the ultimate guide to shipping SOTA voice agents with @ClaudeAI & Cartesia - plus 25 real examples of voice agents taking off 🚀 Reply "Cartesia" or tag a founder/builder for early access👇

2,198

Cartesia

Brandon Yang retweeted

Cartesia

@cartesia

Jan 26

🚀Engineers, startup teams, and product-minded operators, this one’s for you! Join @cartesia × @AnthropicAI at @NotionHQ SF for a 2-day hackathon building voice-first AI agents w/ audio, reasoning, memory & action. 🏆 $20k in credits (Anthropic & Cartesia) 💡 Potential Cartesia enterprise subscription ✅ Early access editorial/social boost via partners Link to apply below ⬇️

0:05

6,303

Karan Goel

Brandon Yang retweeted

Karan Goel

@krandiash

24 Nov 2025

The @cartesia team will be at NeurIPS next week, including @_albertgu and me If you'd like to meet with me (or others on the team), fill the form (below) out and we'll figure out how to get in touch

107

22,829

Brandon Yang

Brandon Yang

@bclyang

28 Oct 2025

Excited to share what we've been cooking 🧑‍🍳

Karan Goel

@krandiash

28 Oct 2025

We've raised $100M from Kleiner Perkins, Index Ventures, Lightspeed, and NVIDIA. Today we're introducing Sonic-3 - the state-of-the-art model for realtime conversation. What makes Sonic-3 great: - Breakthrough naturalness - laughter and full emotional range - Lightning fast -

2:20

2,735

Brandon Yang

Brandon Yang

@bclyang

27 Oct 2025

I'll be on stage at Techcrunch Disrupt today talking about Cartesia and the future of voice AI. Come chat!

Cartesia

@cartesia

27 Oct 2025

We’re incredibly honored to be named to the 2025 AI #Disruptors60 by Greenfield Partners. We'll be on stage at TechCrunch Disrupt alongside fellow innovators we respect across both AI infrastructure and applications. Chosen from nearly a thousand applicants, this recognition underscores our work in building the next generation of AI: ubiquitous, interactive intelligence that runs anywhere. We’re excited to continue powering developers with real-time voice models that feel human, respond instantly, and unlock entirely new product experiences. 🚀 Special thanks to our friends, customers, and partners for the continued support! #Disruptors60 #GreenfieldPartners #TCDisrupt #AI

0:04

928

Cartesia

Brandon Yang retweeted

Cartesia

@cartesia

16 Oct 2025

At Cartesia, we’re committed to advancing voice AI for the enterprise. That’s why we’re proud to integrate Cartesia’s state-of-the-art voice AI technology with new @ServiceNow AI Voice Agents, part of the company’s recently announced AI Experience. Read more: bit.ly/servicenowcartesia

10,669