Joined October 2017
You can call a phone number and ask an AI to find you the perfect vinyl based on a mood or memory.
We're kicking off our weekly Voice Agent Spotlight with The Record Store Oracle, built by @Gyurmatag, and honestly, the personality in this one sets the bar.
Speak a feeling. An era. A specific road trip. The agent listens, thinks, and recommends a record.
Under the hood it's a surprisingly clean stack:
→ @AssemblyAI Voice Agent API (STT + LLM + TTS in one)
→ @Twilio Media Streams for real phone call handling
→ @Cloudflare Workers + Durable Objects for hosting
→ Single WebSocket, no transcoding overhead
The UI contains a live orb that pulses as the agent thinks, with a real-time transcript of tool calls happening beneath the surface.
New project every Wednesday. More builders, more stacks, more ideas for what voice AI can actually do.
🎙️ Demo The Record Store Oracle: voice-agent.cfi-ops.workers.…
🏆 Full showcase: assemblyai.com/showcase
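For a feel of what the phone leg of that stack carries: Twilio Media Streams sends call audio as JSON WebSocket messages, and `media` events hold base64-encoded audio in `media.payload`. Here's a minimal Python sketch of unpacking one message. The helper name `extract_audio` and the sample values are made up for illustration, and the post doesn't say how the Oracle itself forwards the bytes, so treat this as one plausible shape rather than the project's code.

```python
import base64
import json
from typing import Optional


def extract_audio(message: str) -> Optional[bytes]:
    """Return the raw audio bytes from a Media Streams message, or None."""
    event = json.loads(message)
    if event.get("event") != "media":
        # Other event types ("connected", "start", "stop") carry no audio.
        return None
    return base64.b64decode(event["media"]["payload"])


# Hypothetical sample message; the stream ID and audio bytes are invented.
sample = json.dumps({
    "event": "media",
    "streamSid": "MZ-example",
    "media": {
        "track": "inbound",
        "payload": base64.b64encode(b"\xff\x7f\xff").decode("ascii"),
    },
})

audio = extract_audio(sample)
```

If the downstream API accepts the same audio encoding, bytes like these can be relayed as-is, which is presumably what "no transcoding overhead" refers to.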
AssemblyAI retweeted
AI medical voice agent in production.
- 14 hours of call volume
- 38 appointments booked, more captured revenue for practices
- 11 appointments cancelled -> reduces no-show rate and saves cost on valuable chair time
Thx @livekit @cartesia @rimelabs @AssemblyAI
This Wednesday afternoon you could have a working voice agent you built yourself. @dan_aai from @AssemblyAI is running a live workshop. Claude Code + AssemblyAI Voice Agent API, built from scratch in about an hour. 🤖 Claude Code does the building. You direct it. No coding experience required. Starter code to fork. Q&A the whole way through. Free. Jun 10. 10AM PT. → assemblyai.zoom.us/webinar/r…
Everything we shipped in May, in 2 minutes. 🎥 Follow the changelog for more: assemblyai.com/changelog
Universal-3 Pro just got better across the board. 🚀
Five upgrades, live now:
🌎 Code-switching: ~19% relative WER improvement on multilingual benchmarks
🗣️ Disfluencies: ~5.9% WER improvement on verbatim datasets
⚡ Turnaround time: P50 latency up to 30% faster, P99 up to 34% faster
👥 Diarization: 19% relative improvement
⏱️ Timestamps: 15% precision on English, 58% at P99 for non-English
Already on Universal-3 Pro? You're getting all of this automatically.
New here? Full breakdown of what changed and why it matters in the blog. 👉 assemblyai.com/blog/u3-pro-m…
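A quick note on reading "relative WER improvement": it is the drop in word error rate divided by the old rate, not a drop in percentage points. The Python sketch below works one example. The input numbers are invented for illustration and are not AssemblyAI benchmark figures.

```python
def relative_improvement(old_wer: float, new_wer: float) -> float:
    """Fraction by which WER fell, measured against the old WER."""
    if old_wer <= 0:
        raise ValueError("old_wer must be positive")
    return (old_wer - new_wer) / old_wer


# Hypothetical numbers: WER falling from 10.0% to 8.1% is a 19% relative
# improvement, although the absolute drop is only 1.9 percentage points.
example = relative_improvement(0.100, 0.081)
```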
AssemblyAI retweeted
Before @AssemblyAI, @dylanjfox was teaching himself ML from textbooks at night. I sat down with Dylan on Skywatch, @getbluejay_ai's car podcast. A few things that stuck with me: STT is not transcription. It is an intelligent listening layer. Nobody using voice AI cares about the architecture. They just want it to work. In ten years, people might actually prefer robotic-sounding voice agents. YouTube and Spotify in the replies below!
Ryan Johnson's first question about Universal-3 Pro Streaming was "why is it so good?" So @ryanseams showed him, trackside at the Miami Grand Prix, with names, emails, and phone numbers flying and F1 cars passing by. @CallRail chose to partner with AssemblyAI so their team can keep shipping the products that matter to their customers, faster. And we can keep building industry-leading Voice AI. And taking it to F1 races.
Bad news: yet another Friday with no F1 race on the calendar. Good news: our team was at the Miami GP last weekend putting Universal-3 Pro Streaming through its paces: code switching, numbers, and engine and crowd noise. The conditions were... not ideal. That was the point. See how our latest streaming model held up 👇
Calling multiple LLM providers in production shouldn't mean juggling separate accounts, bills, and rate limits, or one provider outage taking your whole product down with it.
Our LLM Gateway just got a significant upgrade so you can:
🔹 Route across providers with automatic fallbacks
🔹 Stream responses in real time with tool calling
🔹 Get structured JSON from Claude 4.5 models
🔹 Access Qwen 3 and Kimi K2.5 from Moonshot AI
🔹 Cut cost with prompt caching
One OpenAI-compatible endpoint. Zero markup on provider costs. Same AssemblyAI API key you already have. If you're building voice agents on our STT, there's no extra network hop—speech to LLM to action in one system.
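For anyone wondering what "automatic fallbacks" amounts to, here's a generic Python sketch of the pattern: try providers in order and return the first answer that succeeds. It illustrates the idea only and is not the LLM Gateway's implementation or API. `Provider`, `complete_with_fallback`, `flaky`, and `healthy` are all hypothetical names.

```python
from typing import Callable, Sequence

# Hypothetical provider interface: prompt in, completion text out.
Provider = Callable[[str], str]


def complete_with_fallback(prompt: str, providers: Sequence[Provider]) -> str:
    """Try each provider in order; return the first successful completion."""
    errors = []
    for provider in providers:
        try:
            return provider(prompt)
        except Exception as exc:  # a real client would catch narrower errors
            errors.append(exc)
    last = errors[-1] if errors else None
    raise RuntimeError(f"all {len(errors)} providers failed") from last


def flaky(prompt: str) -> str:
    """Stand-in for a provider that is currently down."""
    raise TimeoutError("simulated outage")


def healthy(prompt: str) -> str:
    """Stand-in for a provider that responds normally."""
    return f"echo: {prompt}"


result = complete_with_fallback("hello", [flaky, healthy])
```

A gateway that does this server-side saves each client from carrying its own retry-and-switch logic.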
Ask a research question out loud. Under 60 seconds later, you have a complete, sourced answer.
We built a reference architecture with @Render using AssemblyAI's Voice Agent API + Render's new Workflows.
Core insight: keep the voice channel separate from background orchestration. Don't block audio waiting on tool execution.
The stack:
→ @AssemblyAI Voice Agent API handles real-time audio streaming
→ @render Workflows classify, plan, search, synthesize as isolated tasks
→ @mastra agents classify question "shapes" before searching
→ @youdotcom powers parallel search branches
Repo includes the Render Blueprint, Mastra configs, and a live demo.
Full tutorial + source code: assemblyai.com/blog/voice-ag…
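The "don't block audio waiting on tool execution" idea can be sketched with plain `asyncio`: start the slow work as a background task and keep servicing audio frames while it runs. This is an assumption-laden toy, not the reference architecture's code. `slow_tool` and `voice_loop` are hypothetical, and the sleeps stand in for real network and audio I/O.

```python
import asyncio


async def slow_tool(question: str) -> str:
    """Stand-in for background research; the sleep simulates latency."""
    await asyncio.sleep(0.05)
    return f"answer to {question!r}"


async def voice_loop(frames: int) -> tuple[list[str], str]:
    """Keep handling audio frames while the tool runs as a separate task."""
    log: list[str] = []
    task = asyncio.create_task(slow_tool("why is the sky blue?"))
    for i in range(frames):
        await asyncio.sleep(0.01)  # stand-in for moving one audio frame
        log.append(f"frame {i} (tool done: {task.done()})")
    # Collect the result only after the audio work for this turn is handled.
    return log, await task


log, answer = asyncio.run(voice_loop(3))
```

Awaiting `slow_tool` directly inside the loop would stall every frame behind the search, which is the blocking the post warns against.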
Today we're shipping a major upgrade to streaming diarization, and it pulls us decisively ahead of the competition on the metrics that matter in production.
Head-to-head vs. the competition:
🎯 2x better cpWER on 2-speaker telephony
📊 13% better cpWER on 4-speaker meetings
🔇 42% fewer false-alarm speakers
👻 91% fewer phantom turns and words attributed to speakers who don't exist
For an AI notetaker, the 91% reduction in phantom-speaker words is the difference between a clean transcript and one your customers have to hand-correct. For an agent-assist tool, it's the difference between coaching prompts based on what the customer actually said and prompts generated from words the customer never spoke.
We also updated the API: every word object now carries its own speaker label, unlocking mid-turn speaker change detection at the word boundary instead of the turn boundary.
✅ Live today. Learn more: lnkd.in/eCagsaia 👈
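With a speaker label on every word, spotting a mid-turn speaker change is a matter of comparing neighbouring words. A small Python sketch follows. The field names `text` and `speaker` and the sample payload are assumptions made for illustration, so check the API docs for the real word-object schema.

```python
from typing import Iterator


def speaker_changes(words: list[dict]) -> Iterator[tuple[int, str, str]]:
    """Yield (word_index, previous_speaker, new_speaker) at each change."""
    for i in range(1, len(words)):
        prev, cur = words[i - 1]["speaker"], words[i]["speaker"]
        if prev != cur:
            yield i, prev, cur


# Hypothetical payload; the "text" and "speaker" field names are assumed.
words = [
    {"text": "Can", "speaker": "A"},
    {"text": "you", "speaker": "A"},
    {"text": "Sure", "speaker": "B"},
    {"text": "thanks", "speaker": "A"},
]

changes = list(speaker_changes(words))
```

With turn-level labels only, an interruption like the one at index 2 could not be placed any more precisely than the turn boundary.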
A voice agent. One prompt. Under 15 minutes.
That's what Mart built using the AssemblyAI Voice Agent API and Claude Code, and we captured the whole thing on video.
Here's what the build actually looked like:
🔹 Install the AssemblyAI MCP server → docs auto-inject into your Claude Code session
🔹 Drop one prompt describing your agent → Claude Code writes frontend and backend
🔹 Deploy to Railway → authenticate via backend token (no exposed API keys)
🔹 Add tool calling with Exa Search for source-backed responses
🔹 Let users pick from AssemblyAI's full voice library at session start
If you've been sitting on a voice agent idea, this is the fastest path from concept to production we've seen.
Watch the full build-along 👇 youtube.com/watch?v=E6AZhCBw…
If you try it, drop your favorite voice in the comments. Our team wants to know. 🎙️
Introducing the Voice Agent API. One WebSocket. Stream audio in, get audio back. We handle the full voice stack so you can focus on your product. Powered by Universal-3 Pro, our speech model built for real-world audio. $4.50/hr. No SDK. Ship today → assemblyai.com/voice-agent
AssemblyAI retweeted
Dylan Fox, founder and CEO, @AssemblyAI. All talk. All action. May 6 in San Francisco. Apply now to join us: cerebralvalleyvoice.com
Vibe coding just leveled up. We brought voice mode to Claude Code using AssemblyAI's Universal-3 Pro Streaming. Why type your prompts when you can just say them? You get insane entity accuracy from AssemblyAI and the full power of Claude Code, all hands-free. Here's the full command: ASSEMBLYAI_API_KEY=[YOUR-API-KEY-HERE] bash -c "$(curl -fsSL assembly.ai/voice)" And get a free API key from your dashboard: assemblyai.com/dashboard/api… Enjoy! 😎🎙️🎧
Built with AssemblyAI! 🎙️💙
Hey, I'm open-sourcing Clicky. Go forth into the wild and build the future of education and the future of AI interfaces, my friends. I'm happy to have given a spark. Enjoy! github.com/farzaa/clicky
General-purpose ASR: 95% accuracy on a clinical consult. Also general-purpose ASR: gets "hydrochlorothiazide" wrong every time. Introducing Medical Mode — a correction pass on top of Universal-3 Pro optimized for medical entity recognition. Enable it with one parameter.
The real failure mode isn't the transcript. It's what comes next. Most healthcare AI pipelines feed transcripts into an LLM → SOAP notes, discharge summaries, referral letters. Wrong drug name in. Wrong drug name out. Errors don't attenuate. They propagate.
Medical Mode catches it before it gets that far. Works on both Pre-recorded and Streaming audio. HIPAA BAA included. $0.15/hr. See our benchmarks here → assemblyai.com/medical-mode Test with your own audio → assembly.ai/playground