@trydaily @pipecat_ai. ex-CEO @callstatsio acq’d by $eght. earlier multimedia protocols and video. Focus on growth, revenue. 🇺🇸🇫🇮🇮🇳

Joined August 2007
344 Photos and videos
A day with Fable 5 running a task, and I started to hear my computer speak and play tones. I had asked it to debug an audio capture issue. In the past, Opus would either ask me to run the smoke test or wire up the devices, since it required a human to test. And if this was repetitive, I would ask it to make a script or similar to do the test. But this time I saw it take the initiative to do the analysis by itself. This was on a new project, and I heard it say "Samantha", then a few tones, then a period of silence, and this continued for an hour. I spun up another Claude and asked it to analyse the JSONL and understand why it was playing tones. It said: "to prove the Core Audio tap was actually capturing system audio, the test played a known audible signal and checked the captured PCM peak amplitude." This was a pipecat example with a menu bar item and nemotron-3.5-asr capturing mic and system audio. Fable wrote the menu bar item and the Core Audio bindings; pipecat reads from these sockets and sends the audio to the ASR and SmartTurnModel to write full semantic sentences into a VTT-style file. One more step in loop engineering unlocked, on to the next.
1
106
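The check quoted in the post above (play a known audible signal, then look at the peak amplitude of the captured PCM) can be sketched roughly as follows. This is only an illustration, not the code Fable wrote: the 16 kHz sample rate, the 16-bit mono format, the 0.05 threshold, and the function names `sine_pcm16`, `peak_amplitude`, and `tap_captured_audio` are all assumptions, and a synthetic buffer stands in for what a real Core Audio tap would return.

```python
import math
import struct

SAMPLE_RATE = 16_000  # Hz; assumed capture rate, not taken from the post


def sine_pcm16(freq_hz, duration_s, amplitude=0.5, sample_rate=SAMPLE_RATE):
    """Return a mono 16-bit little-endian PCM buffer holding a sine tone."""
    n = int(duration_s * sample_rate)
    samples = [
        int(amplitude * 32767 * math.sin(2 * math.pi * freq_hz * i / sample_rate))
        for i in range(n)
    ]
    return struct.pack(f"<{n}h", *samples)


def peak_amplitude(pcm_bytes):
    """Peak absolute sample value, normalised to the range 0.0..1.0."""
    n = len(pcm_bytes) // 2
    if n == 0:
        return 0.0
    samples = struct.unpack(f"<{n}h", pcm_bytes[: n * 2])
    return max(abs(s) for s in samples) / 32768.0


def tap_captured_audio(pcm_bytes, threshold=0.05):
    """True if the captured buffer is loud enough to contain the test tone."""
    return peak_amplitude(pcm_bytes) >= threshold


# Stand-ins for captured buffers: one with the tone, one with pure silence.
tone = sine_pcm16(440.0, 0.25)
silence = bytes(2 * 4000)

print(tap_captured_audio(tone))
print(tap_captured_audio(silence))
```

In a real test the `tone` buffer would be replaced by whatever the tap recorded while the tone was playing, and the silent stretch between tones gives a baseline to compare against.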
We all want to run a personal assistant somewhere. @signalgaining invited me to do a quick demo of @pipecat_ai on a Jetson Nano, running local models from Voxtral and Kokoro. Can't wait to see pipecat and voice AI in the robots!
Want to build your own Alexa, Siri, or Google with an NVIDIA Jetson? Get started today with `wendy init --template` wendy.dev/blog/voice-ai-agen…
1
4
167
Varun Singh retweeted
Here's a simple loop: Tell codex to maintain your repos, wake up every 5 minutes and direct work to threads. That makes it easy to parallelize and steer work as needed. I use an orchestrator skill combined with my triage, autoreview, and computer-use skills, so some work can land autonomously. github.com/steipete/agent-sc… github.com/steipete/agent-sc…
200
428
5,094
508,882
Ah, back channeling is such a good trait, and one of the reasons developers even consider speech models. This adds to the naturalness of speech. Love this update, can’t wait to try it. Perhaps a small model like Smart Turn could offer back channeling hints for the cascade pipeline to consider.
New paper: Multi-Faceted Interactivity Alignment in Full-Duplex Speech Models We use RL to post-train speech models (Moshi and PersonaPlex) to talk more like a human: to know when to respond, when to wait, and when to nod along with “yeah”s and “okay”s when listening.
1
39
Varun Singh retweeted
One of my personal favorite features announced at WWDC will I suspect be a sleeper hit: container machines, allowing your Mac to run a lightweight, persistent Linux environment with your home directory and repos automatically mounted: github.com/apple/container/b…
227
815
9,698
729,544
Varun Singh retweeted
Fable 5 is state-of-the-art on nearly all tested benchmarks, with exceptional performance in software engineering, knowledge work, scientific research, and vision. The longer and more complex the task, the larger Fable 5’s lead over our other models.
512
1,791
15,546
5,465,061
Lots of interesting projects built at the multi-agent hackathon. Spam call detection and advisory, MIDI generation, auto bug or vulnerability detection with proper disclosure for bug bounties, auto training based on loss spike detection, and many more projects built in 24h! Thanks @altryne for the invitation and opportunity to judge the work!
1
54
Varun Singh retweeted
thank you for having us, such a good crew!! ❤️
1
4
274
Varun Singh retweeted
Jun 5
3 weeks left til @aidotengineer world's fair! if you want to get on this year's map of top ai engineering companies, there's a few spots left
we are sold out of:
- presenting sponsors
- model lab sponsors
- platinum sponsors
- gold sponsors
the big spots left are for the official afterparties - welcome reception, networking night, and world cup quarterfinal bundles
if interested - drop a note to sponsorships@ai.engineer detailing size/scale of interest (below is 2025, we are recruiting 2026 now)
if you are attending - BOOK YOUR HOTELS BY TMR AS THE ROOM BLOCK DISCOUNT EXPIRES TMR
31
6
99
22,283
Come hang out and build voice agents, I will be there both days in pipecat swag
This weekend, join us in SF for our 4th WeaveHacks hackathon! Sponsored by @OpenAIDevs for the first time (@dkundel judging!), @cursor_ai, @Redisinc and @CopilotKit. Hackers will get over $150 in credits to build multi-agent orchestration systems. Over $15K in prizes!
2
1
4
3,663
Varun Singh retweeted
🚀 Gemma 4 12B is here! We partnered with @GoogleDeepMind to bring and optimize their new dense and unified multimodal model for Apple Silicon. ◈ 12B dense · 256K context ◈ Thinking mode (built-in reasoning) ◈ Vision: dynamic res, OCR, UI charts ◈ Native audio: ASR speech translation ◈ Function calling for agents ◈ Text image audio, interleaved Runs local. Get started now ⚡ > uv pip install -U mlx-vlm github.com/Blaizzy/mlx-vlm
Meet Gemma 4 12B! A unified, encoder-free multimodal model designed to bring high-performance intelligence directly to your laptop, and released under an Apache 2.0 license. Bridging the gap between edge efficiency and advanced reasoning. Here is what’s new with Gemma 4 12B: 👇
52
143
1,409
178,144
Varun Singh retweeted
Excited to announce Slashy The first email client that works for you. The real cost of email isn't the time. It's the mental load of constantly checking it, just in case something needs you. Slashy kills that. You never need to open your inbox unless Slashy tells you. Try it out at slashy.com
80
34
264
57,530
Varun Singh retweeted
We're hosting a fireside chat on Voice AI with Basia Sudol (Head of Enterprise Solutions at @DecagonAI), Sudarshan Kamath (Founder at @smallest_AI), Varun Singh (CPTO at @trydaily), Steven Diaz (FDE Manager at Vapi), and Tyler D'Silva (Founding FDE at @retellai) next Thursday, June 4th! Don't miss it: luma.com/fmpyy1b6
2
6
1,418
Varun Singh retweeted
May 28
"Developers can update Claude’s instructions mid-task without breaking the prompt cache or routing the update through a user turn" wtf? how??
May 28
Introducing Claude Opus 4.8: it builds on Opus 4.7 with sharper judgment, more honesty about its own progress, and the ability to work independently for longer than its predecessors. Available today at the same price.
82
15
846
138,821
Question for Hermes builders: if a workflow needs multiple subagents and, by extension, different models (e.g. Gemini vs Claude vs Codex, or different ‘thinking levels’ like Sonnet vs Opus, x-high vs low), is the trivial approach to spin up separate Hermes agents per model? I’ve embedded model hints in skills, but to swap providers inside Hermes I still need separate Hermes instances, right? Or, asked another way: if I want to mix providers within Hermes, does that still imply multiple Hermes instances, or is there a better abstraction?
1
111
Varun Singh retweeted
Imagine a local agent where cache misses don't exist, tools don't need translations, you see progress for prefill, tokens are emitted ASAP.
26
29
436
41,450
Varun Singh retweeted
May 15
You can now use your @grok subscription inside @NousResearch Hermes Agent. x.ai/news/grok-hermes
559
595
5,539
3,966,882
Varun Singh retweeted
Marking this as a moment convincing @swyx to bring @aiDotEngineer to India next year with @sanjeed_i @udayan_w Exciting times!! 🥳
8
3
81
40,483