Ben Carr

Tweets

Pinned Tweet

Ben Carr

@benatanam

Feb 17

Introducing cara-3, the fastest real-time avatar model on the market. The Cara model delivers unmatched realism with sub-180ms response times, setting a new industry standard. 70% of users prefer video over voice. Every pixel is generated in real time, unlocking natural eye movement, micro-expressions, and emotional subtlety so each conversation feels real. Comment "CARA" for 500 free credits.

2:08

500

134

1,222

405,737

Ben Carr

@benatanam

May 15

it appears our latest model has a hidden jack nicholson dimension 🔪

Ben Carr retweeted

Harry Coultas Blum

@harrycblum

May 14

Open source Jarvis that runs on a single GPU.
Today we're releasing the vui stack: a local voice agent that you can chat with in real time, that has tools, and that can run claude to do more complex tasks.
Inside this stack is the new vui nano model, a 300M TTS model that can render audio in reply to what you've said and supports a variety of non-speech sounds. vui nano speaks with you, not at you.
The stack can run on as little as 6GB of VRAM. Voice cloning is supported with prompts of up to 5 minutes. The longer the better.
A voice for your openclaw with our v1/realtime endpoint.
I have developed this on my own, so I would love to get the community's feedback and help improving it. Please retweet this so that everyone knows they can have their own private Jarvis.

7:31

1,854

Ben Carr

@benatanam

Apr 30

.@getstream_io x @anam__AI is live: add interactive avatars using the @visionagents_ai framework.
Read more: anam.ai/blog/vision-agents-a…

3:01

179

Ben Carr

@benatanam

Apr 23

Anam is now integrated with Stream’s Vision Agents 🤙
Stream gives you the realtime multimodal agent framework: calls, state, orchestration, audio/video pipeline. Anam now gives the agent a live avatar in the call.
This setup opens the door for "scene switching". The avatar starts on a neutral background, then changes based on the conversation:
* ask for a recipe → kitchen
* ask about weather → studio
* next user turn → back to neutral
It’s a relatively small thing technically: intercept the Anam video frames, chroma-key the green screen, and swap in a background based on tool calls / transcript callbacks. But it changes the feel a lot. The agent isn’t just talking over video; its environment can react too.
Thanks to the Stream team for leading on the integration!
docs: anam.ai/cookbook/vision-agen…
cc @visionagents_ai @neevash @Anam__ai
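
The scene-switching idea in the tweet above (chroma-key the green screen, then pick a background from the latest tool call) can be sketched roughly as follows. This is a minimal illustration, not the code from the Anam or Vision Agents integration: the tool names in `SCENE_FOR_TOOL`, the `margin` threshold, and the frame representation (rows of RGB tuples) are all assumptions made for the example. A real pipeline would most likely vectorize this over image arrays.

```python
from typing import List, Optional, Tuple

Pixel = Tuple[int, int, int]   # (R, G, B), each 0-255
Frame = List[List[Pixel]]      # rows of pixels

# Hypothetical mapping from tool-call names to background scenes.
SCENE_FOR_TOOL = {"get_recipe": "kitchen", "get_weather": "studio"}


def pick_scene(tool_name: Optional[str]) -> str:
    """Return the scene for a tool call, or 'neutral' when there is none."""
    return SCENE_FOR_TOOL.get(tool_name, "neutral")


def is_green(pixel: Pixel, margin: int = 40) -> bool:
    """Treat a pixel as green screen when green dominates red and blue by `margin`."""
    r, g, b = pixel
    return g - max(r, b) > margin


def chroma_key(frame: Frame, background: Frame, margin: int = 40) -> Frame:
    """Replace green-screen pixels in `frame` with the matching background pixels."""
    return [
        [bg_px if is_green(px, margin) else px for px, bg_px in zip(row, bg_row)]
        for row, bg_row in zip(frame, background)
    ]
```

In use, a tool-call or transcript callback would update the current scene via `pick_scene`, and each incoming video frame would pass through `chroma_key` with that scene's background image.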

318

Ben Carr retweeted

Zhengyao Jiang

@zhengyaojiang

Apr 2

Is autoresearch really better than classic hyperparameter tuning? We did experiments comparing Optuna & autoresearch. Autoresearch converges faster, is more cost-efficient, and even generalizes better: 🧵(1/6)

115

1,307

136,070

Ben Carr retweeted

Anam.ai

@Anam__ai

Mar 17

Huge fans of @lennysan and @bcherny at Anam. We enjoyed Boris's recent episode on Lenny's Podcast on the future of coding with AI so much that we put together a demo of adding an Anam face to Claude Code.

2:58

290

Ben Carr

@benatanam

Mar 12

~15% of users hit unstable connections during interactive avatar sessions. Most never told us. The session just quietly got worse. We shipped adaptive bitrate. Every Anam session now adjusts to network conditions in real time.

196

Ben Carr

@benatanam

Mar 12

This is table-stakes infrastructure for real-time platforms like Agora and LiveKit. Now it runs on every session. The difference: conversations that used to stutter or freeze now stay smooth. Sessions run longer. Users don't bounce.

Ben Carr

@benatanam

Mar 12

Small change in the stack, big change in the 15% of sessions that needed it most.
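
For readers unfamiliar with adaptive bitrate, the thread above does not say how Anam implements it, so the following is only a generic sketch of one common approach: additive increase, multiplicative decrease driven by measured packet loss. Every name and number here (`BitrateController`, the 5% loss threshold, the kbps bounds) is an assumption chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class BitrateController:
    """Toy additive-increase / multiplicative-decrease bitrate controller."""

    min_kbps: int = 300
    max_kbps: int = 2500
    current_kbps: int = 1500
    loss_threshold: float = 0.05  # back off when more than 5% of packets are lost
    step_kbps: int = 100          # additive increase on a clean interval
    backoff_percent: int = 70     # multiplicative decrease on a lossy interval

    def update(self, packet_loss: float) -> int:
        """Adjust the target bitrate from the latest packet-loss measurement."""
        if packet_loss > self.loss_threshold:
            target = self.current_kbps * self.backoff_percent // 100
        else:
            target = self.current_kbps + self.step_kbps
        # Clamp to the configured range so the stream never stalls or overshoots.
        self.current_kbps = max(self.min_kbps, min(self.max_kbps, target))
        return self.current_kbps
```

Calling `update` once per measurement interval lets the target drop quickly on an unstable connection and recover gradually once the network settles.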

Ben Carr

@benatanam

Mar 10

Anam is part of the relaunched AI Startup Pack by Fin. We're in good company, alongside @ElevenLabs, @Cloudflare, @incident_io, @Attio, and more. Build with interactive avatars that respond in real time, look realistic, and deploy via API. No upfront cost for 7 months.

366

Ben Carr

@benatanam

Mar 10

@ElevenLabsDevs ^

Ben Carr

@benatanam

Mar 10

Replying to @Cloudflare @incident_io @attio

Check it out here: fin.ai/startup-pack

Fin Startup Pack — Build your startup with the best AI tools, free credits and exclusive perks

Apply to the Fin Startup Pack for early-stage startups. Unlock $500k in software credits and exclusive perks from Intercom, Datadog, Atlassian, Stripe, and more.

fin.ai

Ben Carr

@benatanam

Mar 5

Our pipecat contribution just got merged. Anam is now listed as an official community video integration in the pipecat ecosystem.
In case you don't know, pipecat is Daily's open-source framework for building real-time voice agents. It has 10.5k GitHub stars and is used by NVIDIA, Mercor, and Descript.
We built a video service that takes TTS audio from the pipeline, streams it to Anam over WebRTC, and returns a synchronized interactive avatar face in real time. The avatar speaks, reacts, and handles interrupts natively.

168

Ben Carr

@benatanam

Mar 5

pip install pipecat-anam
Repo and working example: github.com/anam-org/pipecat-…
Thanks @kwindla and team @pipecat_ai

GitHub - anam-org/pipecat-anam: Anam video avatar service for Pipecat

github.com

Ben Carr

@benatanam

Feb 17

Today we're releasing cara-3, our latest face-generation model. In an independent blind study, participants preferred Anam's interactive avatars over other providers across every metric measured. But why do we care about avatars to begin with? x.com/BenCarr630567/status/2…

Ben Carr

@benatanam

Feb 17

x.com/i/article/202375077308…

3,673

Ben Carr

@benatanam

Feb 17

x.com/i/article/202375077308…

5,456

Ben Carr

@benatanam

Feb 14

We're open-sourcing the backbone of our data pipeline. It's called Metaxy, and it solves some of the hardest parts of building a modern, scalable pipeline.
At Anam, we’re building a platform for real-time avatars. One of the core components powering our product is our own video generation model. We train it on custom datasets that require extensive preprocessing of video and audio data. We extract embeddings using ML models, rely on external APIs for annotation and data synthesis, and orchestrate complex multimodal pipelines.
Along the way, we ran into significant challenges implementing efficient and flexible sample-level versioning (caching) across these workflows. That experience led us to build and open-source Metaxy, a framework for metadata management and sample-level versioning in multimodal data pipelines.
One of our engineers, Daniel, has been working tirelessly on Metaxy for the past few months, investing a staggering amount of time into it both during and outside of work. It now powers our data preparation pipelines and has made life significantly easier for our research team.
docs: docs.metaxy.io
blog post: anam.ai/blog/metaxy
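
To make "sample-level versioning (caching)" concrete, here is a minimal sketch of the general idea: derive a version key from each sample's content plus the step name and parameters, and recompute only when that key changes. This does not use or represent the Metaxy API. `sample_version` and `CachedStep` are hypothetical names invented for this example, and the in-memory dict stands in for whatever metadata store a real pipeline would use.

```python
import hashlib
import json
from typing import Any, Callable, Dict


def sample_version(sample: bytes, step: str, params: Dict[str, Any]) -> str:
    """Derive a version key from the sample content, step name, and parameters."""
    record = {
        "sample": hashlib.sha256(sample).hexdigest(),
        "step": step,
        "params": params,
    }
    encoded = json.dumps(record, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


class CachedStep:
    """Run a processing function once per distinct (sample, step, params) version."""

    def __init__(self, name: str, fn: Callable[..., Any], params: Dict[str, Any]):
        self.name = name
        self.fn = fn
        self.params = params
        self.cache: Dict[str, Any] = {}
        self.computed = 0  # how many times the expensive function actually ran

    def run(self, sample: bytes) -> Any:
        key = sample_version(sample, self.name, self.params)
        if key not in self.cache:
            self.cache[key] = self.fn(sample, **self.params)
            self.computed += 1
        return self.cache[key]
```

With this shape, changing one sample or one parameter invalidates only the affected entries, so an expensive step such as embedding extraction is not rerun across the whole dataset.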

589

Ben Carr

@benatanam

Feb 12

Anam now has a Python SDK: github.com/anam-org/python-s…
What's in the box?
- WebRTC media handling; connect and get synced audio/video frames back
- full pipeline (STT → LLM → TTS → Face) or bring your own components
- live transcriptions from user and avatar, useful for captions or logging
- async-first; process frames with async iterators, hook into events with decorators
This brings it close to feature parity with our JS SDK. The best thing, though, is that you don't need a browser anymore...
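
The "async-first" bullet above describes a pattern (async iterators for frames, decorators for events) that can be shown without the SDK itself. The sketch below is not the Anam Python SDK: `FakeSession`, its `on` decorator, `video_frames`, and `emit` are hypothetical stand-ins written only to show what that style of API tends to look like from the caller's side.

```python
import asyncio
from typing import AsyncIterator, Callable, Dict, List


class FakeSession:
    """Hypothetical stand-in for an avatar session; not the real SDK."""

    def __init__(self, frames: List[bytes]):
        self._frames = list(frames)
        self._handlers: Dict[str, List[Callable[[str], None]]] = {}

    def on(self, event: str) -> Callable:
        """Decorator that registers a handler for a named event."""
        def register(handler: Callable[[str], None]) -> Callable[[str], None]:
            self._handlers.setdefault(event, []).append(handler)
            return handler
        return register

    def emit(self, event: str, payload: str) -> None:
        """Call every handler registered for the event."""
        for handler in self._handlers.get(event, []):
            handler(payload)

    async def video_frames(self) -> AsyncIterator[bytes]:
        """Yield frames one at a time, giving control back to the event loop."""
        for frame in self._frames:
            await asyncio.sleep(0)
            yield frame


async def main() -> List[str]:
    session = FakeSession([b"frame-0", b"frame-1"])
    captions: List[str] = []

    @session.on("transcript")
    def log_caption(text: str) -> None:
        captions.append(text)

    count = 0
    async for _frame in session.video_frames():
        count += 1
        # Simulate a transcript event arriving alongside each frame.
        session.emit("transcript", f"caption for frame {count}")
    return captions
```

Running `asyncio.run(main())` drives the loop; in a real client the frames and transcript events would come from the network rather than from a list.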

0:18

589