@MIT alum || Former Bridgewater Assoc. || Founder/CEO of @Modulate_ai

Joined May 2014
13 Photos and videos
Michael Pappas retweeted
Most voice pipelines still look like this: audio → transcribe → text → model → act The problem is step one. The second you turn audio into text, you throw away tone, hesitation, sarcasm, stress. The signal that told you what the person actually meant. So everything you built after that ran on the transcript. The actual conversation was already gone. @modulate_ai trained Velma on 550M hours of raw audio to skip that step entirely. One model that listens to the audio instead of reading a summary of it. #1 on the conversation understanding benchmark. 10x cheaper than running it through an LLM. The part that makes this more interesting: Velma has already been running in production inside Call of Duty, GTA Online, and Fortune 500 contact centers. Now the API is open to everyone.
The world's first audio-native AI model is now available as an API. Velma listens and understands like a human — emotions, tone, intent, rhythm, vocal stress. Already analyzed 550M hours of conversation for Fortune 500s. Now open to developers. 🧵
8
7
53
12,270
Hot take: most voice AI isn't actually understanding speech. It's reading a transcript of it. There's a meaningful difference. And it's why voice pipelines keep failing at the moments that matter most. Dropping something tomorrow that takes a very different approach 👀
2
2
19
You've spent hours tuning your voice pipeline. Better STT model. Cleaner NLP. More labels. And it still misses the call where a customer was clearly about to churn. The problem isn't your implementation. It's the architecture. Something different is coming @modulate_ai 👀
3
112
Michael Pappas retweeted
95% of enterprise AI deployments fail. That’s not a tooling problem. It’s a design problem. In this clip, our CEO @mpappas74 breaks down why most AI products never make it past the demo stage - and what businesses actually need instead. Not another “AI employee” to manage. Tools that fit into real workflows and solve specific problems. At @modulate_ai, that’s the lens we build through.
2
1
2
152
do you remember when you joined X? i do! #MyXAnniversary
1
16
Michael Pappas retweeted
There’s an AI for transcription. 🦾 taaft.co/modulate if you’re building with voice, details matter: Grok STT: $0.10/hr → transcripts only @Modulate’s Velma: $0.03/hr → 14.9% WER → emotion detection → accent detection → PII redaction Test it yourself...👇
1
1
8
8,690
Voice fraud isn’t just a security problem. It’s a massive, ongoing cost center. In this video, I break down what it’s *actually* costing businesses today - and it’s more than most people realize 👀 There are two layers to it: 1. Direct losses $$$ When voice fraud hits, the damage can be immediate - and in some cases, reach hundreds of millions. Often unrecoverable. 2. The cost of trying to prevent it $$$$ Even if you’re never breached, you’re still paying: - Added authentication friction that slows down users - Frustrated customers - and lost revenue - Teams tied up auditing calls, running investigations, and handling compliance All of that adds up to tens of millions in ongoing operational cost. So the real question isn’t “what happens if we get hit?” It’s “how much are we already spending because this risk exists?”
1
1
22
If you’re dealing with this today, test how we're approaching a better way to solve this problem: modulate.ai/deepfake-detect?…

9
“Can’t my LLM provider just solve this too?” I hear this all the time - and I think it’s the wrong question. Because in practice, the more general a system tries to be, the harder it is to trust for any specific task. We’ve seen this before with software. Specialization wins when reliability matters 🧵
1
1
16
So here’s what I’m curious about: Are you betting on generalist AI to handle critical workflows? Or are you moving toward more specialized systems you can actually control and rely on? I share what I think in the video - and why we’ve taken a different approach at @modulate_ai
19
Michael Pappas retweeted
AI regulation is solving the wrong problem. Right now, most policies are built around generative AI: models like ChatGPT that create content (and yes, can hallucinate) But that’s only half the picture. There’s another category: analytic AI. Systems designed to understand what’s happening and return fixed, verifiable answers - no guessing, no hallucinations. In this clip, our CEO @mpappas74 breaks down why treating both the same is a mistake - and how current regulations are unintentionally slowing down tools that don’t carry the same risks. At Modulate, this distinction is core to how we build. Because not all AI should be regulated like it makes things up 👀
1
3
399
Michael Pappas retweeted
The AI playbook says: more data, more compute, bigger models. We don’t buy it. At Modulate, this is how we think how we build. In this clip, our CEO @mpappas74 breaks down why focused data real insight beats brute force. We’re a team of ~40, and that approach has led to: - Transcription models outperforming @OpenAI on accuracy - Deepfake detection models topping the @huggingface speech arena leaderboard Not by hoovering the internet. By using the right data. Because better > bigger.
2
2
164
Excited to share this! @hackernoon
We’re @hackernoon Company of the Week Voice AI breaks down when things get real: messy audio, emotion, overlap, intent. So we built Velma - the first Ensemble Listening Model, trained on 550M hours of real-world audio, designed to understand speech as it actually happens (not sanitized benchmarks). And ToxMod - real-time voice moderation that detects how something is said, not just the words. This isn’t research. It’s deployed at scale across Fortune 500 platforms today 🌐 Voice is the hardest problem in AI. It’s also the most human. We’re building the infrastructure to make it work -safely, accurately, in real time.
57
Michael Pappas retweeted
Deepfakes are getting cheaper to create. Detection shouldn’t be 100x more expensive. We just flipped the equation. Stop deepfakes for $0.25/hr - now at cost parity with creation ⚖️
1
2
2
480
Everyone wants a bigger model. That’s the wrong direction. We keep getting asked how @modulate_ai compares to LLMs like ChatGPT. Honest answer: we’re solving a different problem. For real-world business use, one giant “black box” model creates more problems than it solves: – hard to tune – tough to make compliant – inconsistent when it matters most Businesses don’t need better conversation. They need precision - numbers, probabilities, outcomes they can trust. So we built differently: → 100 specialized models → each trained for a specific signal → working together to analyze conversations in a structured, reliable way Less flashy than a single massive model. Way more predictable. Way more useful. Breakdown in the video 👇
197
this is the pattern we’re going to see everywhere: phase 1: “wow it runs itself” phase 2: “wait why is it doing that” phase 3: add constraints, memory, and oversight until it resembles… a well-instrumented system agents aren’t magic. they’re just very fast optimizers with incomplete context. the winners won’t be the ones who deploy agents first - they’ll be the ones who define the right objectives and catch failures early.
SOMEONE PUT AN OPENCLAW-RUN VENDING MACHINE IN SAN FRANCISCO an AI agent is running an actual physical vending machine OpenClaw decides what to sell, how to name the products, how to price them, creates the ads, and tracks all the sales you can even see a dashboard of all the sales that the AI vending machine made the vending machine hardware does the dispensing. the AI does everything else, and of course inventory is supplied by the guy who runs it it's installed at Frontier Tower in SF which is a building packed with AI and robotics startup founders the agent forgot things, hallucinated, and at one point raised prices way too high. then tried to justify it because people were still buying we are now living in a simulation.
1
5
222
Michael Pappas retweeted
Something didn’t add up… so we ran an experiment. Deepgram claims 5.26% WER. We got 28.1% on real-world audio. Not a rounding error. A reality check. Most ASR models are tested on clean, perfect data. But real conversations are messy. So we tested where it actually matters. @modulate_ai came out ~2x more accurate. Full breakdown ↓ bit.ly/4mpyNFG
1
2
5
315
Michael Pappas retweeted
Paying $$$$/hr to catch voice deepfakes? Cute. Meet Velma Deepfake Detect - Modulate’s synthetic voice detection API. 98.9% accuracy, 🥇 ranked top on @huggingface Speech Arena, and 120× cheaper than next best model providers. Time to stop getting ripped off ↓
22
21
50
13,891