Poseidon

Tweets

Pinned Tweet

Poseidon

@psdnai

May 19

Voice AI has an evaluation problem. Models look strong on public benchmarks, then collapse on real-world audio. Introducing sonar.psdn.ai: a recipe-driven evaluation framework for low-resource languages, real-world audio, and production failure modes. Details ↓

0:11

186

31,689

Story

Poseidon retweeted

Story

@StoryProtocol

Jun 12

Reminder → Numo is now available for speakers of:
▸ Filipino
▸ Indonesian
▸ Malay
Contribute → earn.

0:05

182

370,467

Story

Poseidon retweeted

Story

@StoryProtocol

Jun 9

Tamil has reached its contribution goal on Numo. நன்றி to everyone who contributed. This is what compounding momentum looks like.

0:05

143

8,014

Story

Poseidon retweeted

Story

@StoryProtocol

Jun 8

The internet was the first dataset. What's next is being built now. Meet @SPChinchali from @Psdnai ↓

1:34

187

11,435

Poseidon

Poseidon

@psdnai

Jun 3

Numo tasks are now available for speakers of:
▸ Filipino
▸ Indonesian
▸ Malay
Start contributing today.

0:05

134

248

840,279

Story

Poseidon retweeted

Story

@StoryProtocol

May 29

760k submissions on Numo from 13k contributors in 1 month. Turns out, when you make it simple to contribute to AI, people show up. Still early.

0:05

174

17,153

Poseidon

Poseidon

@psdnai

May 28

Live on Numo: Upload your CV, add your credentials, and give your profile more context. Better profile signal = better task matching.

0:05

156

16,523

Poseidon

Poseidon

@psdnai

May 27

You cannot improve what you cannot measure. LibriSpeech is 1,000 hours of clean English audiobook narration, but real voice products deal with noisy rooms, dialects, code-switching, and non-English speech. That is why SONAR introduces the PSDN Score: a composite metric that combines WER, CER, and semantic similarity to evaluate whether a transcript preserves both the words and the meaning.

115

9,519
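
The idea of a composite metric that combines WER, CER, and semantic similarity can be sketched roughly as follows. This is a minimal illustration, not the published PSDN Score formula: the equal weights, the conversion of error rates to accuracies, and the function name `composite_score` are all assumptions, and the semantic similarity value is expected to come from the caller (for example, a cosine similarity between sentence embeddings).

```python
from typing import Sequence


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance between two sequences (words or characters)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance over reference length."""
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / max(len(ref_words), 1)


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: character-level edit distance over reference length."""
    return edit_distance(reference, hypothesis) / max(len(reference), 1)


def composite_score(reference: str, hypothesis: str,
                    semantic_similarity: float,
                    weights=(1 / 3, 1 / 3, 1 / 3)) -> float:
    """Hypothetical blend of word accuracy, character accuracy, and meaning.

    ASSUMPTION: equal weights and a simple weighted sum. The actual
    PSDN Score weighting is not given in the source.
    """
    w_word, w_char, w_sem = weights
    word_acc = max(0.0, 1.0 - wer(reference, hypothesis))
    char_acc = max(0.0, 1.0 - cer(reference, hypothesis))
    return w_word * word_acc + w_char * char_acc + w_sem * semantic_similarity
```

A transcript with few word errors but a low `semantic_similarity` value would score lower here than under WER alone, which is the kind of failure a words-plus-meaning metric is meant to surface.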

Poseidon

Poseidon

@psdnai

May 25

The best voice AI model depends on the audio you test it on. In SONAR’s Bengali case study, 8 models were evaluated across 6 datasets using the PSDN Score, a composite metric that blends word accuracy, character accuracy, and semantic similarity. No single model held the top spot across every dataset.

146

9,279

Poseidon

Poseidon

@psdnai

May 25

That matters for vendor selection. A public benchmark winner can fall to the middle on proprietary conversational audio. In this evaluation, several open-source Bengali fine-tuned models were more dataset-stable than commercial APIs. If you evaluate on benchmark audio and deploy on real user audio, the ranking you see may not be the ranking you get.

1,768
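
One way to check whether a ranking holds across datasets is to rank the models per dataset and look at how much each model's rank moves. The sketch below uses made-up model names, dataset names, and scores purely for illustration; none of the numbers are SONAR results, and "rank spread" is an assumed stand-in for dataset stability rather than a measure the source defines.

```python
from statistics import pstdev

# Made-up scores for illustration only (higher is better); not SONAR results.
scores = {
    "model_a": {"read_speech": 0.91, "call_center": 0.62, "broadcast": 0.80},
    "model_b": {"read_speech": 0.85, "call_center": 0.78, "broadcast": 0.81},
    "model_c": {"read_speech": 0.80, "call_center": 0.74, "broadcast": 0.83},
}


def ranks_per_dataset(scores):
    """Return {model: {dataset: rank}}, where rank 1 is the best score."""
    datasets = list(next(iter(scores.values())).keys())
    ranks = {model: {} for model in scores}
    for ds in datasets:
        ordered = sorted(scores, key=lambda m: scores[m][ds], reverse=True)
        for position, model in enumerate(ordered, start=1):
            ranks[model][ds] = position
    return ranks


def consistent_winner(ranks):
    """Return the model ranked first on every dataset, or None if there is none."""
    for model, by_dataset in ranks.items():
        if all(rank == 1 for rank in by_dataset.values()):
            return model
    return None


def rank_spread(ranks):
    """Population std dev of each model's ranks; lower means more dataset-stable."""
    return {m: pstdev(list(by_ds.values())) for m, by_ds in ranks.items()}
```

With these illustrative numbers, `model_a` wins on read speech but drops to last on the other two datasets, no model is first everywhere, and `model_b` has the smallest rank spread even though it tops only one dataset.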

Poseidon

Poseidon

@psdnai

May 25

Learn more about SONAR, our evaluation framework for voice AI: psdn.ai/blog/sonar-evaluatin…

SONAR: Evaluating Voice AI Beyond English

An ASR evaluation benchmark for low-resource languages, real-world audio, and production failure modes.

psdn.ai

1,441

Poseidon

Poseidon

@psdnai

May 19

In Bengali, SONAR evaluated 8 ASR models across 6 datasets, producing ~16,000 scored predictions. No single model held the top spot across every dataset. WER-only rankings missed semantic failures. Aggregate scores hid demographic gaps.

1,914
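
The point about aggregate scores hiding demographic gaps can be illustrated by breaking a mean score down by group. The records, group labels, and scores below are hypothetical and exist only to show the mechanics; the source does not say which demographic attributes SONAR tracks.

```python
from collections import defaultdict
from statistics import mean

# Made-up per-utterance records; group labels and scores are hypothetical.
records = [
    {"group": "speaker_group_1", "score": 0.9},
    {"group": "speaker_group_1", "score": 0.9},
    {"group": "speaker_group_1", "score": 0.9},
    {"group": "speaker_group_2", "score": 0.5},
]


def aggregate(records):
    """Single mean over all utterances, ignoring group membership."""
    return mean(r["score"] for r in records)


def by_group(records, key="group"):
    """Mean score per group, so gaps between groups become visible."""
    buckets = defaultdict(list)
    for r in records:
        buckets[r[key]].append(r["score"])
    return {group: mean(values) for group, values in buckets.items()}
```

Here the aggregate comes out at about 0.8, which looks acceptable, while the per-group view shows one group at 0.9 and the other at 0.5. The smaller group's poor result is mostly absorbed by the larger group in the overall mean.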

Poseidon

Poseidon

@psdnai

May 19

Voice AI will not improve globally from model rankings alone. Teams need to know where models break, why they break, and what data closes the gap. Explore: psdn.ai/blog/sonar-evaluatin…

1,750

Poseidon

Poseidon

@psdnai

May 18

Hindi, Telugu, and Vietnamese all just crossed the finish line on Numo. Huge thank you to everyone who contributed and helped bring more real-world voice data into AI training. Submissions are now closed in these languages.

0:05

168

2,799,609

Poseidon

Poseidon

@psdnai

May 11

We’ve officially reached our target for Bengali voice data contributions. To everyone who participated in this category: ধন্যবাদ! Thank you for helping build the next generation of AI systems.

0:05

189

19,060

Poseidon

Poseidon

@psdnai

May 11

Voice tasks are still open in: Hindi, Tamil, Telugu, Vietnamese. Get started today: numolabs.ai

Numo — Contribute Data. Earn Rewards. Power AI.

Numo turns your moments into valuable AI training data. Record voice samples, complete tasks, and earn rewards.

numolabs.ai

3,184

Story

Poseidon retweeted

Story

@StoryProtocol

May 7

Week one Numo stats:
▸ 210,000 contributions
▸ 18,000 contributors
▸ Contributions growing ~40% day over day
Real voices, real fast. Onward.

Andrea | Devrelius

@devrelius

May 7

seven days ago we launched Numo with @psdnai.

a billion people speak hindi, bengali, tamil, telugu. less than 0.1% of voice ai training data is in any of them (like in microsoft's latest VibeVoice model). nobody was fixing this big training gap, so we started collecting thousands of hours from contributors all over india.

week one stats:
> 210,000 contributions
> 18,000 contributors
> contributions growing about 40% day over day

every contribution registered as IP. every contributor rewarded. every second of audio licensed at the moment it's created on story.

Vietnamese launched just hours ago and we already have thousands of contributions! here are some fun facts about why we are focusing on Vietnamese:

Vietnamese has six tones across three regional dialect systems (Northern, Central, Southern), with big differences in tone systems, vocabulary, and pronunciation.
Central Vietnamese is the most different and the most challenging dialect, even for other Vietnamese speakers.
most voice AI is trained on Northern Vietnamese, which misses roughly half of how the country actually speaks.

thanks to Numo's collection efforts, AI models will be able to upskill on conversational nuances and be more realistic.

we're also not stopping at voice. we'll start rolling out other forms of tasks soon, always focused on long-tail, hard-to-scrape real-world data that models can't easily get from the internet.

175

20,271