Real-world data for physical AI

Joined July 2025
Pinned Tweet
May 19
Voice AI has an evaluation problem. Models look strong on public benchmarks, then collapse on real-world audio. Introducing sonar.psdn.ai: a recipe-driven evaluation framework for low-resource languages, real-world audio, and production failure modes. Details ↓
Poseidon retweeted
Reminder → Numo is now available for speakers of: ▸ Filipino ▸ Indonesian ▸ Malay Contribute → earn.
Poseidon retweeted
Tamil has reached its contribution goal on Numo. நன்றி to everyone who contributed. This is what compounding momentum looks like.
Poseidon retweeted
The internet was the first dataset. What's next is being built now. Meet @SPChinchali from @Psdnai
Numo tasks are now available for speakers of: ▸ Filipino ▸ Indonesian ▸ Malay Start contributing today.
Poseidon retweeted
760k submissions on Numo from 13k contributors in 1 month. Turns out, when you make it simple to contribute to AI, people show up. Still early.
May 28
Live on Numo: Upload your CV, add your credentials, and give your profile more context. Better profile signal = better task matching.
May 27
You cannot improve what you cannot measure. LibriSpeech is 1,000 hours of clean English audiobook narration, but real voice products deal with noisy rooms, dialects, code-switching, and non-English speech. That is why SONAR introduces the PSDN Score: a composite metric that combines WER, CER, and semantic similarity to evaluate whether a transcript preserves both the words and the meaning.
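A minimal sketch of how a composite of WER, CER, and semantic similarity could be computed. The post does not state the actual PSDN Score formula, so the equal weights, the `composite_score` name, and the choice to pass semantic similarity in as a precomputed number between 0 and 1 are all assumptions made for illustration only.

```python
from typing import Sequence


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance: minimum substitutions, insertions, and deletions."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(ref: str, hyp: str) -> float:
    """Word error rate: word-level edit distance over reference word count."""
    ref_words = ref.split()
    return edit_distance(ref_words, hyp.split()) / max(len(ref_words), 1)


def cer(ref: str, hyp: str) -> float:
    """Character error rate: character-level edit distance over reference length."""
    return edit_distance(ref, hyp) / max(len(ref), 1)


def composite_score(ref: str, hyp: str, semantic_similarity: float,
                    weights: tuple = (1 / 3, 1 / 3, 1 / 3)) -> float:
    """Hypothetical composite: weighted mean of word accuracy, character
    accuracy, and a semantic similarity supplied by the caller.
    The weights are an assumption, not the published PSDN Score."""
    word_acc = max(0.0, 1.0 - wer(ref, hyp))
    char_acc = max(0.0, 1.0 - cer(ref, hyp))
    w_word, w_char, w_sem = weights
    return w_word * word_acc + w_char * char_acc + w_sem * semantic_similarity
```

In practice the semantic similarity would come from a separate model, for example cosine similarity between sentence embeddings of the reference and the transcript. Keeping it as an input means the sketch does not depend on any particular embedding library.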
May 25
The best voice AI model depends on the audio you test it on. In SONAR’s Bengali case study, 8 models were evaluated across 6 datasets using the PSDN Score, a composite metric that blends word accuracy, character accuracy, and semantic similarity. No single model held the top spot across every dataset.
May 25
That matters for vendor selection. A public benchmark winner can fall to the middle on proprietary conversational audio. In this evaluation, several open-source Bengali fine-tuned models were more dataset-stable than commercial APIs. If you evaluate on benchmark audio and deploy on real user audio, the ranking you see may not be the ranking you get.
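A small sketch of the ranking check described here: rank models separately on each dataset, then look at how far each model's rank moves. The model names, dataset names, and scores below are made-up placeholders, not SONAR results, and `rank_models` and `rank_range` are hypothetical helper names.

```python
def rank_models(scores_by_dataset: dict[str, dict[str, float]]) -> dict[str, list[str]]:
    """For each dataset, return model names ordered from best to worst score."""
    return {
        dataset: sorted(scores, key=scores.get, reverse=True)
        for dataset, scores in scores_by_dataset.items()
    }


def rank_range(rankings: dict[str, list[str]]) -> dict[str, tuple[int, int]]:
    """For each model, return its (best, worst) rank across all datasets."""
    positions: dict[str, list[int]] = {}
    for ordered in rankings.values():
        for position, model in enumerate(ordered, start=1):
            positions.setdefault(model, []).append(position)
    return {model: (min(p), max(p)) for model, p in positions.items()}


# Placeholder scores (higher is better); purely illustrative values.
toy_scores = {
    "benchmark_read_speech": {"model_a": 0.90, "model_b": 0.85, "model_c": 0.80},
    "conversational_audio": {"model_a": 0.60, "model_b": 0.72, "model_c": 0.70},
}

rankings = rank_models(toy_scores)
stability = rank_range(rankings)
```

A model whose best and worst ranks are close together is more dataset-stable in this sense. A model that is first on one dataset and last on another is the case the post warns about.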
May 19
In Bengali, SONAR evaluated 8 ASR models across 6 datasets, producing ~16,000 scored predictions. No single model held the top spot across every dataset. WER-only rankings missed semantic failures. Aggregate scores hid demographic gaps.
May 19
Voice AI will not improve globally from model rankings alone. Teams need to know where models break, why they break, and what data closes the gap. Explore: psdn.ai/blog/sonar-evaluatin…
May 18
Hindi, Telugu, and Vietnamese all just crossed the finish line on Numo. Huge thank you to everyone who contributed and helped bring more real-world voice data into AI training. Submissions are now closed in these languages.
May 11
We’ve officially reached our target for Bengali voice data contributions. To everyone who participated in this category: ধন্যবাদ! Thank you for helping build the next generation of AI systems.
Poseidon retweeted
Week one Numo stats: ▸ 210,000 contributions ▸ 18,000 contributors ▸ Contributions growing ~40% day over day Real voices, real fast. Onward.
seven days ago we launched Numo with @psdnai.

a billion people speak hindi, bengali, tamil, telugu. less than 0.1% of voice ai training data is in any of them (like in microsoft's latest VibeVoice model). nobody was fixing this big training gap, so we started collecting thousands of hours from contributors all over india.

week one stats:
> 210,000 contributions
> 18,000 contributors
> contributions growing about 40% day over day

every contribution registered as IP. every contributor rewarded. every second of audio licensed at the moment it's created on story.

Vietnamese launched just hours ago and we already have thousands of contributions! here are some fun facts about why we are focusing on Vietnamese:

Vietnamese has six tones across three regional dialect systems: Northern, Central, Southern, with big differences in tone systems, vocabulary, and pronunciation

Central Vietnamese is the most different and it's the most challenging dialect, even for other Vietnamese speakers

Most voice AI is trained on Northern Vietnamese, which misses roughly half of how the country actually speaks

thanks to Numo's collection efforts, AI models will be able to upskill on conversational nuances and be more realistic.

we're also not stopping at voice. we'll start rolling out other forms of tasks soon, always focused on long tail, hard to scrape real world data the models can't easily get from the internet.