Most "private AI" claims are policies. They depend on trusting the vendor.
We just shipped something different. Our AI meeting app, @HedyAI_, can now run the entire AI pipeline on your own device. Summaries, notes, chat, live coaching. Nothing leaves your device.
The demo was recorded with Wi-Fi turned off the entire time. The transcript, summary, and live chat were all generated locally on an M4 Max.
The curated model lineup includes Qwen 3.6, Qwen 3.5, and Gemma 4 (quants by @UnslothAI), ranging from 2B for newer iPhones up to 35B for users with serious hardware.
Plus: bring your own model from Hugging Face if you don't trust our curation.
Cloud is still the default. For many users it's faster and produces higher-quality output. But that will change over time.
Local AI is opt-in. Built for the meetings that shouldn't happen on cloud tools: privileged client conversations, sensitive interviews, medical appointments, work done offline.
No silent cloud fallback. If local fails for any reason, the app errors out. It does not quietly retry against our servers. You opted into local for a reason, and a quiet retry would defeat the point.
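For the curious, here is a minimal sketch of what "no silent fallback" can look like, in Python. It is an illustration under stated assumptions, not Hedy's actual code: `Pipeline`, `LocalInferenceError`, `local_only`, `run_local`, and `run_cloud` are all hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable


class LocalInferenceError(RuntimeError):
    """Raised when the on-device pipeline fails. It never triggers a cloud retry."""


@dataclass
class Pipeline:
    # Hypothetical backends: each takes a transcript and returns a summary.
    run_local: Callable[[str], str]
    run_cloud: Callable[[str], str]
    # Cloud is the default; local is opt-in.
    local_only: bool = False

    def summarize(self, transcript: str) -> str:
        if not self.local_only:
            return self.run_cloud(transcript)
        try:
            return self.run_local(transcript)
        except Exception as exc:
            # Fail closed: surface the error instead of calling run_cloud.
            raise LocalInferenceError(f"local inference failed: {exc}") from exc
```

The design choice is that the local branch has no code path that reaches `run_cloud`. With `local_only=True`, a failing `run_local` surfaces as `LocalInferenceError` and the cloud backend is never called.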
The next few years of AI will be defined by a quiet shift. From a world where a handful of companies operate the AI on your behalf, to one where you can run your own pipeline, on your own device, with your own data, end to end.
Full write-up: hedy.ai/post/local-ai-engine…
@stevibe @bnjmn_marie ping me if you want to test this with your workflows.