co-founder @wafer_ai -- fastest llm inference

Joined December 2015
Jun 16
good luck to p26! <3
Jun 16
today is yc demo day. just about a year ago, @gpuemi and i stepped onto that stage and presented wafer (f.k.a. herdora). what felt like the end of a chaotic batch turned out to be the beginning of everything that mattered. for everyone presenting today: enjoy the moment, celebrate how far you've come, and take the photos. wishing u the best, p26♥️
emi retweeted
they’re not jobs if they’re not valued. they’re not valued if there aren’t customers out there willing to pay them for their great work. needing the government to “create” a job is tantamount to welfare and that level of welfare resolves these individuals to a dependency on the government and lack of economic mobility. and chains our people, collectively, to a more indentured future. you may be well intentioned but you have, and always will, fail to see the destitute folly of government as a job creation engine. i have tried to engage you on this topic, in good faith, with empiricism and reasoning, but you have only dodged my points and pivoted to some populist refrain about the importance of taxation and the evils of productivity-driven success. i can only assume you’re dodging these truths because you and the rest of the politburo leadership have deemed the conversation unsafe speech and put your oligopoly at risk. let’s leave it at that then. perhaps if your ways get their day, we can all bask in the glories of the dark ages ahead.
emi retweeted
Jun 12
History's first trillionaire is a guy who catches rockets out of the sky with chopsticks and beams internet to every dead zone on the planet. Same guy ships cars that drive themselves, humanoid robots for the factory floor, brain chips that let paralyzed people move a cursor with pure thought, and an AI running on a supercomputer his team stood up in months instead of years. And the people crashing out about his net worth are doing it on the app he owns. The same app governments spent years trying to censor. You cannot legislate a rocket into orbit.
emi retweeted
Jun 11
wafer recently partnered with @digitalocean to 10X THEIR INFERENCE ON AMD. read below to hear about how we wrote custom kernels, trained a spec decode model, and more to achieve PEAK performance! 🧇
emi retweeted
FAANG? naaaaah it's WANGO
emi retweeted
wafer is now available on respan gateway. teams can now discover and build with wafer directly through @RespanAI. we serve glm 5.1, kimi k2.6, and qwen3.5-397b with what we believe is the best speed-to-price ratio on the market. excited to make wafer more accessible to teams building leading ai products. learn more: respan.ai/ai-gateway
emi retweeted
excited to launch eu-only endpoints with zero data retention in collaboration with orq.ai, the sovereign ai platform. teams in europe want frontier open weight models while staying gdpr compliant. before anyone asks about benchmarks, they ask where the model actually runs and what happens to their data afterward. built with the wonderful orq.ai team <3 pass.wafer.ai
emi retweeted
this weekend only: every serverless credit purchase is doubled through sunday 11:59 pm.
add $20 → get $40 to spend
add $100 → get $200 to spend
use the bonus credits to try glm 5.1 - delivered at 150-250 tok/s, among the fastest speeds available, while remaining 30% cheaper than the next closest competitor.
ends sunday at 11:59 pm: app.wafer.ai
emi retweeted
Excited to announce Slashy: the first email client that works for you.
The real cost of email isn't the time. It's the mental load of constantly checking it, just in case something needs you.
Slashy kills that. You never need to open your inbox unless Slashy tells you.
Try it out at slashy.com
emi retweeted
New: A group of AI researchers from Google DeepMind, Apple, MSL, and OpenAI are launching a new startup called Trajectory to build a continual learning platform for companies. They've raised a $15M seed round from Sarah Guo's Conviction, Jeff Dean, Fei-Fei Li, and others.
emi retweeted
Can't we do better? First short film from @fiftyyears.
emi retweeted
Replying to @garrytan
That's wafer.ai if you want to try it.

emi retweeted
May 26
yes, garry tan uses wafer 🪩
This is a killer stack. I just started using Wafer to serve my qwen3.6-27b custom fine-tuned llm and it's excellent
emi retweeted
This is a killer stack. I just started using Wafer to serve my qwen3.6-27b custom fine-tuned llm and it's excellent
Replying to @jsawadd
Potential stack of something like:
Hermes from @NousResearch
@joinmassive from @jsongrad (web search and more 👀)
Gbrain from @garrytan (second brain)
@obsdmd (multi-purpose)
@ZeroEntropy_AI from @ghita__ha (specialized models)
Wafer from @gpuemi & @gpusteve for open source Inference?
Delegation ability to Claude Code/Codex
^ some interchangeable, some can be consolidated
emi retweeted
🎉 Excited to welcome @wafer_ai to Infron as a provider. Wafer does what used to take a team of world-class performance engineers automatically. Their AI agents optimize GPU inference across any hardware, finding the configurations that matter. Infron is a unified AI gateway: 400 models, 100 providers, one API key, zero markup. ⚡Cheap intelligence is the most essential technology for the future. Wafer and Infron are making that real. Wafer-optimized Qwen3.6-35B-A3B is now available on Infron: 🔗 infron.ai/models/qwen/qwen3.… #AIInfrastructure #AIGateway #LLMOps #Infron #Wafer
emi retweeted
New blackboard lecture w @reinerpope
How do chips actually work – starting with basic logic gates, and working up to why GPUs, TPUs, FPGAs, and the human brain each look the way they do.
0:00:00 – Building a multiply-accumulate from logic gates
0:16:20 – Muxes and the cost of data movement
0:25:59 – How systolic arrays work
0:39:00 – Clock cycles and pipeline registers
0:51:40 – FPGAs vs ASICs
1:03:14 – Cache vs scratchpad
1:07:16 – Why CPU cores are much bigger than GPU cores
1:11:49 – Brains vs chips
1:15:22 – A GPU is just a bunch of tiny TPUs
Look up Dwarkesh Podcast on YouTube/Spotify/etc to watch. Enjoy!
May 22
holy fuck god bless dwarkesh
emi retweeted
May 20
we recently optimized qwen3.5-397b-a17b to be the fastest deployment publicly hosted. and the crazy thing: we did it by writing CUSTOM KERNELS for AMD MI355x. 🍿 see our post below outlining how we optimized kernels to achieve SOTA performance.
emi retweeted
You have to read this one. We just published a recap of how @wafer_ai pushed @AMD inference performance to a level that’s getting the entire ecosystem’s attention, and the results are kind of wild. What makes this story interesting isn’t just the performance itself. It’s how they achieved it: systems-level optimization, smart inference tuning, and a belief that AMD can compete at the very highest tier. Proud this work was powered by TensorWave’s AMD-native cloud infrastructure and early #MI355X deployments. tensorwave.com/blog/wafer-re…