Building @zml_ai (and we're hiring), ex @zenly, ex Exalead, ex @google. Skydiver and wingsuiter.

Joined October 2010
Pinned Tweet
4 Jul 2020
Does anyone know folks at @dMatrix_AI? We have 8 platforms running at full speed and 4 in the pipe. I'd like to make it 5.
Want a 38-core Intel Xeon 6 PCIe card DPU with 2x 100G ports and a lot of acceleration? We saw one at Computex 2026 servethehome.com/this-is-an-…
Incredible
USA. A Mexican restaurant. We had not yet ordered anything, and the food was already arriving. Chips. Salsa. Unrequested. Free. I stopped the waiter. "We have not earned these." "They just come with the table, man." They come with the TABLE. In my land, hospitality is a debt. Every gift creates an obligation, weighed carefully, returned in the proper season with interest of feeling. Here, the gift arrives before you have even proven you can pay for dinner. This is not an appetizer. This is a declaration: we trust you. Eat. I ate with the gravity the moment deserved. And then — I must report this calmly — the basket emptied, and a new one appeared. "Did we…?" "Refill," the waiter said. "It's bottomless." Bottomless. They have wells of salsa. The supply lines of this nation are beyond anything my ancestors imagined. My friend warned me. "Don't fill up on chips, dude." Too late. I had accepted three baskets. Honor demanded each one be finished — an unfinished gift is an insult. By the time my actual food arrived, I was a ruined man. I was not hungry. I was not comfortable. I had been defeated by a courtesy. Generosity that arrives before the request cannot be repaid. It can only be survived. I know the rule now. I have made my peace with the basket. One basket. Two at the most. Who am I deceiving. There is no number of baskets I would refuse. The trust of a nation is in that salsa, and I intend to honor all of it.
> They’re iterating insanely fast
Glad Grok noticed
17h
**agentic_austin** It's steeve's zml/llmd from zml_ai: a Zig-based LLM serving engine (continuous batching, paged attn, etc.) compiled straight to hardware. This clip shows their new full Metal backend (Apple GPUs) handling 8 concurrent requests at true bf16. Goal = portable peak perf across NVIDIA/AMD/TPU/Metal without Python or CUDA lock-in. They're iterating insanely fast.
those mad men goddamn did it
Great showcase of the superpower that comes with using @bazelbuild and hermetic-llvm. Using remote executors and distributed Thin LTO to dramatically reduce the link time of large binaries such as LLVM. 1 min link with distributed Thin LTO. 15 min without. Links below ⬇️
👏👏👏👏 how could this possibly backfire in any way
The US government, citing national security authorities, has issued an export control directive to suspend all access to Fable 5 and Mythos 5 by any foreign national, whether inside or outside the United States, including foreign national Anthropic employees. The net effect of this order is that we must abruptly disable Fable 5 and Mythos 5 for all our customers to ensure compliance. Access to all other Claude models is not affected. We apologize for this disruption to our customers. We believe this is a misunderstanding and are working to restore access as soon as possible. Read our full statement: anthropic.com/news/fable-myt…
aaaaaand we're faster (i know i know)
After 5 days of work, we are now within 10% of llama.cpp (64 tok/s vs 70 tok/s). More work to do but momentum is great.
another 5 days later, zml/llmd runs fully on Metal, serving 8 simultaneous requests at full bf16. zml/llmd is our LLM serving software that has all the modern niceties (continuous batching, etc.). ps: "full bf16", what a time to be alive
another reason why zml-smi shows link speed (GPU was idle)
>launch a GPU / CPU memory inference run on rented 2x RTX Pro 6000 GPUs for DeepSeek V4 Flash
>wonder why it performs abysmally
>check PCIe devices, discover GPU1 is on a downgraded PCIe link
oh the horrors of neoclouds
The great JB Kempf from VLC @videolan is starting a new venture: Kyber
Very excited to share that @lightspeedvp has led Kyber's $5M seed round, alongside OVNI Capital and Kima Ventures. This one was a no-brainer. Jean-Baptiste Kempf has spent his career building the kind of infrastructure the internet quietly relies on. @videolan. @FFmpeg. Real-time video, streaming and low-latency systems at massive scale. It's a rare kind of founder-market fit! With Kyber, they're now building the real-time control layer for physical AI. The infrastructure that lets humans and AI agents control robots, drones and autonomous machines from anywhere with virtually no delay. Think defense, healthcare, robotics, industry: the applications are genuinely vast. The product-market fit is just as strong as the founder-market fit. At the early stages, founders usually need the most help with hiring and commercial introductions. With Kyber, neither feels like the bottleneck despite having been in stealth until now. Proud to back JB and the Kyber team. Onwards!
Steeve Morin retweeted
One top military officer provided a plausible explanation, behind closed doors on Capitol Hill, The Intercept has learned. In the briefing, a high-ranking officer on the Pentagon’s Joint Staff stated that some of the people killed by the U.S. military may have been the victims of human trafficking.
Steeve Morin retweeted
What an incredible evening at Neon Noir, celebrating the fundraising round of Kyber! 🥂 #party #streaming #oss #friendship
If I won the lottery, I wouldn’t tell anyone, but there would be signs. (VSORA Jotunn 8, 288GB HBM3e, TSMC 5nm process)
brb intelmaxxing b70
this is a good way to vaccinate users against closed weights
Jun 9
aaaand there goes my fable access because my code stores auth creds
Steeve Morin retweeted
New footage obtained by B’Tselem uncovers the moments when the Abu Haikal family was shot. Seven-month-old Sam Abu Haikal was killed in the shooting, and both his parents were injured. The footage clearly shows that the Israeli soldier fired at the car as it was slowing to a stop. The car was far from the soldiers and posed no danger to them whatsoever. Moments later, in another video obtained by B’Tselem, seven-month-old Sam’s father, Fahed, is seen just after his son was shot. Fahed is holding baby Sam in his arms, trying to stop the bleeding from his head with his hands, while Sam’s mother, Daniyah, who was also injured by the gunfire while holding her son, is seen sitting on the ground, next to the car. Last Friday, 5 June, an Israeli soldier fired at a Palestinian family driving home from a family visit, as they sat in their car in the Tel Rumeidah neighborhood in Hebron. The family was shot as the car was slowing to a stop at the soldier’s command. Sam, a seven‑month‑old baby who was in his mother’s arms in the back seat, was struck in the head and pronounced dead shortly afterward. Sam’s parents were also injured by the gunfire; his mother is still in the hospital. After the shooting, the soldier who fired and another soldier who was with him left the scene without checking the car or offering any assistance to the critically wounded baby or to his mother. In the past two and a half years, Israel has killed tens of thousands of children in Gaza and the West Bank. The immunity it gets from the international community has led to a reality where, under Israeli rule, Palestinian lives are entirely disposable – even a seven‑month‑old baby.
this is bullshit from anthropic
mythos will be bad ON PURPOSE on ai "frontier llm research" tasks, this is very very sad for the research community. also, the fact that this is on purpose not visible to the user is crazy
if you’re sleeping on this, you’re missing out BIG time
Porting all versions of libstdc++ from source using @bazelbuild, from GCC 8 to 17 => ✅ This closes the loop of Linux binary portability. And re-opens the door to CUDA hermetic compilation since they want libstdc++ headers.
I got to meet him and he’s super chill. Amarisoft basically prints money, btw.
A French engineer who lives quietly in Paris has spent 30 years writing software that the entire internet now runs on without knowing his name. He wrote the code that streams every YouTube video, every Netflix show, every TikTok clip. He wrote the code that runs the virtual servers underneath AWS, Google Cloud, and Microsoft Azure. He calculated more digits of pi than anyone in history. He has no Twitter. He has no marketing. He just keeps shipping. His name is Fabrice Bellard. Here is the story, because almost nobody outside the systems programming world knows what one man has built. Fabrice was born in 1972 in Grenoble, France. He studied at École Polytechnique, the top French engineering school. He never went to Silicon Valley. He never built a startup empire. He just wrote code. In 2000 he started a project called FFmpeg, an open-source multimedia framework for encoding, decoding, and streaming video. He was 28. The project did one thing nobody else had done well. It handled every video and audio format that existed, in one library, on every operating system. He led it himself for years. Today FFmpeg is the invisible engine of the internet. YouTube uses it. Netflix uses it. VLC uses it. Chrome and Firefox use parts of it. Every Android phone, every iPhone, every smart TV, every video editing tool you have ever touched runs FFmpeg somewhere underneath. If you have watched a video on a screen in the last 20 years, Fabrice's code processed it. He was not done. In 2003 he started QEMU, a machine emulator and virtualizer. He wrote it solo until version 0.7.1 in 2005. QEMU lets you run any operating system on any other operating system. It became the foundation of modern virtualization. KVM, the Linux kernel hypervisor, runs on top of QEMU. Every major cloud provider, AWS, Google Cloud, Microsoft Azure, IBM Cloud, runs virtual machines on infrastructure built around it. The Quick Emulator is the most cited piece of cloud infrastructure code on Earth. He kept going. 
In 2001 he won the International Obfuscated C Code Contest with a small C compiler that grew into TCC, the Tiny C Compiler. TCC can compile and boot a Linux kernel from source in under 15 seconds. In 2004 he calculated the most digits of pi ever computed at the time, using a personal desktop computer and an algorithm he derived himself called Bellard's formula. In 2011 he wrote a complete PC emulator in pure JavaScript that runs Linux in your browser, a project called JSLinux that engineers still cannot believe is real. In 2019 he released QuickJS, a small but complete JavaScript engine that fits where V8 cannot. In 2021 he released NNCP, a neural network based lossless data compressor that immediately took the lead on the Large Text Compression Benchmark. Then he turned his attention to large language models. He built TextSynth Server, a web server with a REST API for running LLMs locally. He released ts_zip and ts_sms, compression utilities that use language models to compress text and short messages at ratios traditional algorithms cannot reach. He released TSAC, a very low bitrate audio compression system. In December 2025 he released Micro QuickJS, a new JavaScript engine for microcontrollers, separate from QuickJS, designed for environments with almost no memory. Fabrice co-founded a telecom company called Amarisoft in 2012, where he serves as CTO. Amarisoft builds 4G and 5G base station software used by carriers and labs around the world. He has been running it for over a decade while continuing to ship personal projects from his own home page at bellard dot org. He has no Twitter. He has no Instagram. He gives almost no interviews. His personal website is a flat list of projects with no styling, no fonts, no marketing copy. Just titles and links. A quiet French engineer who never moved to Silicon Valley wrote the code that quietly runs the internet. He is still shipping.
well, since we now have native Zig Metal with matching performance, should we add Metal support to ZML ?