Shopify CTO

Joined March 2022
OpenAI has removed 5.2-Pro from ChatGPT. The best model for math/ML is not available anymore. Yes, it was mostly through an unbelievably high reasoning budget, but still. A dark day - and I'm not being facetious :-(
9
3
88
9,797
Fable 5 is in a league of its own, both in quality and in price - already on Toloka Arena:
15
6
256
16,999
Well, Tibo, for a year now I have been pleading, arguing, begging you guys to bring Pro as an advisor model into Codex (really, to allow for the LARGE thinking budget)…
Jun 10
I would like to claim my 1% of royalty fees.
15
1
355
68,493
I remember, I was there :-) Fun fact, it wasn’t NeurIPS, it was a room they rented in the same hotel, organizing an unrelated “workshop”, because NeurIPS wouldn’t accept them - Neural Networks weren’t cool. The room was packed, though.
Geoff Hinton, before he was Geoff Hinton, once asked NVIDIA for a free GPU for his students Alex Krizhevsky and Ilya Sutskever. Nvidia declined.
7
6
122
14,222
Have been extensively testing Claude Workflows this weekend, with the best model possible. Threw it at my whole code base, combing for bugs. 144 found and fixed! Geez... It is a large code base, for sure, but 144?!! Some are very impactful, some are downright embarrassing...
I keep predicting software quality will improve. I keep being wrong. Models write better-than-average code, yet we use them to write more code - not better code (shoutout to the unmovable, always-on-top Claude Code download and install window).
45
9
543
177,687
I had exactly the same issue with FedEx and Mackage. $2200 stolen, both agreed it happened, yet refused to engage. Stopped buying from Mackage, of course.
Bought a $1,742.80 camera online from BestBuy. The FedEx delivery driver stole it. FedEx admitted it. But BestBuy won’t give a refund. They said we need to “work with local law enforcement.” Thought everyone should know if you buy from @BestBuy and a @FedEx driver steals what you paid for, your money is gone. Neither company will make it right. I’ve spent over $30K at BestBuy and will never spend another penny there.
11
12
193
19,898
We just published a paper on nitty-gritty technical SimGym details. Come and chat with us at ICML - DeepMind/Cornell/Stanford workshop. arxiv.org/pdf/2605.16116

2
1
51
5,668
And I was right! Toloka Arena finished testing - Claude 4.8 did take first place. And, just as I saw, it did so through higher reasoning budgets - look at the number of tokens used.
OK, going to call it. Spent a lot of time with Opus 4.8: 1) It is a big step forward. The base model is still inferior to GPT-5.5, but they dramatically upped the thinking budget (for Max) - makes all the difference 2) Instruction following is still worse than GPT-5.5 xhigh 3) Coding, math, reasoning - better! It's not at the Pro level (of course), but it is the first Anthropic model I can genuinely use for math/ML. The Codex app is much better (especially on Windows), but, until 5.6 arrives, I have switched to Claude Code as my main system. Hearing great things about 5.6 though!
4
2
57
12,432
I feel like I just lost a family member - the new (today's) Codex version is broken (on Windows at least), trying to load the image generation model for some reason. The world stopped :-(
12
1
78
7,880
I’ve been watching this technology grow and develop from literally before the very first idea. And I am more and more amazed every day, not less. Codex is almost a family member now.
4
3
53
4,142
Pushed a small but very useful addition to ml-tidbits today: a well-implemented, GPU-friendly ranking loss function with no sorting and full gradient propagation. Nothing new, but so often I have seen people using subpar, nonsensical variants...
For 7 years I’ve been trying to satisfactorily solve an ML problem (deterministic Gaussian Autoencoder). Tried everything. Recently I finally solved it with 5.2 Pro Extended Thinking, and I am planning to add it today to “ML tidbits” :-)
2
1
45
10,249
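To make the "no sorting, full gradient propagation" idea from the ranking-loss post above concrete, here is a minimal sketch. It is not the ml-tidbits implementation: it assumes a pairwise logistic (RankNet-style) loss, and the function name `pairwise_rank_loss` and its arguments are hypothetical. Every pair is compared through one broadcasted difference matrix, so there is no sort and every score receives a gradient.

```python
import numpy as np

def pairwise_rank_loss(scores, targets):
    """Pairwise logistic ranking loss and its gradient w.r.t. scores.

    Hypothetical illustration: penalizes every pair (i, j) where
    targets[i] > targets[j] but scores[i] is not sufficiently above scores[j].
    """
    s = np.asarray(scores, dtype=np.float64)
    y = np.asarray(targets, dtype=np.float64)

    # diff[i, j] = s_i - s_j, built by broadcasting (no sorting anywhere).
    diff = s[:, None] - s[None, :]
    # mask[i, j] = 1 where item i should be ranked above item j.
    mask = (y[:, None] > y[None, :]).astype(np.float64)
    n_pairs = max(mask.sum(), 1.0)

    # softplus(-diff) = log(1 + exp(-diff)), computed stably via logaddexp.
    loss = (mask * np.logaddexp(0.0, -diff)).sum() / n_pairs

    # d/d(diff) softplus(-diff) = -sigmoid(-diff) = -0.5 * (1 - tanh(diff / 2)).
    g = -mask * 0.5 * (1.0 - np.tanh(0.5 * diff))
    # s_i enters diff[i, :] with sign +1 and diff[:, i] with sign -1.
    grad = (g.sum(axis=1) - g.sum(axis=0)) / n_pairs
    return loss, grad

loss, grad = pairwise_rank_loss([0.2, 1.5, -0.3], [1.0, 2.0, 0.0])
```

The trade-off in this sketch is O(n²) memory for the pair matrix, which is usually acceptable per batch on a GPU; in a tensor library with autograd the same broadcasted expression would work and the hand-written gradient would be unnecessary.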
Greg is unbelievably intense and worked 80hr weeks for decades. Nothing would’ve happened without him. Besides, as Nassim Taleb would teach us, if someone keeps winning the lottery again and again - it’s not random.
Greg Brockman is the biggest lottery winner in history: he contributed nothing to ChatGPT and still made $30B from OpenAI.
32
59
1,892
209,347
Model training is a game where GPUs and data are an overwhelming advantage. So when Recraft beats xAI, DeepSeek, Meta, BFL, Microsoft, etc. with a tiny fraction of the resources, the conclusion is: big-company ML talent selection is broken. Very different from "AI experts" :-)
15
6
247
168,288
It's like Christmas came early! Except it still doesn't work :-(
🚀 Codex app 26.527 is out! 🖥 Computer Use on Windows 📱 Remote Windows control from iOS, Android & Mac 👤 Profile section with usage stats & token activity Changelog: developers.openai.com/codex/…
7
49
12,490
After a round of Authorization magic incantations it works now!
1
13
2,563
Recraft, my ex-colleagues, keep punching above their weight - #1 independent, #3 after OpenAI and Google.
Two weeks since launch, and the public benchmarks are in. Recraft is officially the #1 independent image generation lab 🚀
4
33
4,918
The fact that JAX was even mentioned makes me think of two things: 1) xAI still needs better ML people 2) PyTorch is stagnating and probably is not going to recover :-( It is such an unparalleled achievement, but it lost key people, exiled to FAIR now...
SpaceX has almost finished writing V1.0 of an in-house AI training stack in C that exact-maps to 220k GB300s with 800G NICs, making heavy use of pipeline parallelism and getting as close to bare metal as possible. The potential speed improvement vs JAX for large training runs is over an order of magnitude.
30
9
632
190,632
IRIX deserves to be reborn - it was SO ahead of its time: a real file system (XFS - still alive!), NUMA done right, kernel support for DMA in graphics (I want that for GPUs on my desktop!) and the coolest of all, the real-time subsystem (REACT/Pro). WSL -> WSI :-)!!!
Now I can say that I've RDPed from IRIX to Windows.
2
17
5,778