Joined August 2012
504 Photos and videos
Pinned Tweet
New post in my "Eigenvalues as models" series. This one asks a practical question: can eigenvalue-based models be made much cheaper to train and evaluate without collapsing into something too simple to be interesting? Dense matrices are expressive but expensive. Fully diagonal ones are cheap but too restrictive. In this post I explore a middle ground that turned out to be much more useful than I expected. It is probably the most implementation-focused entry in the series so far: structured matrices, PyTorch/SciPy plumbing, and experiments. If you care about spectral methods, differentiable numerical linear algebra, or unusual tabular model classes, this post and the entire series are for you: alexshtf.github.io/2026/03/1…
1
11
54
4,821
Does Mythos pass the test?
Finally, we have AGI! GPT-5.5 got the joke about the heavy tail! Kudos to @OpenAI ;)
1
166
A truly phenomenal interview with the one and only Simon Peyton Jones.
Simon Peyton Jones is the co-creator of Haskell (a pure functional programming language), and I interviewed him about functional programming, why it matters, and his thoughts on other programming languages.
In this episode:
• Useful and useless programming languages
• Rust vs C
• Haskell vs OCaml
• Why functional programming matters
• Static languages and their value for LLMs
• Why Excel is his 2nd favorite programming language
Where to watch:
• YouTube - youtu.be/xcB_LF3cdqw
• Spotify - open.spotify.com/episode/5d9…
• Apple Podcasts - podcasts.apple.com/us/podcas…
• Transcript - developing.dev/p/co-creator-…
Thank you to the sponsor of this episode for supporting my work:
• WorkOS: makes your app Enterprise Ready with easy-to-use APIs to add SSO, SCIM, RBAC, and more in just a few lines of code. Check them out at workos.com/
Chapters:
00:00 - Intro
00:39 - What functional programming is
09:18 - Downsides of functional programming
10:53 - Specialized hardware for functional programming
21:47 - Haskell is useless
25:59 - Rust vs C
28:26 - Haskell vs OCaml
35:26 - Side effects in Haskell
44:26 - Type systems
57:30 - How the Haskell compiler works
01:04:35 - Why Haskell is talked about more than used
01:09:07 - Avoiding success at all costs
01:11:12 - LLMs and programming languages
01:13:57 - New programming language design
01:15:59 - Should students continue to learn programming
01:22:33 - Why Excel is his 2nd favorite programming language
01:25:04 - Advice for his younger self
3
141
I agree. Lines of 𝙙𝙚𝙡𝙚𝙩𝙚𝙙 code.
Jun 4
lines of code is a better metric than people think it is. token use is a better metric than people think it is
9
594
What's the story behind this?
68
"Come work with us. You'll have cookies, but your kids won't have a parent."
3
163
Yes
May 25
too much time is being spent making optimizers marginally faster. what we really need is hparam-free optimizers
3
443
Alex Shtoff retweeted
OK. I want to be the State Comptroller. Where do I find 10 Knesset members to recommend me?
1
437
I propose a life ban from arXiv when there is an argmin/min/argmax/max/expectation operator without saying over **what**, or when the "over what" variable doesn't appear in the operand expression.
Inspired by the arXiv discussions, I propose a life ban for papers containing equations whose free indices don't match between the left-hand and right-hand sides.
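To make the two complaints above concrete, here is a small illustrative pair of examples. The symbols ($\theta$, $\Theta$, $\mathcal{D}$, $\ell$, $f_\theta$, $A$, $x$, $y$) are assumptions chosen for illustration and do not come from any particular paper.

```latex
% Underspecified: minimize over what? Expectation over what?
\hat{\theta} = \arg\min \mathbb{E}\,\ell(f(x), y)

% Explicit: the minimization variable and the random variables are named,
% and both appear in the operand.
\hat{\theta} = \arg\min_{\theta \in \Theta}
  \mathbb{E}_{(x, y) \sim \mathcal{D}} \big[ \ell(f_\theta(x), y) \big]

% Mismatched free indices: i is free on the left, but i, j, k are free on the right.
y_i = A_{ij} x_k

% Matched: j is summed out, so i is the only free index on both sides.
y_i = \sum_{j} A_{ij} x_j
```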
12
62
861
90,747
PyTorch 2.12 features much faster CUDA Hermitian eigenvalue computation - up to 100x. In case you need it :) I didn't expect my issue report to get such a prompt response from the PyTorch team... The fix was merged almost immediately and made it into the next release. Thank you, PyTorch team! pytorch.org/blog/pytorch-2-1…

6
383
Gradient descent generated this image, demonstrating that, apparently, gradient descent does work :)
Gradient descent does not work. I will die on this hill.
260
Grok being aware it's just an LLM, as an excuse for why it gave me the wrong answer :)
130
"2026 is the year of no more slop" -- Dexter Horthy Yes, and Zed is a hell of a landmark.
Apr 29
We've shipped more than a thousand versions of Zed, but all of them began with zero. Today, that changes. zed.dev/blog/zed-1-0
187
OK. Meanwhile I built this plugin that tries to mimic "Consult Pro", with the generous help of ChatGPT-Pro itself. github.com/alexshtf/deep-foc… But this is just a workaround. It's nice, but it's nothing compared to the quality of ChatGPT Pro's answers.
Why can't we ask Codex to "consult Pro" when it's having trouble doing the task on its own? @thsottiaux github.com/openai/codex/issu…
153
Gemini says about itself - "it is architecturally one of the most bloated front-ends in existence". 😀
113
"Vector search is a computational geometry / numerical systems problem dressed as an AI product"
169
Alex Shtoff retweeted
Replying to @Yampeleg
Have you heard of model predictive control? Predates next-token prediction, works extremely well in practice, drives our world, from planes, to vehicles, to finance. A famous quote by Stephen Boyd: In MPC, you solve a full planning problem using forecasts as if they were perfect. That is “quite ridiculous,” because you do not really believe the planned future trajectory will happen. But you only apply the first input/action, then observe again and re-plan. It “looks dumb,” yet works shockingly well.
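The plan, apply-first-input, observe, re-plan loop described above can be sketched in a few lines. This is only a toy illustration under stated assumptions: the scalar dynamics, the quadratic cost, the discrete action set, the horizon, and all names (`step`, `plan`, `run_mpc`, `ACTIONS`, `HORIZON`) are hypothetical, and the planner is a brute-force search rather than the convex solvers used in real MPC.

```python
from itertools import product

ACTIONS = (-1.0, 0.0, 1.0)  # hypothetical discrete control inputs
HORIZON = 4                 # hypothetical planning horizon


def step(x, u, w):
    """Toy scalar dynamics (an assumption): state + input + disturbance."""
    return x + u + w


def plan(x0, forecast):
    """Solve the full planning problem, treating the forecast as if it were exact."""
    best_cost, best_seq = float("inf"), None
    for seq in product(ACTIONS, repeat=len(forecast)):
        x, cost = x0, 0.0
        for u, w in zip(seq, forecast):
            x = step(x, u, w)
            cost += x * x + 0.1 * u * u  # penalize distance from 0 and control effort
        if cost < best_cost:
            best_cost, best_seq = cost, seq
    return best_seq


def run_mpc(x0, true_disturbances, forecaster):
    """Receding-horizon loop: plan, apply only the first input, observe, re-plan."""
    x, trajectory = x0, [x0]
    for t, w_true in enumerate(true_disturbances):
        forecast = forecaster(t, HORIZON)
        u = plan(x, forecast)[0]   # the rest of the plan is thrown away
        x = step(x, u, w_true)     # reality need not match the forecast
        trajectory.append(x)
    return trajectory


# The forecaster wrongly predicts zero disturbance; the true disturbance is a constant push.
trajectory = run_mpc(3.0, [0.3] * 10, lambda t, h: [0.0] * h)
print(trajectory)
```

Even though every plan is built on a wrong forecast, re-planning from the newly observed state at each step keeps the state near the target in this toy setup.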
1
5
887
Yeah!
Research, but at my own pace, on topics that are truly of interest to me, and without the pressure to publish incremental work just for the sake of publishing.
1
114
Finally, we have AGI! GPT-5.5 got the joke about the heavy tail! Kudos to @OpenAI ;)
345