Alex Shtoff

Tweets

Pinned Tweet

Alex Shtoff

@AlexShtf

Mar 17

New post in my "Eigenvalues as models" series. This one asks a practical question: can eigenvalue-based models be made much cheaper to train and evaluate without collapsing into something too simple to be interesting? Dense matrices are expressive but expensive. Fully diagonal ones are cheap but too restrictive. In this post I explore a middle ground that turned out to be much more useful than I expected. It is probably the most implementation-focused entry in the series so far: structured matrices, PyTorch/SciPy plumbing, and experiments. If you care about spectral methods, differentiable numerical linear algebra, or unusual tabular model classes, this post and the entire series are for you: alexshtf.github.io/2026/03/1…

Cheaper eigenvalue training and inference

Cheaper eigenvalue training and inference with symmetric tridiagonal matrices: preserve useful expressiveness, use fast SciPy-backed PyTorch autograd, and avoid dense eigensolvers.

alexshtf.github.io
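
To make the idea in the card above concrete, here is a minimal sketch of computing eigenvalues of a symmetric tridiagonal matrix directly from its two diagonals with SciPy's `eigh_tridiagonal`, checked against a dense solver. It is an illustration only, not the code from the post: the size `n`, the random diagonals `d` and `e`, and the names `grad_d` and `grad_e` are assumptions. The gradient lines use the standard first-order formula for a simple eigenvalue of a symmetric matrix, which is one way an autograd backward pass could be built.

```python
import numpy as np
from scipy.linalg import eigh_tridiagonal

rng = np.random.default_rng(0)
n = 6
d = rng.standard_normal(n)       # main diagonal (hypothetical parameters)
e = rng.standard_normal(n - 1)   # off-diagonal (hypothetical parameters)

# Structured solver: takes the two diagonals and never forms the n x n matrix.
eigvals, eigvecs = eigh_tridiagonal(d, e)

# Dense reference, used here only to check the structured result.
T = np.diag(d) + np.diag(e, 1) + np.diag(e, -1)
dense_eigvals = np.linalg.eigvalsh(T)
max_abs_diff = float(np.max(np.abs(eigvals - dense_eigvals)))

# Gradient of a simple eigenvalue lambda_k with unit eigenvector v:
#   d lambda_k / d d[i] = v[i] ** 2
#   d lambda_k / d e[i] = 2 * v[i] * v[i + 1]
k = 0
v = eigvecs[:, k]
grad_d = v ** 2
grad_e = 2.0 * v[:-1] * v[1:]
```

The parameter count grows linearly in `n` (2n - 1 numbers) instead of quadratically, which is where the savings over a dense symmetric matrix come from.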

Alex Shtoff

@AlexShtf

Jun 9

Does Mythos pass the test?

Alex Shtoff

@AlexShtf

Apr 24

Finally, we have AGI! GPT-5.5 got the joke about the heavy tail! Kudos to @OpenAI ;)

Alex Shtoff

@AlexShtf

Jun 9

A truly phenomenal interview with the one and only Simon Peyton Jones.

Ryan Peterman

@ryanlpeterman

Jun 8

Simon Peyton Jones is the co-creator of Haskell (a pure functional programming language), and I interviewed him about functional programming, why it matters, and his thoughts on other programming languages.

In this episode:
• Useful and useless programming languages
• Rust vs C
• Haskell vs OCaml
• Why functional programming matters
• Static languages and their value for LLMs
• Why Excel is his 2nd favorite programming language

Where to watch:
• YouTube - youtu.be/xcB_LF3cdqw
• Spotify - open.spotify.com/episode/5d9…
• Apple Podcasts - podcasts.apple.com/us/podcas…
• Transcript - developing.dev/p/co-creator-…

Thank you to the sponsor of this episode for supporting my work:
• WorkOS: makes your app Enterprise Ready with easy-to-use APIs to add SSO, SCIM, RBAC, and more in just a few lines of code. Check them out at workos.com/

Chapters:
00:00 - Intro
00:39 - What functional programming is
09:18 - Downsides of functional programming
10:53 - Specialized hardware for functional programming
21:47 - Haskell is useless
25:59 - Rust vs C
28:26 - Haskell vs OCaml
35:26 - Side effects in Haskell
44:26 - Type systems
57:30 - How the Haskell compiler works
01:04:35 - Why Haskell is talked about more than used
01:09:07 - Avoiding success at all costs
01:11:12 - LLMs and programming languages
01:13:57 - New programming language design
01:15:59 - Should students continue to learn programming
01:22:33 - Why Excel is his 2nd favorite programming language
01:25:04 - Advice for his younger self

1:27:53

Alex Shtoff

@AlexShtf

Jun 6

I agree. Lines of 𝙙𝙚𝙡𝙚𝙩𝙚𝙙 code.

roon

@tszzl

Jun 4

lines of code is a better metric than people think it is. token use is a better metric than people think it is

Alex Shtoff

@AlexShtf

Jun 4

What's the story behind this?

Alex Shtoff

@AlexShtf

May 29

"Come work with us. You'll have cookies, but your kids won't have a parent."

This tweet is unavailable

Alex Shtoff

@AlexShtf

May 26

Yes

vik

@vikhyatk

May 25

too much time is being spent making optimizers marginally faster. what we really need is hparam-free optimizers

Alex Shtoff retweeted

Alex Shtoff

@AlexShtf

May 11

OK. I want to be the State Comptroller. Where do you find 10 Knesset members to recommend me?

Alex Shtoff

@AlexShtf

May 17

I propose a life ban from arXiv when there is an argmin/min/argmax/max/expectation operator without saying over **what**, or when the "over what" variable doesn't appear in the operand expression.

Evgenii Egorov

@eeevgen

May 16

Inspired by the arXiv discussions, I propose a life ban when a paper contains equations whose free indices don't match on the left-hand and right-hand sides.

Alex Shtoff

@AlexShtf

May 14

PyTorch 2.12 features much faster CUDA Hermitian eigenvalue computation - up to 100x. In case you need it :) I didn't expect my issue report to get such a prompt response from the PyTorch team... The fix was merged almost immediately and made it into the next release. Thank you, PyTorch team! pytorch.org/blog/pytorch-2-1…

Alex Shtoff

@AlexShtf

May 7

Gradient descent generated this image, demonstrating that gradient descent apparently does work :)

Gavin Brown

@gavinrbrown1

May 6

Gradient descent does not work. I will die on this hill.

Alex Shtoff

@AlexShtf

May 3

Grok being aware it's just an LLM, as an excuse for why it gave me the wrong answer :)

Alex Shtoff

@AlexShtf

Apr 30

"2026 is the year of no more slop" -- Dexter Horthy

Yes, and Zed is a hell of a landmark.

Zed

@zeddotdev

Apr 29

We've shipped more than a thousand versions of Zed, but all of them began with zero. Today, that changes. zed.dev/blog/zed-1-0

Alex Shtoff

@AlexShtf

Apr 28

OK. Meanwhile, I built this plugin that tries to mimic "Consult Pro", with the generous help of ChatGPT Pro itself. github.com/alexshtf/deep-foc… But this is just a workaround. It's nice, but it's nothing compared to the quality of ChatGPT Pro's answers.

GitHub - alexshtf/deep-focus-plugin: Codex Deep Focus skill and custom-agent bundle

Codex Deep Focus skill and custom-agent bundle. Contribute to alexshtf/deep-focus-plugin development by creating an account on GitHub.

github.com

Alex Shtoff

@AlexShtf

Apr 25

Why can't we ask Codex to "consult Pro" when it's having trouble doing the task on its own? @thsottiaux github.com/openai/codex/issu…

Alex Shtoff

@AlexShtf

Apr 28

Gemini says about itself: "it is architecturally one of the most bloated front-ends in existence". 😀

Alex Shtoff

@AlexShtf

Apr 26

"Vector search is a computational geometry / numerical systems problem dressed as an AI product"

Alex Shtoff retweeted

Alex Shtoff

@AlexShtf

Apr 26

Replying to @Yampeleg

Have you heard of model predictive control? It predates next-token prediction, works extremely well in practice, and drives our world, from planes to vehicles to finance. A famous quote by Stephen Boyd: In MPC, you solve a full planning problem using forecasts as if they were perfect. That is “quite ridiculous,” because you do not really believe the planned future trajectory will happen. But you only apply the first input/action, then observe again and re-plan. It “looks dumb,” yet works shockingly well.
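
The plan, apply the first action, observe, re-plan loop described above can be sketched in a few lines. This is a toy illustration under assumed settings, not anything from the quoted talk: a scalar linear system `x_next = a * x + b * u + noise`, a quadratic cost, and hypothetical names `plan` and `run_mpc`. The planner forecasts zero noise, as if the forecast were perfect, while the simulated system does receive noise.

```python
import numpy as np

def plan(x0, a, b, horizon, r):
    """Solve the full planning problem, pretending there is no future noise.

    Minimizes sum_k x_k**2 + r * sum_j u_j**2 over the horizon by least squares.
    """
    # Predicted states are f + G @ u, with f[k] = a**(k+1) * x0.
    f = np.array([a ** (k + 1) * x0 for k in range(horizon)])
    G = np.zeros((horizon, horizon))
    for k in range(horizon):
        for j in range(k + 1):
            G[k, j] = a ** (k - j) * b
    A = np.vstack([G, np.sqrt(r) * np.eye(horizon)])
    rhs = np.concatenate([-f, np.zeros(horizon)])
    u, *_ = np.linalg.lstsq(A, rhs, rcond=None)
    return u

def run_mpc(x0=5.0, a=1.1, b=1.0, horizon=10, r=0.1, steps=30, noise=0.1, seed=0):
    rng = np.random.default_rng(seed)
    x = x0
    trajectory = [x]
    for _ in range(steps):
        u = plan(x, a, b, horizon, r)    # plan the whole horizon
        # Apply only the first planned action; the rest of the plan is discarded.
        x = a * x + b * u[0] + noise * rng.standard_normal()
        trajectory.append(x)             # observe, then re-plan on the next step
    return np.array(trajectory)

trajectory = run_mpc()
```

Even though the open-loop system is unstable here (`a > 1`) and every plan ignores the noise, re-planning at each step keeps the state close to zero.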

Alex Shtoff

@AlexShtf

Apr 25

Why can't we ask Codex to "consult Pro" when it's having trouble doing the task on its own? @thsottiaux github.com/openai/codex/issu…

Consult ChatGPT Pro · Issue #19515 · openai/codex

What variant of Codex are you using? cli What feature would you like to see? A way to ask Codex to consult ChatGPT Pro for a better answer on a specific issue. It should be based on consent, since ...

github.com

Alex Shtoff

@AlexShtf

Apr 24

Yeah!

Mathieu

@miniapeur

Apr 24

Research, but at my own pace, on topics that are truly of interest to me, and without the pressure to publish incremental work just for the sake of publishing.
