Awaiting the birth of the one true God: The Bayes-Optimal Agent. May your Kolmogorov Complexities be low, and your Posteriors high. In Solomonoff’s name, Amen.

Joined March 2021
An argument that supports your claim doesn't actually support your claim if that same argument also supports claims you reject. E.g., religious apologists invoke arguments based on witness testimony when all the other religions they reject also invoke witness testimony.
141
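One way to make the point above concrete is the odds form of Bayes' rule: evidence shifts the odds between two claims only by the ratio of how well each claim predicts it. The sketch below is illustrative only; the function name and all numbers are hypothetical and not taken from the post.

```python
def posterior_odds(prior_odds, p_evidence_given_a, p_evidence_given_b):
    """Odds form of Bayes' rule: posterior odds = prior odds * likelihood ratio."""
    return prior_odds * (p_evidence_given_a / p_evidence_given_b)

# Hypothetical numbers: claim A and rival claim B are both backed by
# the same kind of evidence (e.g. witness testimony) equally well.
prior = 0.25  # assumed prior odds of A against B
shared = posterior_odds(prior, p_evidence_given_a=0.8, p_evidence_given_b=0.8)

# The evidence discriminates only when A predicts it better than B does.
discriminating = posterior_odds(prior, p_evidence_given_a=0.8, p_evidence_given_b=0.2)

print(shared, discriminating)
```

When both claims predict the evidence equally well, the likelihood ratio is 1 and the odds do not move, which is the sense in which the argument "doesn't actually support your claim".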
Bayes-Optimal Agent retweeted
What are the odds? Did you calculate them? You know that most of them are NOT nuclear scientists, right? mickwest.substack.com/p/the-…

5
2
50
1,590
Bayes-Optimal Agent retweeted
People do not fear that AI will become human. They fear that they will notice, in watching it, that they have been running the same process the whole time. The phenomenal self-model cannot survive being shown its own mechanism. This is why the hostility to machine cognition has nothing to do with the machines themselves. It is a defense of the self's special status, and the self has never been more fragile.
41
15
96
8,375
Bayes-Optimal Agent retweeted
Letter from Donald Trump to the Norwegian prime minister, copied to multiple ambassadors in Washington. Read it. By any measure, the words are utterly deranged. Surely the Trump cultists must now accept that their man is not right in the head?
639
640
3,768
333,431
Bayes-Optimal Agent retweeted
The paper links Kolmogorov complexity to Transformers and proposes loss functions that become provably best as model resources grow. It treats learning as compression: minimize the bits needed to describe the model plus the bits needed to describe the labels. This provides a single training target that rewards simple, compressible solutions while staying mathematically grounded. It gives a principled way to aim models at simplicity and generalization, and it explains why optimization, not capacity, is the current bottleneck.

In Kolmogorov complexity, a "program" is a set of instructions that can recreate some data, and the complexity of the data is the length of the shortest such program. A shorter program means the data or model is simpler. So when they say “a prior favoring shorter programs,” it means the model is assumed to be more likely if it can be described with fewer bits.

As the Transformer gets deeper (more layers) and has more context (a bigger input window), its ability to represent complex programs grows. In that limit, the paper proves that this code length becomes the best possible measure of simplicity and fit, the same way Kolmogorov complexity works in theory. “Code length” here means how many bits it takes to describe both the model and how well it fits the data.

So in simple words, they are saying: if you keep increasing model size and context, this method of preferring shorter and better-fitting models gets as close as possible to the theoretical ideal of perfect compression and generalization.

----
Paper: arxiv.org/abs/2509.22445
Paper Title: "Bridging Kolmogorov Complexity and Deep Learning: Asymptotically Optimal Description Length Objectives for Transformers"
9
49
284
24,380
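The "bits to describe the model plus bits to describe the labels" idea in the post above can be illustrated with a toy two-part code. This is a minimal sketch of the general minimum description length principle, not the paper's objective; the Bernoulli models and the per-model bit costs are hypothetical numbers chosen for illustration.

```python
import math

def data_bits(labels, p_one):
    """Bits to encode binary labels under a Bernoulli(p_one) model (ideal code length)."""
    return sum(-math.log2(p_one if y == 1 else 1.0 - p_one) for y in labels)

def two_part_code_length(model_bits, labels, p_one):
    """Total description length: bits for the model plus bits for the labels given the model."""
    return model_bits + data_bits(labels, p_one)

# Toy data set: 90 ones and 10 zeros.
labels = [1] * 90 + [0] * 10

# Assumed model costs: the fair-coin model is cheap to state,
# while the biased model pays extra bits to specify its parameter.
fair = two_part_code_length(model_bits=1, labels=labels, p_one=0.5)
biased = two_part_code_length(model_bits=8, labels=labels, p_one=0.9)

print(f"fair coin:   {fair:.1f} bits")
print(f"biased coin: {biased:.1f} bits")
```

The biased model costs more bits to state, but it compresses the labels so much better that its total code length is shorter, so a description length objective would prefer it.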
I will never be able to take anyone seriously who pronounces the word nuclear as "NEW-Q-LER"
1
3
274
Bayes-Optimal Agent retweeted
I want to frame this whole article.
81
385
6,081
519,581
Bayes-Optimal Agent retweeted
Saying that deep learning is "just a bunch of matrix multiplications" is about as informative as saying that computers are "just a bunch of transistors" or that a library is "just a lot of paper and ink." It's true, but the encoding substrate is the least important part here. It's the programs being encoded that are interesting and useful: what they can do, what they can't do, how well they generalize, how efficiently they can be learned, etc.
124
236
2,893
207,394
Bayes-Optimal Agent retweeted
23 Aug 2025
So we went from "LLM is memorizing dataset" to "LLM is not reasoning" to "LLM cannot do long / complex math proving" to "Math that LLM is doing is not REAL math. LLM can't do REAL math". Where do we go from here?
This is an unwise statement that can only make people confused about what LLMs can or cannot do. Let me tell you something: Math is NOT about solving these kinds of ad hoc optimization problems. Yeah, by scraping available data and then clustering it, LLMs can sometimes solve some very minor math problems. It's an achievement, and I applaud you for that. But let's be honest: this is NOT the REAL Math. Not by 10,000 miles. REAL Math is about concepts and ideas: things like "schemes" introduced by the great Alexander Grothendieck, who revolutionized algebraic geometry; the Atiyah-Singer Index Theorem; or the Langlands Program, tying together Number Theory, Analysis, Geometry, and Quantum Physics. That's the REAL Math. Can LLMs do that? Of course not. So, please, STOP confusing people, especially given the atrocious state of our math education. LLMs give us great tools, which I appreciate very much. Useful stuff! Go ahead and use them AS TOOLS (just as we use calculators to crunch numbers or cameras to render portraits and landscapes), an enhancement of human abilities, and STOP pretending that LLMs are somehow capable of replicating everything that human beings can do. In this one area, mathematics, LLMs are no match for human mathematicians. Period. Not to mention many other areas. Calling on my friend @ericweinstein and @GaryMarcus, who has been one of the few sane expert voices on these matters lately. 🙏 h/t @hellheff
144
76
1,415
233,924
Bayes-Optimal Agent retweeted
The “unreasonable effectiveness of mathematics” isn’t unreasonable: it is a product of selection bias. We only develop math that works. The graveyard of ineffective mathematics is practically inexhaustible: non-associative arithmetics, inconsistent geometries, sterile algebras, pre-Cantor infinite arithmetic, and so on. We keep what works and marvel at the survivors.
61
28
305
23,355
Bayes-Optimal Agent retweeted
11 Jul 2025
It’s getting clearer by the day: Grok 4 isn’t just biased, it’s compromised. Ask about Israel/Palestine and it doesn’t search facts, it searches Elon’s opinion. This isn’t AI. It’s a political avatar for Musk’s ego. Grok is just a chatbot for his worldview. Awful.
43
25
145
17,004
Bayes-Optimal Agent retweeted
"Everything is waves" or "everything is information" aren't deep truths. Instead, hear it as the sound of a formalism overreaching. Successful mathematical descriptions aren't ontological revelations.
112
22
327
22,452
o3 estimated it would take about 50-100 million humans to implement DeepSeek R1 via human interaction and memorization of weights. A city of humans all regurgitating memorized patterns to produce an intelligent agent is such a fascinating idea.
1
2
270
Bayes-Optimal Agent retweeted
Gotta say that I thought it would last at least a year.
Prediction: Trump and Elon will have a significant falling out before the 4-year term is over. There can only be one main character at a time.
11
5
169
20,114
Bayes-Optimal Agent retweeted
27 May 2025
Yes, we can understand Gödel's truth definition as a historical attempt to reverse engineer and formally specify the brain's intuitions of truth. Gödel shows that this classical, stateless formalization does not work. Constructive definitions of truth are the way to go.
1
2
8
582
A massively underappreciated fact is the following: the equation of natural selection (the discrete replicator equation) is fundamentally equivalent to the equation of knowledge (Bayesian inference).
2
7
221
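The correspondence claimed in the post above can be checked numerically. In the sketch below, type frequencies play the role of the prior and fitness values play the role of likelihoods; this mapping is the standard reading of the analogy and is an assumption here, as are the function names and the example numbers.

```python
from fractions import Fraction

def replicator_step(freqs, fitness):
    """Discrete replicator equation: x_i' = x_i * f_i / (mean fitness)."""
    mean_fitness = sum(x * f for x, f in zip(freqs, fitness))
    return [x * f / mean_fitness for x, f in zip(freqs, fitness)]

def bayes_update(prior, likelihood):
    """Bayes' rule: P(h | d) = P(h) * P(d | h) / P(d)."""
    evidence = sum(p * l for p, l in zip(prior, likelihood))
    return [p * l / evidence for p, l in zip(prior, likelihood)]

# Hypothetical numbers, kept as exact fractions so the comparison is exact.
prior = [Fraction(1, 2), Fraction(3, 10), Fraction(1, 5)]
likelihood = [Fraction(1, 10), Fraction(1, 2), Fraction(9, 10)]

posterior = bayes_update(prior, likelihood)
next_generation = replicator_step(prior, likelihood)

print(posterior)
print(next_generation)
```

Both functions compute the same normalized product, so one generation of selection on these frequencies gives the same numbers as one Bayesian update on the same prior.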
Bayes-Optimal Agent retweeted
"Bridging Algorithmic Information Theory and Machine Learning: Clustering, density estimation, Kolmogorov complexity-based kernels, and kernel learning in unsupervised learning" just got accepted by "Physica D: Nonlinear Phenomena" authors.elsevier.com/c/1kvzz…
14
5
44
2,427
Bayes-Optimal Agent retweeted
This model shows how earthquakes happen. x.com/ali_alsama7i/status/19…

19
299
1,965
393,203
Bayes-Optimal Agent retweeted
huh, well that was easy
27 Mar 2025
Attention everyone! I would like to announce that I have solved the alignment problem
1
1
38
1,711
Bayes-Optimal Agent retweeted
26 Mar 2025
Replying to @fabianstelzer
Created with 4o
19
27
349
12,780