Building LLM agents · conversational AI

Joined May 2012
VibeThinker-3B, post-trained on the Qwen2.5-Coder-3B base, scored 94.3 on AIME26, performing on par with DeepSeek V3.2, GLM-5, and Gemini 3 Pro. Small models are the future for agents: they can use tools to fetch the knowledge they lack, and they run fast and cheap.
Stellar performance from a 3B model. These results were achieved primarily through post-training refinements on Qwen2.5-Coder. The paper doesn't provide many details, but it appears they distill from RL checkpoints and then run a final instruct-oriented RL stage. 🔗arxiv.org/abs/2606.16140
denying the existence of Le Chaton Fat is the newest form of fat-shaming
The haters want to ban Le Chaton Fat, but they don't understand it's futile because Mistral implemented Recursive Self Fattening, and nothing can stop the exponential.
The French Government needs to STOP Le Chaton Fat before it's too late. This is the FAT takeoff we've been warned against for years!
According to the paper, Le Chaton Fat supports Recursive Self Fattening, which means it will keep generating fatter and fatter versions of itself 🐱
Charis Mitsakis retweeted
Replying to @kimmonismus
The EU ban is so effective that people outside the EU think Le Chaton Fat is not real. They live in denial, spreading misinformation, while we generate trillions of lines of code per second.
Charis Mitsakis retweeted
Another important thing: Chinese models are not strong because they distill US models. Distillation of models via API is *impossible*. If somebody tells you the contrary, they don't understand machine learning:
Community note
Distillation from raw LLM responses* is possible with API access only, e.g. Alpaca and Vicuna were trained this way crfm.stanford.edu/2023/03/13/alp… lmsys.org/blog/2023-03-3… *Sometimes called ‘hard distillation’ unlike ‘soft distillation’ which does use some logits arxiv.org/pdf/2601.15394…
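The distinction the note draws can be made concrete with a minimal sketch. This is an illustration only, not the training code behind Alpaca or Vicuna: the function names `hard_distillation_loss` and `soft_distillation_loss` are hypothetical, and the toy logits stand in for a real model's output over a tiny vocabulary. The point is that the "hard" loss needs only the token the teacher actually emitted (which an API returns as text), while the "soft" loss needs the teacher's full distribution.

```python
import math


def softmax(logits):
    """Convert raw logits into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def hard_distillation_loss(student_logits, teacher_token_id):
    """Cross-entropy against the single token the teacher emitted.

    Only the teacher's text output is required, so API access suffices.
    """
    probs = softmax(student_logits)
    return -math.log(probs[teacher_token_id])


def soft_distillation_loss(student_logits, teacher_logits):
    """KL(teacher || student) over the whole vocabulary.

    Requires the teacher's logits, which a text-only API does not expose.
    """
    p = softmax(teacher_logits)
    q = softmax(student_logits)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


if __name__ == "__main__":
    student = [0.2, 1.5, -0.3, 0.0]  # toy student logits (assumed values)
    teacher = [0.1, 2.0, -1.0, 0.3]  # toy teacher logits (assumed values)
    print("hard:", hard_distillation_loss(student, teacher_token_id=1))
    print("soft:", soft_distillation_loss(student, teacher))
```

In practice the hard variant amounts to ordinary supervised fine-tuning on prompt and response pairs collected from the teacher.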
Satya Nadella on AI and jobs: "In spite of all the digital systems we have at our disposal we do the glue work. We will discover the new glue work that happens with all this automation"
The EU's ban on Le Chaton Fat is so effective that most people outside the EU still think it's not a real model. Meanwhile we vibe code trillions of lines of code per second. The AI race is over.
Le Chaton Fat is real, and it has already been banned outside the EU. The haters deny it because their favorite AI lab can't create a 24T parameter LLM.
"If you wanted your nation to be at the frontier, I think you had three years to do it: February 2023 to February 2026. That was the window. And it is now closed. [...] European leaders could have built truly sovereign frontier models. [...] By failing to do so, they have consigned themselves to dependence on systems they do not own, do not control, cannot fully inspect, and may not always be allowed to use - systems that can be instantly removed on a whim."
Charis Mitsakis retweeted
Replying to @nileshtrivedi
I agree. AI sovereignty should be an important political issue, but most haven't realized its importance. AI will be critical infrastructure (like the electrical grid is now), and countries should not depend on US companies for it. China can achieve it. For small countries like Greece (my country) it's impossible, but maybe it's achievable at the EU level at least.
Charis Mitsakis retweeted
My advice to any nations that want to stay or arrive in the global top 5:
- (a) Ensure you're listening to excellent advisors.
- (b) Stop the leakage of local data.
- (c) Secure energy and compute supply for the future.
- (d) Become a prominent data sink for the rest of the world.
- (e) Stop the leakage of local talent.
- (f) Develop world-class talent around atoms and cells. Talent for manipulation of bits is necessary, but no longer sufficient.
- (g) Improve local quality of life to become a talent sink for the world.
In math you can prove new theorems just by combining existing theorems through reasoning. The fact that LLMs know more theorems than any human mathematician can helps a lot when searching for a solution, especially in cross-disciplinary search, such as using algebraic number theory to solve a discrete geometry problem. Although impressive, it doesn't mean they can solve unsolved problems in other fields (e.g. curing cancer).
OpenAI made history today. An internal reasoning model autonomously disproved a famous conjecture in mathematics that stood for nearly 80 years. The problem: In 1946, Paul Erdős asked how many pairs of points can be exactly 1 unit apart if you place n points on a flat surface. The best known answer came from square grid constructions, and Erdős himself conjectured you can't do meaningfully better. Mathematicians believed this for decades. The AI proved him wrong. It found entirely new point configurations that beat the square grid by a fixed polynomial factor: not a marginal improvement, but a real mathematical gap. The proof uses methods from algebraic number theory, a completely different branch of math: class field towers and Golod-Shafarevich theory, tools nobody expected to be relevant to a geometry problem about distances in the plane (reminds me of move 37, AlphaGo tbh). Fields Medalist Tim Gowers calls it "a milestone in AI mathematics." The proof was verified by leading external mathematicians. According to OpenAI, this is the first time AI has independently solved a prominent open research problem in mathematics! Caveat: Obviously OpenAI chose which problems to test the model on. So "autonomous" means the model generated the idea and wrote the proof, not that it wandered into the problem on its own. But if reasoning models can reliably make cross-domain connections like this, finding paths that experts didn't prioritize, this changes research far beyond math. Biology, physics, materials science, medicine. This isn't AI reproducing human knowledge anymore. This is AI producing new knowledge. That's a qualitative shift.
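To make the counting question concrete, here is a small brute-force sketch. It only illustrates what is being counted on a square grid; it is not the construction from the proof, and the helper names `square_grid` and `count_pairs_at_squared_distance` are hypothetical. Working with squared distances on integer points avoids floating-point error: pairs at squared distance d2 become unit-distance pairs once the whole grid is scaled by 1/sqrt(d2).

```python
from itertools import combinations


def square_grid(k):
    """Return the k*k integer points (x, y) with 0 <= x, y < k."""
    return [(x, y) for x in range(k) for y in range(k)]


def count_pairs_at_squared_distance(points, d2):
    """Count unordered pairs of points whose squared distance equals d2."""
    return sum(
        1
        for (x1, y1), (x2, y2) in combinations(points, 2)
        if (x1 - x2) ** 2 + (y1 - y2) ** 2 == d2
    )


if __name__ == "__main__":
    grid = square_grid(3)
    # Pairs exactly 1 apart: horizontally or vertically adjacent points.
    print(count_pairs_at_squared_distance(grid, 1))
    # Pairs at distance sqrt(5): knight-move offsets such as (1, 2).
    print(count_pairs_at_squared_distance(grid, 5))
```

The brute-force check is quadratic in the number of points, which is fine for small grids used to build intuition.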
LLMs might crack the Riemann hypothesis soon, but they still recommend you go to the car wash on foot 🤯
June 2024: The latest general-purpose LLMs could not count the r's in strawberry. July 2025: The latest general-purpose LLMs get gold in the International Math Olympiad. May 2026: The latest general-purpose LLMs solve one of the "best-known questions in combinatorial geometry".