Antonio Montano ☼

Tweets

Pinned Tweet

Antonio Montano ☼@AntoMon

26 Dec 2025

Beyond De-Skilling: Intelligence Explosion and the End of Skill as a Stable Category 4m4.it/posts/beyond-de-skill…

Beyond De-Skilling: Intelligence Explosion and the End of Skill as a Stable Category – Random Bits...

A critical commentary on The Atlantic’s The Age of De-Skilling, arguing that the article underestimates the paradigm shift introduced by accelerating artificial intelligence. Rather than a story of...

4m4.it

432

Antonio Montano ☼ retweeted

Yu Kanazawa @knzw783

18h

Brandt, S. (2026). Theory of mind and language development. Cambridge University Press. doi.org/10.1017/978100949629…

Theory of Mind and Language Development

Cambridge Core - Developmental Psychology - Theory of Mind and Language Development

cambridge.org

752

Antonio Montano ☼ retweeted

ninad

@ninaddaithankar

13h

Can a vision model learn to see with no augmentations, no masking, no cropping, no reconstruction? 🎬 It can! Introducing Temporal Difference in Vision (TDV), a new visual representation learning paradigm built on a single assumption: the past causes the future.

TL;DR:
- We introduce TDV, the first approach to learn useful representations without any augmentations, masking, cropping or pixel based reconstruction.
- TDV matches SOTA recipes like DINO and iBOT on dense spatial tasks
- We also show that as data scales up, weaker assumptions work better.

🧵Thread:

196

8,621

Antonio Montano ☼ retweeted

fly51fly @fly51fly

[LG] How Post-Training Shapes Biological Reasoning Models L Fesser, H Zhang, M M. Li, E Wang… [Harvard University] (2026) arxiv.org/abs/2606.16517

262

Antonio Montano ☼ retweeted

Machine Learning (ML) Papers @Memoirs

13h

Neural Variability Enhances Artificial Network Robustness Robin Preble, Praveen Venkatesh, Stefan Mihalas, Kameron Decker Harris arxiv.org/abs/2606.13801 [𝚌𝚜.𝙻𝙶 𝚚-𝚋𝚒𝚘.𝙽𝙲]

Neural Variability Enhances Artificial Network Robustness

Neural responses in cortex exhibit substantial trial-to-trial variability in response to repeated stimuli, while peripheral sensory neurons respond far more consistently, leading many to wonder...

arxiv.org

145

Antonio Montano ☼ retweeted

fly51fly @fly51fly

[LG] ExpRL: Exploratory RL for LLM Mid-Training V Xiang, A Setlur, C Blagden, N Haber, A Kumar [Stanford University & CMU & OpenAI] (2026) arxiv.org/abs/2606.17024

602

Antonio Montano ☼ retweeted

Machine Learning (ML) Papers @Memoirs

Gefen: Optimized Stochastic Optimizer Nadav Benedek, Tomer Koren, Ohad Fried arxiv.org/abs/2606.13894 [𝚌𝚜.𝙻𝙶 𝚌𝚜.𝙰𝙸 𝚌𝚜.𝙲𝙻 𝚌𝚜.𝙲𝚅]

Gefen: Optimized Stochastic Optimizer

AdamW is a default optimizer for modern deep learning, but its first and second moment states add roughly two parameter-sized buffers to training memory. We propose Gefen, a memory-efficient...

arxiv.org

Antonio Montano ☼ retweeted

Software Engineering Papers @ComputerPapers

Agent trajectories as programs: fingerprinting and programming coding-agent behavior Hamidah Oderinwale arxiv.org/abs/2606.16988 [𝚌𝚜.𝚂𝙴 𝚌𝚜.𝙻𝙶]

Agent trajectories as programs: fingerprinting and programming...

Benchmark scores tell you what an agent got right; they do not tell you how it got there. In this work, we introduce methods for comparing agents procedurally in different contexts, where the...

arxiv.org

120

Antonio Montano ☼ retweeted

Software Engineering Papers @ComputerPapers

19h

Specifications for Humans, Agents, and Tooling Mark Marron arxiv.org/abs/2606.15084 [𝚌𝚜.𝚂𝙴]

Specifications for Humans, Agents, and Tooling

Specifications are the central mechanism for communicating intents, requirements, and constraints in software development. When they are explicit, clear, and reliable, they are an effective means...

arxiv.org

136

Antonio Montano ☼ retweeted

Machine Learning (ML) Papers @Memoirs

10h

Natively Unlearnable Large Language Models Gaurav R. Ghosal, Pratyush Maini, Aditi Raghunathan arxiv.org/abs/2606.13873 [𝚌𝚜.𝙻𝙶 𝚌𝚜.𝙲𝙻]

Natively Unlearnable Large Language Models

Unlearning aims to remove the influence of specific training data sources, but this has proved challenging because the contributions of different sources are entangled within the model. Isolating...

arxiv.org

208

Antonio Montano ☼ retweeted

Jayden Teoh

@jayden_teoh_

12h

Next-token prediction is myopic. What if transformers learn to predict their own next latent state? 🌠 We present 𝗡𝗲𝘅𝘁-𝗟𝗮𝘁𝗲𝗻𝘁 𝗣𝗿𝗲𝗱𝗶𝗰𝘁𝗶𝗼𝗻 (𝗡𝗲𝘅𝘁𝗟𝗮𝘁): a self-supervised learning method that teaches transformers to form compact world models for reasoning and planning. It also unlocks up to 3.3x faster inference via self-speculative decoding! 🚀

[Image: illustration of next-latent prediction vs. other predictive mechanisms]

156

995

67,438

Antonio Montano ☼ retweeted

Badr AlKhamissi @bkhmsi

15h

🚨 New Preprint! 🧠 We gave an AI model one simple rule: rearrange your neurons so that nearby ones respond alike. We never told it what a face, a voice, or a sentence was. It grew brain-like maps for all three anyway. 🧵👇 🌐 Website: topo-omni.epfl.ch

7,081

Antonio Montano ☼@AntoMon

14h

The New Cognitive Infrastructure of Science – Random Bits of Knowledge 4m4.it/posts/new-cognitive-i…

The New Cognitive Infrastructure of Science – Random Bits of Knowledge

This article interprets recent progress in AI-assisted mathematics as an early signal of a broader transformation: the emergence of a new cognitive infrastructure for science. Starting from Tanya...

4m4.it

Antonio Montano ☼ retweeted

Xichen Pan

@xichen_pan

Jun 16

Modern text-to-image models are increasingly powered by large pretrained LLMs. But there is a curious mismatch: the LLM typically encodes the prompt only once, while the evolving noisy latent states are handled entirely by a newly trained generative backbone. Can pretrained multimodal prior participate in the denoising process? Introducing RepFusion. (1/12) 📄 arxiv.org/abs/2606.14700 🌐 xichenpan.com/repfusion/

15,748

Antonio Montano ☼ retweeted

SIAM Activity Group on Dynamical Systems @DynamicsSIAM

18h

"Learning the Geometry of Data: A Mathematical Review of Shape Space Analysis" (by Gary P. T. Choi, Khanh Dao Duc, Shira Faigenbaum-Golovin, Karen Habermann, Emmanuel Hartman, Christoph von Tycowicz, Chi Zhang, Wenjun Zhao, Felix Zhou): arxiv.org/abs/2606.17022

Learning the Geometry of Data: A Mathematical Review of Shape...

A central objective of machine learning is to identify structure and patterns in data. Advances in data acquisition have increasingly produced datasets whose observations possess rich geometric...

arxiv.org

4,479

Antonio Montano ☼ retweeted

Amit LeVi

@AmitLeViAI

Jun 15

VLMs Systematically Fake Visual Understanding Even when VLMs appear to be good at visual understanding, most of their answers are not actually grounded in the image (hallucinated!). We identify two types of hallucinations that appear in up to 98% of answers that seem to demonstrate visual understanding. First, textual biases. The model answers using language patterns, information in the question, and knowledge learned during training, without engaging its visual representations. Second, spurious images. The model constructs false visual content inside its internal representation and then answers as if this imagined content were grounded in the real image. In both cases, the answers may still be correct, but they are not grounded in the visual input at all!!

5,475

Antonio Montano ☼ retweeted

Fabien @Fabien_Mikol

19h

Cédric Villani last month: LLMs are not intelligent and understand nothing of what they say; they are merely statistical machines reducible to functions; the proof of their lack of intelligence is that these models get the car-to-wash example wrong...

6:20

Fabien @Fabien_Mikol

7 Dec 2025

In 6 years Villani has not changed his line at all. His 2024 lecture is a good illustration. Thread with excerpts 🧵 For him, AIs are "only functions f(x)=y" and therefore "no more intelligent than the formula that calculates your taxes". That's just how it is 🤷‍♂️

1:29

557

396,855

Antonio Montano ☼ retweeted

Grigory Sapunov

@che_shr_cat

Jun 15

1/ Standard transformers have a fundamental topological flaw: they cannot track dynamic states over time without running out of layers. Once a state representation reaches the top layer of the feedforward stack, the model's ability to update its belief collapses. 🧵

645

119,561

Antonio Montano ☼ retweeted

Gabriel Peyré

@gabrielpeyre

21h

The alpha version of my new book "Optimal Transport for Machine Learners" is out, with in particular an online version with interactive figures gpeyre.com/ot4ml/

407

27,898

Antonio Montano ☼@AntoMon

19h

When a Nobel Laureate Uses an LLM to Prove a Theorem: A Turning Point for Mathematical Discovery – Random Bits of Knowledge 4m4.it/posts/when-a-nobel-la…

When a Nobel Laureate Uses an LLM to Prove a Theorem: A Turning Point for Mathematical Discovery –...

This article analyzes Giorgio Parisi and Francesco Zamponi’s 2026 paper, A Proof of an Identity for the Critical Exponents of Jamming, as a turning point in AI-assisted mathematical discovery. The...

4m4.it

Antonio Montano ☼ retweeted

Xingyi Yang @yxy2168

23h

🚨🌍 World models are surprisingly fragile! We introduce BadWorld, an adversarial attack for visual world models. A tiny perturbation to the starting image 🖼️ can break down the whole world. Code: github.com/LinghuiiShen/BadW… Paper: huggingface.co/papers/2606.1… Arxiv: arxiv.org/abs/2606.16519

0:29

136

24,883