orientino

Tweets

Alexander Theus retweeted

orientino @orientino_

Apr 23

#ICLR2026 Into mode connectivity, model merging, or permutation invariance? We show how optimization dynamics shape the loss landscape of merged weights. Come check it out! 📅 23/04 10:30 AM – 1:00 PM 📍 Pavilion 3 P3-1809 w/ @TheusResearch @DamienTeney @orvieto_antonio

170

Alexander Theus retweeted

Weight Space Symmetries @ ICML 2026 @weightsymmetry

Apr 14

📢 Submissions are OPEN for the Weight Space Symmetry Workshop @icmlconf! ⏰ Deadline extended → April 30 (23:59 AoE) Consider submitting any work related to weight symmetries: optimization, model merging, weight space learning, and so on! #ICML2026 #weightsymmetry2026

6,067

Alexander Theus retweeted

Weight Space Symmetries @ ICML 2026 @weightsymmetry

Mar 30

📢 Excited to announce the Workshop on Weight-Space Symmetries @icmlconf! We welcome 4-page submissions analysing symmetries, their effects on training and model structure, and practical methods to utilize them. Submission Deadline: April 24 (23:59 AoE) #ICML2026

21,774

Alexander Theus @TheusResearch

19 Sep 2025

Excited to announce that our paper has been accepted as an Oral at NeurIPS 2025! 🥳

Alexander Theus @TheusResearch

9 Jul 2025

1/ 🚨 New paper alert! 🚨 We explore a key question in deep learning: Can independently trained Transformers be linearly connected in weight space — without a loss barrier? Yes — if you uncover their rich symmetries. 📄 arXiv: arxiv.org/abs/2506.22712

1,131

Alexander Theus @TheusResearch

9 Jul 2025

9/ 🔑 Takeaway: Transformers can be linearly connected — but only if you exploit richer network symmetries. We show that general symmetry alignment (not just permutations) unlocks low-loss paths across ViTs and GPT-2.

522

Alexander Theus @TheusResearch

9 Jul 2025

10/ 📄 Paper: arxiv.org/abs/2506.22712 By: @Theus__A , Alessandro Cabodi, @SAnagnostidis , @orvieto_antonio , @unregularized , and @val_boeva 🙏 Huge thanks to my amazing co-authors for this collaboration! #Transformers #LMC #MachineLearning #DeepLearning

Generalized Linear Mode Connectivity for Transformers

Understanding the geometry of neural network loss landscapes is a central question in deep learning, with implications for generalization and optimization. A striking phenomenon is linear mode...

arxiv.org

557
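
The thread's central question, whether two trained models can be joined by a straight line in weight space without a loss barrier, can be illustrated with a toy sketch. This is not the paper's method and does not involve Transformers: it covers only the simplest symmetry (permuting hidden units of a tiny tanh MLP), and every name in it (`mlp`, `permute`, `loss_barrier`, the hand-picked weights) is a hypothetical choice made for illustration. The barrier is defined here, as one common convention, as the largest gap between the loss on the straight path and the linear interpolation of the endpoint losses.

```python
import math


def mlp(params, x):
    """One-hidden-layer tanh network mapping a scalar to a scalar."""
    w1, b1, w2 = params
    return sum(w2[j] * math.tanh(w1[j] * x + b1[j]) for j in range(len(w1)))


def mse(params, data):
    """Mean squared error of the network on (x, y) pairs."""
    return sum((mlp(params, x) - y) ** 2 for x, y in data) / len(data)


def permute(params, perm):
    """Reorder hidden units: new unit j is old unit perm[j]."""
    return tuple([v[j] for j in perm] for v in params)


def interpolate(pa, pb, t):
    """Point (1 - t) * pa + t * pb on the straight line in weight space."""
    return tuple(
        [(1 - t) * a + t * b for a, b in zip(va, vb)] for va, vb in zip(pa, pb)
    )


def loss_barrier(pa, pb, data, steps=21):
    """Largest excess of path loss over the interpolated endpoint losses."""
    la, lb = mse(pa, data), mse(pb, data)
    barrier = 0.0
    for i in range(steps):
        t = i / (steps - 1)
        gap = mse(interpolate(pa, pb, t), data) - ((1 - t) * la + t * lb)
        barrier = max(barrier, gap)
    return barrier


# Hand-picked illustrative weights: (input weights, biases, output weights).
model_a = ([1.0, -2.0, 0.5], [0.0, 0.5, -1.0], [1.5, -1.0, 2.0])

# Model B is a hidden-unit permutation of A, so it computes the same function.
perm = [2, 0, 1]
model_b = permute(model_a, perm)

# Targets come from model A itself, so both endpoints have (near) zero loss.
data = [(x, mlp(model_a, x)) for x in (i * 0.5 for i in range(-4, 5))]

# Naive interpolation averages weights of mismatched units.
naive_barrier = loss_barrier(model_a, model_b, data)

# Undo the permutation before interpolating (symmetry alignment).
inverse = [perm.index(k) for k in range(len(perm))]
aligned_b = permute(model_b, inverse)
aligned_barrier = loss_barrier(model_a, aligned_b, data)

print(f"barrier without alignment: {naive_barrier:.4f}")
print(f"barrier with alignment:    {aligned_barrier:.2e}")
```

In this toy case the two endpoints are the same function, yet the unaligned path passes through high-loss weights, and undoing the permutation removes the barrier. The thread's claim is that for Transformers permutations alone are not enough and a broader class of symmetries has to be aligned; the sketch does not attempt that.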