Understandable Machine Intelligence Lab

Understandable Machine Intelligence Lab @UMI_Lab_AI

22 Sep 2025

🔊 Not to miss: last month @anna_hedstroem defended her PhD “Evaluation-centric advances in neural model interpretability” at TU Berlin, with distinction! ✨🧠💻☕️ Here’s a thread on a selection of Anna’s evaluation-centric interpretability work and what comes next. 🧵

Understandable Machine Intelligence Lab @UMI_Lab_AI

22 Sep 2025

Congrats @anna_hedstroem on an outstanding PhD journey! 🎓🧠💻☕️ Full list of works here: 👉 scholar.google.com/citations… Website: annahedstroem.com/

Anna Hedström

ETH Zürich - Cited by 768 - AI Safety - Interpretability - Evaluation - Alignment

scholar.google.com

Understandable Machine Intelligence Lab @UMI_Lab_AI

22 Sep 2025

📍 Now @anna_hedstroem is a Postdoctoral Fellow at the @ETH_AI_Center, working with the @ivia_lab and Learning & Adaptive Systems (LAS) group. Anna's focus ahead: evaluation-centric interpretability, LLM steering, and AI safety. ✨🧠💻☕️ More info: annahedstroem.com/.

Understandable Machine Intelligence Lab retweeted

Laura Kopf @lkopf_ml

19 Sep 2025

Happy to share that our PRISM paper has been accepted at #NeurIPS2025 🎉 In this work, we introduce a multi-concept feature description framework that can identify and score polysemantic features. 📄 Paper: arxiv.org/abs/2506.15538 #NeurIPS #MechInterp #XAI

Understandable Machine Intelligence Lab @UMI_Lab_AI

24 Jul 2025

🎉 Huge congratulations to @kirill_bykov, the very first PhD student of our lab, who successfully defended his thesis “Explaining Representations in Deep Neural Networks” this Monday with summa cum laude! 👏 🧵 In the next tweets, we’ll highlight some of his key works:

Understandable Machine Intelligence Lab @UMI_Lab_AI

24 Jul 2025

🧐 DORA: Exploring Outlier Representations in Deep Neural Networks (TMLR 2023) A framework for analyzing & detecting learned representations in neural networks. 👉 arxiv.org/abs/2206.04530

DORA: Exploring Outlier Representations in Deep Neural Networks

Deep Neural Networks (DNNs) excel at learning complex abstractions within their internal representations. However, the concepts they learn remain opaque, a problem that becomes particularly acute...

arxiv.org

Understandable Machine Intelligence Lab @UMI_Lab_AI

24 Jul 2025

📚 During his PhD, Kirill co-authored 11 papers spanning interpretability, neuron analysis & robust explanations. You can find all of them on his Google Scholar: 👉 scholar.google.com/citations… Once again, congrats @kirill_bykov on an outstanding PhD journey! 🎓✨

Kirill Bykov

TU Munich - Cited by 283 - Machine Learning - Explainable AI - Interpretable ML - Mechanistic Interpretability

scholar.google.com

Understandable Machine Intelligence Lab @UMI_Lab_AI

19 Jun 2025

Our latest paper is out! 🚀

Laura Kopf @lkopf_ml

19 Jun 2025

🔍 When do neurons encode multiple concepts? We introduce PRISM, a framework for extracting multi-concept feature descriptions to better understand polysemanticity. 📄 Capturing Polysemanticity with PRISM: A Multi-Concept Feature Description Framework arxiv.org/abs/2506.15538 🧵

Understandable Machine Intelligence Lab retweeted

Anna Hedström @anna_hedstroem

27 Feb 2025

If you're at #AAAI2025, don't miss our poster today (alignment track)! Paper 📘: arxiv.org/pdf/2502.15403 Code 👩‍💻: github.com/annahedstroem/eva… Teamwork with @eirasf and @Marina_MCV

Carlos Eiras @eirasf

27 Feb 2025

At 12:30 I'll be happy to take questions about our poster presentation at #AAAI2025. Is your explanation for a model's prediction better than the alternatives? "Evaluate with the Inverse: Efficient Approximation of Latent Explanation Quality Distribution" introduces QGE... 1/4
