Thrilled to share our new paper: "LLM Explainability with Counterfactual Chains and Causal Graphs"! 🚀
We introduce a fully automated, model-driven method to extract global, concept-level causal graphs of an LLM's internal reasoning.
📄 arxiv.org/abs/2606.05972 🧵👇 [1/8]
Excited to share our new arXiv preprint:
"Retrieval from Within: An Intrinsic Capability of Attention-Based Models"
We introduce INTRA, a framework where attention-based models retrieve from their own internal representations.
arxiv.org/abs/2605.05806
1/5 🧵
On QA benchmarks, INTRA improves both complete-evidence recall and end-to-end answer quality, especially in multi-hop settings where assembling the right evidence matters.
It also amortizes context encoding by reusing precomputed encoder states.
4/5 🧵
Huge thanks to my wonderful co-authors at NVIDIA: Yochai Blau, Edan Kinderman, Ron Banner, Daniel Soudry, and Boris Ginsburg.
Looking forward to feedback, questions, and discussions!
Paper: arxiv.org/abs/2605.05806
5/5 🧵
"Data Movement Is All You Need: A Case Study on Optimizing Transformers" by Andrei Ivanov (ETH Zurich); Nikoli Dryden (ETH Zurich)*; Tal Ben-Nun (ETH Zurich); Shigang Li (ETH Zurich); Torsten Hoefler (ETH Zurich)
Two papers accepted to @NipsConference!
- "Norm matters: efficient and accurate normalization schemes in deep networks" as a spotlight
- "Scalable Methods for 8-bit Training of Neural Networks" as a poster