Joined July 2014
Elad Hoffer retweeted
Thrilled to share our new paper: "LLM Explainability with Counterfactual Chains and Causal Graphs"! 🚀 We introduce a fully automated, model-driven method to extract global, concept-level causal graphs of an LLM's internal reasoning. 📄 arxiv.org/abs/2606.05972 🧵👇 [1/8]
Excited to share our new arXiv preprint: "Retrieval from Within: An Intrinsic Capability of Attention-Based Models". We introduce INTRA, a framework where attention-based models retrieve from their own internal representations. arxiv.org/abs/2605.05806 1/5 🧵
On QA benchmarks, INTRA improves both complete-evidence recall and end-to-end answer quality, especially in multi-hop settings where assembling the right evidence matters. It also amortizes context encoding by reusing precomputed encoder states. 4/5 🧵
Elad Hoffer retweeted
"Data Movement Is All You Need: A Case Study on Optimizing Transformers" by Andrei Ivanov (ETH Zurich); Nikoli Dryden (ETH Zurich)*; Tal Ben-Nun (ETH Zurich); Shigang Li (ETH Zurich); Torsten Hoefler (ETH Zurich)
Two papers accepted to @NipsConference! "Norm matters: efficient and accurate normalization schemes in deep networks" as a spotlight, and "Scalable Methods for 8-bit Training of Neural Networks" as a poster.
Our paper "Train longer, generalize better" was accepted for oral presentation at #nips2017! arXiv preprint: arxiv.org/abs/1705.08741

New paper: "Train longer, generalize better" arxiv.org/abs/1705.08741 With @PyTorch code available at: github.com/eladhoffer/bigBat…

Elad Hoffer retweeted
18 Jan 2017
GPU Tensors, Dynamic Neural Networks and deep Python integration. Hello world! pytorch.org