Nirit Nussbaum

Tweets

Elad Hoffer retweeted

Jun 7

Thrilled to share our new paper: "LLM Explainability with Counterfactual Chains and Causal Graphs"! 🚀 We introduce a fully automated, model-driven method to extract global, concept-level causal graphs of an LLM's internal reasoning. 📄 arxiv.org/abs/2606.05972 🧵👇 [1/8]

Elad Hoffer @eladhoffer

May 8

Excited to share our new arXiv preprint: "Retrieval from Within: An Intrinsic Capability of Attention-Based Models" We introduce INTRA, a framework where attention-based models retrieve from their own internal representations. arxiv.org/abs/2605.05806 1/5 🧵

Retrieval from Within: An Intrinsic Capability of Attention-Based Models

Retrieval-augmented generation (RAG) typically treats retrieval and generation as separate systems. We ask whether an attention-based encoder-decoder can instead retrieve directly from its own...

arxiv.org

Elad Hoffer @eladhoffer

May 8

On QA benchmarks, INTRA improves both complete-evidence recall and end-to-end answer quality, especially on multi-hop settings where assembling the right evidence matters. It also amortizes context encoding by reusing precomputed encoder states. 4/5 🧵

Elad Hoffer @eladhoffer

May 8

Huge thanks to my wonderful co-authors at Nvidia: Yochai Blau, Edan Kinderman, Ron Banner, Daniel Soudry, and Boris Ginsburg. Looking forward to feedback, questions, and discussions! Paper: arxiv.org/abs/2605.05806 5/5 🧵

Elad Hoffer retweeted

Alex Dimakis

@AlexGDimakis

4 Apr 2021

Data Movement Is All You Need: A Case Study on Optimizing Transformers by Andrei Ivanov (ETH Zurich); Nikoli Dryden (ETH Zurich)*; Tal Ben-Nun (ETH Zurich); Shigang Li (ETH Zurich); Torsten Hoefler (ETH Zürich)

Elad Hoffer @eladhoffer

27 Aug 2019

Our new work on training convnets using random image sizes: arxiv.org/abs/1908.08986

Mix & Match: training convnets with mixed image sizes for...

Convolutional neural networks (CNNs) are commonly trained using a fixed spatial image size predetermined for a given model. Although trained on images of a specific size, it is well established...

arxiv.org

Elad Hoffer @eladhoffer

17 Jun 2019

techcrunch.com/2019/06/17/ha…

Habana Labs launches its Gaudi AI training processor | TechCrunch

Habana Labs, a Tel Aviv-based AI processor startup, today announced its Gaudi AI training processor, which promises to easily beat GPU-based systems by a

techcrunch.com

Elad Hoffer retweeted

Adam Fisher

@AdamRFisher

17 Sep 2018

Habana Labs bursts onto the scene with a dedicated AI processor that smokes the competition. eetimes.com/document.asp?doc…

Startup's AI Chip Beats GPU - EE Times

Startup Habana is demonstrating a deep-learning accelerator targeting data centers that outperforms Nvidia’s Volta V100 in inference jobs.

eetimes.com

Elad Hoffer @eladhoffer

5 Sep 2018

Two papers accepted to @NipsConference!
- "Norm matters: efficient and accurate normalization schemes in deep networks" as spotlight
- "Scalable Methods for 8-bit Training of Neural Networks" as poster

Elad Hoffer @eladhoffer

5 Sep 2017

Our paper "Train longer, generalize better" was accepted for oral presentation at #nips2017! arXiv preprint: arxiv.org/abs/1705.08741

Elad Hoffer @eladhoffer

25 May 2017

New paper: "Train longer, generalize better" arxiv.org/abs/1705.08741 With @PyTorch code available at: github.com/eladhoffer/bigBat…

Elad Hoffer retweeted

PyTorch

@PyTorch

18 Jan 2017

GPU Tensors, Dynamic Neural Networks and deep Python integration. Hello world! pytorch.org
