Negar Arabzadeh

Tweets

Pinned Tweet

Negar Arabzadeh

@NegarEmpr

May 12

1/ Thrilled to introduce T³: a corpus for RAG over reasoning tasks, built from thinking traces. We show that, surprisingly, RAG can improve reasoning, given the right corpus. RAG with Transformed Thinking Traces (T³) yields gains of up to 43.9% on AIME 2025-2026. 🔗 arxiv.org/abs/2605.03344 🧵

211

472,669

Negar Arabzadeh retweeted

Liana @lianapatel_

Jun 10

🚀 Beyond excited to share we're releasing LOTUSPlan, a new API & optimizer for higher-performance LLM-powered data processing, from our team at Berkeley & Stanford. LOTUS now lets you write your LLM-based queries and optimize them for up to 2.4× lower cost and 4.6× higher accuracy on tasks like agent trace analysis, LLM-judge evals, RAG, document extraction, and deep research. ✨ Check out our new blog: liana313.github.io/blog/lotu… 🧵

16,985

Negar Arabzadeh retweeted

Yichuan Wang

@YichuanM

Jun 10

The web was never meant to be flattened into text. Yet most web RAG systems start by parsing HTML, a complex and lossy process.
🔥 Introducing PixelRAG: the first RAG system that retrieves and reads 30M web pages as pixels. Instead of extracting text, PixelRAG retrieves screenshots and lets a VLM read them directly.
PixelRAG not only preserves visual information, but also outperforms text-based RAG on text-only QA benchmarks by 18.1%. Why?
(1) HTML-to-text conversion often discards layout, structure, tables, and other useful signals.
(2) We continued pretraining a VLM on web page screenshots and turned it into a surprisingly strong visual retriever.
(3) Recent VLMs are remarkably good at understanding web pages, often with better accuracy and token efficiency than text-only pipelines.
Takeaway: HTML parsing may be one of the biggest self-inflicted bottlenecks in web RAG. Demo below 👇
Code: github.com/StarTrail-org/Pix…
Paper: github.com/StarTrail-org/Pix…
Playground: pixelrag.ai/

116

693

72,003

Negar Arabzadeh retweeted

TREC RAG @ 2026 @TREC_RAG

Jun 4

Search is becoming increasingly agentic: systems plan, search, synthesize, cite, and revise. But how should we study and evaluate these systems? 🤔 In TREC RAG 2026, we want to build a reusable collection for this new reality. We’ve aligned on 4 core directions 🧵👇

1,861

Negar Arabzadeh retweeted

Sajad Ebrahimi @sadjadeb

Jun 3

Excited to announce that our paper “From Noise to Order: Learning to Rank via Denoising Diffusion” has been accepted to #ICTIR2026! 🎉📚 📄Paper: arxiv.org/pdf/2602.11453 💻 Code: github.com/sadjadeb/Diffusio…

509

Negar Arabzadeh

@NegarEmpr

May 27

Grateful that my PhD thesis was recognized as one of the top dissertations in the 2026 Faculty of Mathematics Doctoral Prize at @UWaterloo! 🎉 And it is always especially nice to hear kind words from your PhD supervisor @claclarke. I guess that feeling never really goes away, even after you graduate. 😊 uwaterloo.ca/computer-scienc…

Computer science students win prestigious Faculty of Mathematics Doctoral Prizes | Cheriton School...

From advancing HCI to cloud computing research, our alumni have demonstrated exceptional research and academic achievements

uwaterloo.ca

6,860

Negar Arabzadeh

@NegarEmpr

May 26

Happy to share that our @icmlconf paper "Measuring Agents in Production" received an Oral Presentation spot! 🌟 arxiv.org/abs/2512.04123 See you all in Seoul! 🇰🇷

Measuring Agents in Production

LLM-based agents already operate in production across many industries, yet we lack an understanding of what technical methods make deployments successful. We present the first systematic study of...

arxiv.org

Melissa Pan

@melissapan

Apr 30

Excited to share: MAP has been accepted as an 🌟 ICML Spotlight 🌟 We hope MAP can provide data-driven insights that help the community work on under-explored research directions around agent systems! Huge thanks & congrats to my amazing co-authors. See you all in Seoul! 🫡

2,044

Negar Arabzadeh retweeted

Melissa Pan

@melissapan

May 24

Excited to share that MAP has been selected for an ✨ICML Oral✨. We look forward to sharing the paper's insights with the community. Many thanks to everyone who participated in our study ❤️ MAP wouldn’t be possible without your contribution to open science.

Melissa Pan

@melissapan

Apr 30

170

32,460

Negar Arabzadeh

@NegarEmpr

May 12

5/ Interestingly, RAG over T³ can be cheaper than No RAG. Retrieved reasoning shifts work from expensive output tokens to cheap input tokens — the model thinks less and reads more. Think less. Retrieve thinking. 🧠

385
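The cost argument in tweet 5/ can be illustrated with simple arithmetic. The sketch below is not from the paper: the per-token prices and token counts are hypothetical placeholders, chosen only to show how trading output tokens for input tokens can lower total cost when output tokens are priced higher.

```python
def request_cost(input_tokens, output_tokens, price_in, price_out):
    """Return the dollar cost of one request, given prices per million tokens."""
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000


# Hypothetical prices (dollars per million tokens); real pricing varies by model.
PRICE_IN = 1.0
PRICE_OUT = 8.0

# Hypothetical token counts for one reasoning problem.
# No RAG: short prompt, long generated reasoning.
no_rag = request_cost(500, 12_000, PRICE_IN, PRICE_OUT)
# With RAG: retrieved traces enlarge the prompt, but the model generates less.
with_rag = request_cost(6_500, 8_000, PRICE_IN, PRICE_OUT)

print(f"No RAG:   ${no_rag:.4f}")
print(f"With RAG: ${with_rag:.4f}")
```

Under these assumed numbers the RAG request is cheaper even though it processes more tokens in total, because the extra tokens are inputs. Whether this holds in practice depends on the actual price ratio and on how much the retrieved traces shorten generation.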

Negar Arabzadeh

@NegarEmpr

May 12

6/ Code, corpora, prompts: all open. 🔗 github.com/Narabzad/t3 Transformed corpora available on Hugging Face. Thanks to my amazing coauthors @wenjie_ma, @sewon__min, and @matei_zaharia 🙏

396

Negar Arabzadeh

@NegarEmpr

May 10

I’m glad to be part of this initiative!

Tetsuya Sakai (酒井哲也) @tetsuyasakai

May 8

In 2007, Mark @IR_oldie and I launched EVIA, a workshop on information access evaluation methods co-located with #ntcir6. Now in 2026, Negar @NegarEmpr and I are serving as PC co-chairs of #evia2026, which will take place on Day 3 (Dec 10) of #ntcir19. CFP in preparation.

382

Negar Arabzadeh retweeted

Tetsuya Sakai (酒井哲也) @tetsuyasakai

May 10

#evia2026 #ntcir19 #ntcir research.nii.ac.jp/ntcir/ntc…

Tetsuya Sakai (酒井哲也) @tetsuyasakai

May 9

Call for Papers: The 12th International Workshop on Evaluating Information Access (EVIA 2026)
Submission deadline: September 1, 2026 (AoE)
Workshop date: December 10, 2026 (Japan Time)
Venue: National Institute of Informatics, Tokyo, Japan
#ntcir #ntcir19 #evia2026

465

Negar Arabzadeh retweeted

Diane @dianetc_

May 6

We set out to build a better retriever, so we looked for the hardest IR benchmarks. For each, we asked how much headroom remained by running oracle reranking with a frontier LLM. Most had little room left! So we built OBLIQ-Bench to study much harder search queries than before.

281

147,819

Negar Arabzadeh retweeted

Parth Asawa

@pgasawa

May 4

Today, we’re releasing Continual Learning Bench 1.0: the first realistic benchmark for measuring how AI systems can improve in online settings. Benchmarks today assume models are stateless. Each example is independent, and once a system finishes a task, it moves on as if nothing happened. But deployed AI systems should learn from experience. We tested 10 frontier systems against novel, expert-validated tasks and found there’s still plenty of headroom for learning. (1/n)

168

1,186

833,677

Negar Arabzadeh retweeted

Ion Stoica

@istoica05

Apr 30

Congratulations to the team on the MAP paper being accepted as an ICML spotlight! A key takeaway from this work is that reliability remains one of the central challenges for production agent systems. Simple yet effective methods continue to dominate in these agent systems for…

Melissa Pan

@melissapan

Apr 30

8,928

Negar Arabzadeh

@NegarEmpr

Apr 30

So excited to share that my first ever @icmlconf paper has been accepted as a Spotlight! ✨ Grateful, happy, and incredibly excited about this work! See you all in Seoul!🇰🇷

Melissa Pan

@melissapan

Apr 30

1,897