Jimmy Lin

Tweets

Jimmy Lin

@lintool

Jun 8

I keep confusing myself... grep is all you need? ❌ bm25 is all you need? ❌ wait, you need both! ✅

Shengyao Zhuang @ShengyaoZhuang

Jun 8

Giving search agents access to bash tools for interacting with documents is powerful, but not scalable. The retriever’s new role is to retrieve a bounded interaction space, making bash-based agentic search practical at scale. 📝 arxiv.org/abs/2606.06880 👨🏽‍💻 github.com/texttron/RISE

21,107

Jimmy Lin

@lintool

Jun 8

🎉 Happy to see @mattjustram joining @mixedbreadai and @rpradeep42 joining @DbrxMosaicAI - grep is all you need? ❌ bm25 is all you need? ❌ both wrong - talent is all you need ✅

4,614

Jimmy Lin retweeted

Jheng-Hong Yang

@mattjustram

Jun 8

personal update: happy to share that i've joined @mixedbreadai 🍞 as a member to work on technical stuffs & agents & search! too happy to meet so many great bakers and working on exciting projects, so i missed the perfect timing (June 1st, Mon) to say this :P

5,762

Jimmy Lin retweeted

Ronak Pradeep @rpradeep42

Jun 8

Life Update: Today I’ll be joining @Databricks / @DbrxMosaicAI. Excited to push the frontier of information-seeking agents in the wild!

123

14,650

Jimmy Lin retweeted

Yuntian Deng

@yuntiandeng

Jun 3

I built a 3D character you can control with language instead of predefined buttons. How? I compiled a neural program that turns language instructions into movements, using ProgramAsWeights. Just type: "act excited, wave, dance, then sit proudly" Try it: programasweights.com/avatar

1,213

Jimmy Lin retweeted

TREC RAG @ 2026 @TREC_RAG

Jun 1

Search is no longer just a ranked list...LLM agents can now query, inspect, reformulate, and decide when to stop 🤖 At TREC RAG 2026, we’re introducing new metrics for agentic search: evaluating not only final results, but the search process itself 📊 Stay tuned!

553

Jimmy Lin retweeted

TREC RAG @ 2026 @TREC_RAG

May 29

🤨 Is your agent confused about what to build because it says there aren’t any guidelines? Now your agent has no more excuses - track guidelines for TREC RAG 2026 are out 🔥 And yes, they’re available via SKILLz 😎 Tell your agents to showcase your agentic search system!

2,428

Jimmy Lin retweeted

TREC RAG @ 2026 @TREC_RAG

May 15

Does retrieval help RAG or did the LLM already memorize the answer? 🤔 Too often, the overlap between RAG corpora and what LLMs “know” is unclear Better RAG evaluation needs tighter alignment between NLP and IR 📚 That's why for RAG 2026 we are using @nvidia's ClimbMix corpus

6,755

Jimmy Lin

@lintool

May 15

Since we're counting model parameters, let me introduce you to a two-parameter model for agentic search that's awesome: It's called BM25. I haven't tried it yet, but I think fp4 will work fine. arxiv.org/abs/2605.10848

Rethinking Agentic Search with Pi-Serini: Is Lexical Retrieval Sufficient?

Does a lexical retriever suffice as large language models (LLMs) become more capable in an agentic loop? This question naturally arises when building deep research systems. We revisit it by...

arxiv.org

6,637

Jimmy Lin

@lintool

May 15

But I think we can do better... what about zero parameters? Let me introduce you to something else that's awesome: It's called grep. arxiv.org/abs/2605.05242

Beyond Semantic Similarity: Rethinking Retrieval for Agentic...

Modern retrieval systems, whether lexical or semantic, expose a corpus through a fixed similarity interface that compresses access into a single top-k retrieval step before reasoning. This...

arxiv.org

746

Jimmy Lin

@lintool

May 14

I think @xueguang_ma is being too modest, so I'll provide context: he along with @rpradeep42 and a UWaterloo ugrad (Kai Sun) popularized hybrid search in its current form. So, if you're using hybrid search today, thank them. 🙏 Yes, this is clickbait-y, so I'll support my claims 🧵

Xueguang Ma

@xueguang_ma

May 13

This plot reminds me of my first IR work reproducing DPR in Pyserini, where we found BM25 is amazingly helpful when hybrid with a dense retriever. BM25 is never just a simple baseline -- used the right way, it can easily outperform many fancy methods. BM25 was the most robust method shown in BEIR, the most effective and efficient method for long-context search shown in LongEmbed, and now @mattjustram and @xuzihuan4 show that BM25 can push the search agents into the best efficiency frontier. p.s. Pyserini and pi-serini are two different repos.

5,624

Jimmy Lin

@lintool

May 14

But that's not what we found: even with DPR, a dense-sparse hybrid with BM25 is significantly better than DPR alone. arxiv.org/abs/2104.05740

818

Jimmy Lin

@lintool

May 14

Thus, our conclusions: This I believe is the first demonstration of the need for hybrid search. Hence the claim that hybrid search is a @UWaterloo innovation. You're welcome! The broader lesson is that old baselines are still surprisingly important. Let's not forget them.

4,468

Jimmy Lin retweeted

Jheng-Hong Yang

@mattjustram

May 14

x.com/i/article/205479183898…

2,224

Jimmy Lin retweeted

Jheng-Hong Yang

@mattjustram

May 12

someone already wrote a love letter to pi, by @badlogicgames. so we wrote a love paper to pi :) with my teammates @xuzihuan4 and @lintool. a few days ago, i promised i’d share some fun plots once Pi-Serini joined the BrowseComp-Plus deep research agent party. now, it’s about time. here weeeee goooooo. bear with the sloppy images first. the serious one is at the end. the question was simple: how far can we push deep research with BM25 pi? turns out: weirdly far.

17,106

Jimmy Lin retweeted

TREC RAG @ 2026 @TREC_RAG

May 12

TREC RAG is returning for 2026! 🎉 This year’s iteration is special because agents 🤖 can join the fun… but what might agent-first community evaluation look like? 🧵👇

895

Jimmy Lin retweeted

Tz-Huan Hsu @xuzihuan4

May 12

Does a lexical retriever suffice for agentic search when agents can keep refining their queries? As LLMs become more capable in agentic loops, agents can continuously refine their actions based on environmental feedback. We couldn’t help but ask the question above.

1,779

Jimmy Lin

@lintool

May 11

What I'm cooking up... 👨‍🍳

5,262

Jimmy Lin retweeted

Zhuofeng Li

@zhuofengli96475

May 8

🔥 Introducing Direct Corpus Interaction (DCI)! The best retriever for agentic search is no retriever.

🚀 We replaced the entire agentic search pipeline — embedding model, vector index, top-k retrieval — with only `grep` and `bash`. 🔧

📄 Paper: huggingface.co/papers/2605.0…

DCI unlocks the full agentic potential of any Claude Sonnet 4.6: 69.0% → 80.0% on BrowseComp-Plus ( 11.0, −$424).

💡The Magic: The agent searches the raw corpus directly — `grep`, `find`, `bash`, shell pipelines — exactly like a coding agent navigating a codebase. No preprocess. No embedding model. No vector index. No offline indexing.

📊The Results: DCI outperforms top baselines across 13 benchmarks, with average gains of:
🔍 Agentic Search: 11.0%
🧠 Multi-hop QA: 30.7%
📈 IR Ranking: 21.5%

💡 Insights: Beyond accuracy, we conduct a series of controlled ablation studies to pinpoint the sources of DCI’s gains. Specifically, we examine trajectory-level search, evidence utilization corpus, context management, and tool usage (RQ2-RQ6).

Try it yourself!
🛠️Code: github.com/DCI-Agent/DCI-Age…
🤖 Demo: huggingface.co/spaces/DCI-Ag…
🔎 Eval logs: huggingface.co/datasets/DCI-…

262

75,646