I profess CS-ly at @UWaterloo. Previously, I monkeyed code for @Twitter, slides for @Cloudera, and scienced for @yupp_ai.

Joined February 2010
I keep confusing myself... grep is all you need? ❌ bm25 is all you need? ❌ wait, you need both! ✅
Giving search agents access to bash tools for interacting with documents is powerful, but not scalable. The retriever’s new role is to retrieve a bounded interaction space, making bash-based agentic search practical at scale. 📝 arxiv.org/abs/2606.06880 👨🏽‍💻 github.com/texttron/RISE
🎉 Happy to see @mattjustram joining @mixedbreadai and @rpradeep42 joining @DbrxMosaicAI - grep is all you need? ❌ bm25 is all you need? ❌ both wrong - talent is all you need ✅
Jimmy Lin retweeted
personal update: happy to share that i've joined @mixedbreadai 🍞 as a member to work on technical stuffs & agents & search! too happy to meet so many great bakers and working on exciting projects, so i missed the perfect timing (June 1st, Mon) to say this :P
Jimmy Lin retweeted
Life Update: Today I’ll be joining @Databricks / @DbrxMosaicAI. Excited to push the frontier of information-seeking agents in the wild!
Jimmy Lin retweeted
I built a 3D character you can control with language instead of predefined buttons. How? I compiled a neural program that turns language instructions into movements, using ProgramAsWeights. Just type: "act excited, wave, dance, then sit proudly" Try it: programasweights.com/avatar
Jimmy Lin retweeted
Search is no longer just a ranked list... LLM agents can now query, inspect, reformulate, and decide when to stop 🤖 At TREC RAG 2026, we’re introducing new metrics for agentic search: evaluating not only final results, but the search process itself 📊 Stay tuned!
Jimmy Lin retweeted
🤨 Is your agent confused about what to build because it says there aren’t any guidelines? Now your agent has no more excuses - track guidelines for TREC RAG 2026 are out 🔥 And yes, they’re available via SKILLz 😎 Tell your agents to showcase your agentic search system!
Jimmy Lin retweeted
Does retrieval help RAG or did the LLM already memorize the answer? 🤔 Too often, the overlap between RAG corpora and what LLMs “know” is unclear. Better RAG evaluation needs tighter alignment between NLP and IR 📚 That's why for RAG 2026 we are using @nvidia's ClimbMix corpus.
I think @xueguang_ma is being too modest, so I'll provide context: he along with @rpradeep42 and a UWaterloo ugrad (Kai Sun) popularized hybrid search in its current form. So, if you're using hybrid search today, thank them. 🙏 Yes, this is clickbait-y, so I'll support my claims 🧵
This plot reminds me of my first IR work reproducing DPR in Pyserini, where we found BM25 is amazingly helpful when combined with a dense retriever in a hybrid. BM25 is never just a simple baseline -- used the right way, it can easily outperform many fancy methods. BM25 was the most robust method shown in BEIR, the most effective and efficient method for long-context search shown in LongEmbed, and now @mattjustram and @xuzihuan4 show that BM25 can push search agents onto the best efficiency frontier. p.s. Pyserini and pi-serini are two different repos.
But that's not what we found: even with DPR, a dense-sparse hybrid with BM25 is significantly better than DPR alone. arxiv.org/abs/2104.05740
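The thread does not spell out how the sparse and dense scores are combined, so here is only a minimal sketch of one common approach: min-max normalize each retriever's scores, then take a weighted sum. The function names, the `alpha` weight, and the toy scores are all hypothetical and are not taken from the paper or from Pyserini.

```python
from typing import Dict, List, Tuple


def min_max(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale scores to [0, 1]; if all scores are equal, map them to 0.0."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 0.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def hybrid_fuse(
    sparse: Dict[str, float],
    dense: Dict[str, float],
    alpha: float = 0.5,  # assumed weight on the sparse (BM25-style) scores
) -> List[Tuple[str, float]]:
    """Fuse two score maps by weighted sum of normalized scores.

    A document missing from one retriever's list contributes 0.0 for it.
    """
    sparse_n, dense_n = min_max(sparse), min_max(dense)
    docs = set(sparse_n) | set(dense_n)
    fused = {
        d: alpha * sparse_n.get(d, 0.0) + (1.0 - alpha) * dense_n.get(d, 0.0)
        for d in docs
    }
    # Highest fused score first; ties broken by doc id for determinism.
    return sorted(fused.items(), key=lambda kv: (-kv[1], kv[0]))


# Toy scores, made up for illustration only.
bm25_scores = {"d1": 10.0, "d2": 5.0, "d3": 0.0}
dense_scores = {"d2": 0.9, "d3": 0.8, "d4": 0.1}
ranking = hybrid_fuse(bm25_scores, dense_scores)
```

In this toy case `d2` ranks first because both retrievers score it reasonably well, which is the intuition behind a hybrid: each side can rescue documents the other underrates.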
Thus, our conclusions: This I believe is the first demonstration of the need for hybrid search. Hence the claim that hybrid search is a @UWaterloo innovation. You're welcome! The broader lesson is that old baselines are still surprisingly important. Let's not forget them.
Jimmy Lin retweeted
someone already wrote a love letter to pi, by @badlogicgames. so we wrote a love paper to pi :) with my teammates @xuzihuan4 and @lintool. a few days ago, i promised i’d share some fun plots once Pi-Serini joined the BrowseComp-Plus deep research agent party. now, it’s about time. here weeeee goooooo. bear with the sloppy images first. the serious one is at the end. the question was simple: how far can we push deep research with BM25 pi? turns out: weirdly far.
Jimmy Lin retweeted
TREC RAG is returning for 2026! 🎉 This year’s iteration is special because agents 🤖 can join the fun… but what might agent-first community evaluation look like? 🧵👇
Jimmy Lin retweeted
Does a lexical retriever suffice for agentic search when agents can keep refining their queries? As LLMs become more capable in agentic loops, agents can continuously refine their actions based on environmental feedback. We couldn’t help but ask the question above.
What I'm cooking up... 👨‍🍳
Jimmy Lin retweeted
🔥 Introducing Direct Corpus Interaction (DCI)! The best retriever for agentic search is no retriever. 🚀
We replaced the entire agentic search pipeline (embedding model, vector index, top-k retrieval) with only `grep` and `bash`. 🔧
📄 Paper: huggingface.co/papers/2605.0…
DCI unlocks the full agentic potential of any Claude Sonnet 4.6: 69.0% → 80.0% on BrowseComp-Plus (+11.0, −$424).
💡 The Magic: The agent searches the raw corpus directly (`grep`, `find`, `bash`, shell pipelines), exactly like a coding agent navigating a codebase. No preprocess. No embedding model. No vector index. No offline indexing.
📊 The Results: DCI outperforms top baselines across 13 benchmarks, with average gains of:
🔍 Agentic Search: 11.0%
🧠 Multi-hop QA: 30.7%
📈 IR Ranking: 21.5%
💡 Insights: Beyond accuracy, we conduct a series of controlled ablation studies to pinpoint the sources of DCI’s gains. Specifically, we examine trajectory-level search, evidence utilization corpus, context management, and tool usage (RQ2-RQ6).
Try it yourself!
🛠️ Code: github.com/DCI-Agent/DCI-Age…
🤖 Demo: huggingface.co/spaces/DCI-Ag…
🔎 Eval logs: huggingface.co/datasets/DCI-…
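To make the "search the raw corpus directly" idea concrete, here is a rough Python stand-in for what a recursive, line-numbered `grep` call gives an agent. This is not DCI's code: the function name `grep_corpus`, the `*.txt` file filter, the case-insensitive matching, and the `max_hits` cap are all assumptions made for illustration.

```python
import re
from pathlib import Path
from typing import Iterator, Tuple


def grep_corpus(
    root: str, pattern: str, max_hits: int = 20
) -> Iterator[Tuple[str, int, str]]:
    """Yield (relative path, 1-based line number, line) for regex matches.

    Scans every *.txt file under `root` at query time: no index and no
    embeddings. `max_hits` (an assumed knob) caps how much text is
    returned to the caller.
    """
    regex = re.compile(pattern, re.IGNORECASE)
    hits = 0
    # Sorted traversal keeps the output order deterministic.
    for path in sorted(Path(root).rglob("*.txt")):
        with path.open(encoding="utf-8", errors="replace") as fh:
            for lineno, line in enumerate(fh, start=1):
                if regex.search(line):
                    yield str(path.relative_to(root)), lineno, line.rstrip("\n")
                    hits += 1
                    if hits >= max_hits:
                        return
```

An agent loop would call something like `list(grep_corpus("corpus/", r"hybrid search"))`, read the hits, and reformulate the pattern. The trade-off is that every query rescans the files, which connects to the scalability concern raised in the RISE post earlier in the feed.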