Joined June 2023
Data Mining Group@UIUC retweeted
Introducing Harness-1, a 20B search agent trained with a state-externalizing harness. > frontier-level long-horizon search, rivaling Opus-4.6 and outperforming GPT-5.4 > Context-1-level cost and latency > externalizes candidates, evidence, verification, and search history > open-source
Data Mining Group@UIUC retweeted
CoRank accepted at KDD 2026 🎉🎉 Really grateful to all the collaborators from @dmguiuc: @PatrickXu565299, Bowen @BowenJin13, Prof. Seongku Kang, and Prof. Jiawei Han ❤️❤️
🤔 LLMs not finding the right papers for RAG or anything else? 🚀 Simple preprocessing can help a lot! Check out our new work on LLM reranking for scientific retrieval. 🔗 arxiv.org/abs/2505.13757
Data Mining Group@UIUC retweeted
🚨 New Paper Alert! 🚨 Excited to share our latest work: Retrieval is Cheap, Show Me the Code: Executable Multi-Hop Reasoning for Retrieval-Augmented Generation 📚🤖
Data Mining Group@UIUC retweeted
See you in San Diego for @aclmeeting #ACL2026NLP! 🥂 Thrilled to share that our paper has been accepted. I'm looking forward to sharing more about our research soon!
Maintaining agent performance over long horizons remains challenging, largely because memory systems fail to associate latent context with intent. 🎉 Introducing our paper: Grounding Agent Memory in Contextual Intent. STITCH achieves 35.6% gains on our new CAME-Bench.
Data Mining Group@UIUC retweeted
📰 New preprint: How can we build a task-agnostic plug-and-play memory module for LLM agents that supports multiple memory types? We present PlugMem 🔌🧠, a plugin memory module that works across tasks by turning heterogeneous experience into knowledge. Evaluated unchanged on long-term dialogue 🗣️, multi-hop QA 🕵️, and web agents 🕸️🤖, PlugMem improves performance while using far fewer memory tokens. 📜 Paper: empathyang.github.io/files/P… 🔨 Code: github.com/TIMAN-group/PlugM…
Data Mining Group@UIUC retweeted
Maintaining agent performance over long horizons remains challenging, largely because memory systems fail to associate latent context with intent. 🎉 Introducing our paper: Grounding Agent Memory in Contextual Intent. STITCH achieves 35.6% gains on our new CAME-Bench.
Data Mining Group@UIUC retweeted
1 Mar 2025
📣 Excited to share #DeepRetrieval - our novel approach using reinforcement learning for query augmentation in information retrieval! 🚀 Our preliminary results (obtained on Feb 16) CRUSH previous SOTA: 60.8% vs 24.7% recall on PubMed search engine; 70.8% vs 32.1% recall on ClinicalTrial search engine, with a SMALLER model (3B vs 7B). 💡 NO supervision data: [no 💰] vs [💰💰💰💰...] on creating augmented queries from ChatGPT/Claude! 💻 Github: github.com/pat-jj/DeepRetrie… Preliminary Technical Report: pat-jj.github.io/assets/pdf/… 🔬 Currently testing on general IR datasets and with dense retrieval methods. Full paper with more results will be released soon. Just created this X account to share this breakthrough - follow for more NLP IR research! #NLP #IR #MachineLearning #LLM #AAAI2025
Data Mining Group@UIUC retweeted
28 Feb 2025
🚀 Introducing Search-R1, the first reproduction of Deepseek-R1 (zero) for training reasoning and search-augmented LLM agents with reinforcement learning! This is a step towards training an open-source OpenAI "Deep research" via RL. Our 3B base LLMs (including not just Qwen 2.5 but also Llama 3.2) learn to reason and call search engines all on their own! Everything will be fully open source. Stay tuned! Code: github.com/PeterGriffinJin/S… Experimental logs: wandb.ai/peterjin/Search-R1-… #R1 #deepresearch #deepseek
Data Mining Group@UIUC retweeted
20 Sep 2024
Super excited that this work has been accepted to the #EMNLP2024 main conference! See you in Miami! 🎉
18 Jun 2024
📢 We have finally turned our "awesome" GitHub repository (290 stars already) into a survey of Scientific LLMs and their applications in Scientific Discovery! #LLM #AI4Science Paper: arxiv.org/abs/2406.10833 GitHub: github.com/yuzhimanhua/Aweso…
Data Mining Group@UIUC retweeted
11 Oct 2024
🚀 Excited to share that "InstructG2I: Synthesizing Images from Multimodal Attributed Graphs" has been accepted by @NeurIPSConf 2024! instructg2i.github.io/ We propose a graph-conditioned stable diffusion model for image generation. GO and PLAY with it! #graph #diffusion #neurips
Data Mining Group@UIUC retweeted
14 Oct 2024
🎓 Successfully defended my Ph.D. thesis! 🎉 My deepest gratitude goes to my thesis committee members: Prof. Jiawei Han @dmguiuc, Prof. Tarek Abdelzaher, Prof. Hanghang Tong, Prof. Wei Wang @WeiWang1973, and Dr. Iris Shen!
Data Mining Group@UIUC retweeted
Happy to announce that TreeInstruct got accepted to EMNLP'24! Excited to discuss the work alongside @wonderingishika as part of a joint collaboration between @dmguiuc and @convai_uiuc. See you all in Miami! #EMNLP2024
Can LLMs make us critical thinkers? TreeInstruct reorients LLMs to be instructors that guide students socratically to solve problems, instead of assistants that provide direct answers. Check out arxiv.org/abs/2406.11709 (w/ @wonderingishika) to learn more!
Our group has 6 papers accepted to #EMNLP2024, led by Linyi Ding, SeongKu Kang, @yuz9yuz, @SizheZhou189667, @priyanka_karg, and @Siru_Ouyang, respectively. See you in Miami! @emnlpmeeting
Our alumnus, Yu Meng, won the #KDD2024 Outstanding Dissertation Award!!! Congratulations on this well-earned distinction, Yu! 🎉 @yumeng0818 @kdd_news
🚀 Join our tutorial at #KDD2024, Automated Mining of Structured Knowledge from Text with Large Language Models! 👤 Presented by @YunyiZhang10, @Siru_Ouyang, Professor Jiawei Han. 📅 Aug 25, 10 AM - 1 PM CEST. Room 129-130
Data Mining Group@UIUC retweeted
18 Jun 2024
📢 We have finally turned our "awesome" GitHub repository (290 stars already) into a survey of Scientific LLMs and their applications in Scientific Discovery! #LLM #AI4Science Paper: arxiv.org/abs/2406.10833 GitHub: github.com/yuzhimanhua/Aweso…
Our group has 2 papers on fine-grained entity typing accepted to #KDD2024, led by @tanaykomarlu and @Siru_Ouyang, respectively. See you in Barcelona! @kdd_news
Data Mining Group@UIUC retweeted
3 May 2024
🚀 Excited to share that "Language Models as Semantic Indexers" has been accepted to ICML 2024! ⭐️ We propose to learn document semantic IDs with large language models in a self-supervised fashion. ⭐️ The learned semantic IDs can benefit LLM generative recommendation and retrieval. #LLM #IR
Our group has 3 papers (1 main, 2 Findings) accepted to #ACL2024, led by @XianruiZhong, @Yizhu_Jiao, and @BowenJin13, respectively. See you in Bangkok! @emnlpmeeting
Prof. Jiawei Han received the Distinguished Research Contributions Award at #PAKDD2024! @pakdd_social Photo credit: Prof. Jian Pei @jian_pei