Information Retrieval Papers (@CsIR_papers) | Aguea

Information Retrieval Papers @CsIR_papers

Jun 12

CQC-RAG: Robust Retrieval-Augmented Generation via Cross-Query Consistency

Yanjia Sun, Sifan Liu, Jie Shao

arxiv.org/abs/2606.13438 [𝚌𝚜.𝙸𝚁]

Retrieval-Augmented Generation (RAG) has become a common approach for improving the factuality of Large Language Models (LLMs), yet its reliability remains highly sensitive to how external...

Jun 12

CoDeR: Local Constraint-Compatible Retrieval Beyond Semantic Similarity

Xingkun Yin, Xuebin Tang, Hongyang Du

arxiv.org/abs/2606.13204 [𝚌𝚜.𝙸𝚁]

Information retrieval systems have long treated semantic similarity as a proxy for relevance. For constraint-sensitive queries, this proxy can fail when a document is topically close to the query...

Jun 12

The Clustering Strikes Back: Building Cost-Effective and High-Performance ANNS at Scale with Helmsman

Yuchen Huang, Baiteng Ma, Yiping Sun, Yang Shi, Xiao Chen, Xiaocheng Zhong, Zhiyong Wang, Yao Hu, Erci Xu, …

arxiv.org/abs/2606.13145 [𝚌𝚜.𝙸𝚁] 💬Accepted by OSDI'26

RedNote (a.k.a., Xiaohongshu, a global-scale social network platform) widely adopts approximate nearest neighbor search (ANNS) to power its search, recommendation, and advertising services. Due to...

Jun 12

CFALR: Collaborative Filtering-Augmented Large Language Model for Personalized Fashion Outfit Recommendation

Yujuan Ding, Junrong Liao, Yunshan Ma, Yi Bin, Wenqi Fan, Tat-Seng Chua, Qing Li

arxiv.org/abs/2606.13001 [𝚌𝚜.𝙸𝚁 𝚌𝚜.𝙼𝙼]

Personalized outfit recommendation poses a significant challenge in e-commerce and social media platforms, requiring systems that balance user preferences with aesthetic compatibility....

Jun 12

Charge as a Construct-Validity Factor in Chinese Legal Case Retrieval: A Cross-Benchmark Audit

Yao Liu, Tien-Ping Tan, Zhilan Liu

arxiv.org/abs/2606.12993 [𝚌𝚜.𝙸𝚁]

Chinese Legal Case Retrieval (LCR) benchmarks grade a reference judgment relevant when its legal characterization matches the query, and strong systems now reach NDCG@10 of 0.85-0.88. Most of the...

Jun 12

Trait, Not State: The Durability of Reading Identity in Social Highlighting

Kazuki Nakayashiki, Keisuke Watanabe

arxiv.org/abs/2606.12904 [𝚌𝚜.𝙸𝚁 𝚌𝚜.𝙲𝙻 𝚌𝚜.𝙷𝙲 𝚌𝚜.𝚂𝙸]

Prior work on a social web highlighter located individuality in selection -- which documents a person chooses to highlight -- but measured it cross-sectionally. We ask the temporal question: is a...

Jun 11

DiffCold: A Diffusion-based Generative Model for Cold-Start Item Recommendation

Kangning Zhang, Yingjie Qin, Weinan Zhang, Yong Yu, Jianghao Lin

arxiv.org/abs/2606.12245 [𝚌𝚜.𝙸𝚁 𝚌𝚜.𝙰𝙸] 💬Accepted by ECML-PKDD 2026

Cold-start item recommendation remains a persistent challenge in real-world systems due to the absence of interaction histories. While prior models attempt to bridge this gap using item content...

Jun 11

LLM-Based User Personas for Recommendations at Scale

Haoting Wang, Haokai Lu, Zheyun Feng, Jenny Huang, Yifat Amir, Gregory Hinkson, Ben Most, Zelong Zhao, Yixin Kelly Cui, Rein Zhang, Fabio Soldo, Yu Xia, Nihar Bhupalam, Minmin Chen, …

arxiv.org/abs/2606.12198 [𝚌𝚜.𝙸𝚁]

Large Language Models (LLMs) offer unprecedented potential for enhancing recommendation systems through their world knowledge and reasoning capabilities. However, existing approaches often rely on...

Jun 11

Tail-Aware Adaptive-k: Query-Adaptive Context Selection for Retrieval-Augmented Generation

Ziyu Song, Jiaming Fang, Kuangyu Li, Tuo Xia, Chuanpeng Wang

arxiv.org/abs/2606.11907 [𝚌𝚜.𝙸𝚁] 💬Accepted at ECML PKDD 2026

Adaptive context selection is critical for retrieval-augmented generation (RAG) systems, as fixed Top-K retrieval fails under query-dependent and heavy-tailed similarity distributions. While...

Jun 11

CORE-Bench: A Comprehensive Benchmark for Code Retrieval in the Era of Agentic Coding

Fuwei Zhang, Yanzhao Zhang, Mingxin Li, Dingkun Long, Lexiang Hu, Pengjun Xie, Zhao Zhang, Fuzhen Zhuang

arxiv.org/abs/2606.11864 [𝚌𝚜.𝙸𝚁]

Code retrieval is becoming central to coding agents, but agentic coding requires more than matching a natural-language query to an isolated snippet. Given a user request, a coding agent needs to...

Jun 11

What Limits Does Quantization Place on Dense Top-k Retrieval? A Theoretical Study

Koki Okajima, Tsukasa Yoshida

arxiv.org/abs/2606.11780 [𝚌𝚜.𝙸𝚁 𝚌𝚜.𝙰𝙸 𝚌𝚜.𝙸𝚃]

We establish conditions for embedding a corpus of $N$ documents as $d$-dimensional vectors such that every $k$-subset $S \subseteq [N]$ is realizable as a result of top-$k$ retrieval by some query...

Jun 11

FAST-MEL: A Fast, Accurate, and Storage Efficient Solution for Multimodal Entity Linking

Derrien Thomas, Laurent Amsaleg, Pascale Sébillot

arxiv.org/abs/2606.11749 [𝚌𝚜.𝙸𝚁]

Multimodal entity linking (MEL) is the task that consists of matching textual and visual mentions of entities in unstructured data to their corresponding entities in a knowledge base (KB). To be...

Jun 11

CompRank: Efficient LLM Reranking via Token-Level Compression and Decoding-Free Scoring

Xuan Lu, Haohang Huang, Yingqi Fan, Junlong Tong, Yuxuan Zhang, Ping Nie, Rui Meng, Xiaoyu Shen

arxiv.org/abs/2606.11700 [𝚌𝚜.𝙸𝚁]

Large language model (LLM) rerankers have become an important component of modern retrieval and retrieval-augmented generation pipelines, but their high computational cost limits their...

Jun 11

The Long Tail, Not the Front Page: Cold-Start Prediction of Crowd Highlight Salience

Kazuki Nakayashiki, Keisuke Watanabe

arxiv.org/abs/2606.11654 [𝚌𝚜.𝙸𝚁 𝚌𝚜.𝙲𝙻 𝚌𝚜.𝙷𝙲 𝚌𝚜.𝚂𝙸]

A social highlighter's most useful signal -- which passages a crowd of readers marks -- exists only for documents people have already read. Can the aggregate crowd salience of a document be...

Jun 11

Factions Within, Uncertain Across: Within-Document Reader Sub-Groups in Social Highlighting

Kazuki Nakayashiki, Keisuke Watanabe

arxiv.org/abs/2606.11613 [𝚌𝚜.𝙸𝚁 𝚌𝚜.𝙲𝙻 𝚌𝚜.𝙷𝙲 𝚌𝚜.𝚂𝙸]

When many people highlight the same document, is the crowd a single consensus, or is it internally structured into reader sub-groups that mark different things -- and is that structure a stable...

Jun 11

A PubMed-Scale Dataset of Structured Biomedical Abstracts

Chia-Hsuan Chang, Haerin Song, Brian Ondov, Hua Xu

arxiv.org/abs/2606.11361 [𝚌𝚜.𝙸𝚁 𝚌𝚜.𝙲𝙻] 💬Code: github.com/BIDS-Xu-Lab/Struc…,

Structured abstracts are important for biomedical literature processing, by facilitating information retrieval, text mining, and knowledge synthesis. However, a vast portion of abstracts indexed...

Jun 11

Generative Archetype-Grounded Item Representations for Sequential Recommendation

Yifan Li, Jiahong Liu, Xinni Zhang, Hao Chen, Yankai Chen, Wenhao Yu, Jianting Chen, Irwin King

arxiv.org/abs/2606.11023 [𝚌𝚜.𝙸𝚁 𝚌𝚜.𝙲𝙻 𝚌𝚜.𝙻𝙶] 💬Accepted by WWW 2026 (Oral)

Sequential recommendation aims to predict users' next interaction with items by analyzing their historical behavior. However, the limited quality of item representations remains a critical...

Jun 11

miniReranker: Efficient Multimodal Reranking through Visual Cache Reuse and Interaction Sparsity

Yingqi Fan, Xuan Lu, Anhao Zhao, Junlong Tong, Ping Nie, Kai Zou, Yunpu Ma, Wei Zhang, Xiaoyu Shen

arxiv.org/abs/2606.10759 [𝚌𝚜.𝙸𝚁]

Multimodal large language models (MLLMs) have recently shown strong potential as point-wise rerankers by directly modeling query--document relevance through next-token prediction. However,...

Jun 11

Effective Reinforcement Learning for Agentic Search by Recycling Zero-Variance Queries During Training

João Coelho, João Magalhães, Bruno Martins, Chenyan Xiong

arxiv.org/abs/2606.10709 [𝚌𝚜.𝙸𝚁 𝚌𝚜.𝙰𝙸]

The use of GRPO-style algorithms has become the standard strategy for training LLM search agents under outcome-only rewards. With these algorithms, a query contributes to parameter updates only...

Jun 11

Beyond Patches: Superpixel Token-based Transformers for Attribute-Specific Fashion Retrieval

Shuili Zhang, Hongzhang Mu, Wenyuan Zhang, Duohe Ma, Tingwen Liu

arxiv.org/abs/2606.10697 [𝚌𝚜.𝙸𝚁]

Attribute-Specific Fashion Retrieval (ASFR) aims to improve fine-grained image retrieval by focusing on specific attributes. However, existing patch-based attention and Transformer methods often...
