I'm delighted to have received the SIGIR Early Career Researcher Award!
Thanks to all my wonderful students, colleagues, and collaborators for their support and countless discussions about wild new ideas for the field.
x.com/ir_glasgow/status/1945…
Huge congratulations to @macavaney on receiving the prestigious ACM SIGIR Early Career Researcher Award in the research category! This well-deserved recognition highlights the excellence & impact of his work in the IR community 👏🎉 #sigir2025
Cc @GlasgowCS @UofGlasgow @ACMSIGIR
🚨 Every major AI lab is racing to build better "deep research" agents — systems that search, synthesize, and report across the web.
But how do we actually *benchmark* them?
Introducing 🧵 TREC RAGTIME — the shared task for rigorous RAG evaluation.
trec-ragtime.github.io/
Delighted that our paper “PLAID-PRF — Pseudo-Relevance Feedback with Centroid-like Tokens in PLAID” has been accepted to #sigir2026, w/ Xiao Wang and @macavaney
The call for papers for CLPsych 2026, co-located with ACL 2026, is out, and we have a shared task that is accepting applications!
We would love to learn more about your amazing work at the intersection of NLP and clinical psychology.
clpsych.org/call-for-papers/
Happy to see another research group, @haike_xu, working in the same direction and our SlideGAR in the BRIGHT world. However, Reranker-Guided Search is not new. There are papers like Quam (WSDM'25), ORE (SIGIR'25), ReFIT (SIGIR'25), and TOUR (ACL'23) that use the ranker's guidance.
🚨 New Pre-Print! You've just added your 600th model to your negative mining pool and filtered all false negatives. Does any of this even matter when we can apply distillation?
In this work with @debforit and @macavaney, we explore data selection in modern ranking. 🧵 Below
Disentangling Locality and Entropy in Ranking Distillation
@MrParryParry et al. separate example selection effects from teacher ranking entropy in neural ranking model optimization, showing complex hard-negative pipelines offer minimal gains.
📝arxiv.org/abs/2505.21058
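For readers unfamiliar with the term, here is a minimal sketch of one common way to quantify the entropy of a teacher's ranking: the Shannon entropy of a softmax over the teacher's scores for a query's candidates. This is an illustrative assumption, not necessarily the definition used in the paper; the function name `ranking_entropy` and the `temperature` parameter are hypothetical.

```python
import math

def ranking_entropy(teacher_scores, temperature=1.0):
    """Shannon entropy (in nats) of a softmax over teacher scores.

    Hypothetical helper for illustration only; the paper may define
    teacher ranking entropy differently.
    """
    if not teacher_scores:
        raise ValueError("teacher_scores must be non-empty")
    scaled = [s / temperature for s in teacher_scores]
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(scaled)
    exps = [math.exp(s - top) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    return -sum(p * math.log(p) for p in probs if p > 0.0)

# A teacher that strongly prefers one candidate gives a low-entropy ranking;
# a teacher that scores every candidate equally gives the maximum, log(n).
peaked = ranking_entropy([12.0, 1.0, 0.5, 0.2])
flat = ranking_entropy([3.0, 3.0, 3.0, 3.0])
print(f"peaked: {peaked:.4f}  flat: {flat:.4f}")
```

Under this sketch, a higher `temperature` flattens the distribution and raises the entropy, which is one way to vary the signal a student model is distilled from.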
As @hscells et al. say: ♻️ Reduce, Reuse, Recycle!
It's never been easier to share indexes (Terrier, Anserini, Pisa, Dense, etc.) using HuggingFace, Zenodo, etc. 🤓
Code example in Python:
```
import pyterrier as pt
# load a msmarco-passage index from huggingface
index = pt.Artifact.from_hf('macavaney/msmarco-passage.terrier')
# it's ready to use!
retriever = index.bm25()
retriever.search('my dear watson')
# qid           query    docno      score  rank
#   1  my dear watson  5341214  36.087756     0
#   1  my dear watson  2385137  30.109050     1
# ...
# share an index to HuggingFace, Zenodo, etc.
index = pt.Artifact.load('my-index')
index.to_hf('macavaney/my-index')
index.to_zenodo()
index.to_p2p()
```
Artifact Sharing for Information Retrieval Research
@macavaney introduces a flexible way to share artifacts like indices and models for Information Retrieval research, improving both accessibility and usability.
📝arxiv.org/abs/2505.05434
👨🏽‍💻github.com/seanmacavaney/art…
CLPsych 2025 @naaclmeeting is happening soon! We're looking forward to seeing you all. Stay tuned for the Best Poster Award, which will be voted on the day of the workshop after the poster session.
It was a really pleasant surprise to learn that our paper “Efficient Constant-Space Multi-Vector Retrieval”, aka ConstBERT, co-authored with @macavaney and @ntonellotto, received the Best Short Paper Honourable Mention at ECIR 2025!
#ECIR2025 #IR #Pinecone
🚨 New Pre-Print! 🚨 with @macavaney & @iadh. Stop using "translate-train" for all your multilingual needs. We explore zero-shot transfer for low-resource languages... 🧵
🚨 New Pre-Print! 🚨 Reviewer 2 has once again asked for DL’19, what can you say in rebuttal? We have re-annotated DL’19 in the form of classic evaluation stability studies. Work done with @maik_froebe, @hscells, @fschlatt1, @guglielm0f, @saber_zerhoudi, @macavaney, @EYangTW 🧵