This plot reminds me of my first IR work reproducing DPR in Pyserini, where we found that BM25 is amazingly helpful when combined with a dense retriever in a hybrid setup. BM25 is never just a simple baseline -- used the right way, it can easily outperform many fancy methods.
BM25 was shown to be the most robust method in BEIR and the most effective and efficient method for long-context search in LongEmbed, and now @mattjustram and @xuzihuan4 show that BM25 can push search agents onto the best efficiency frontier.
p.s. Pyserini and pi-serini are two different repos.