Webis Group

Webis Group @webis_de

27 Oct 2025

We just released "German Commons", the largest openly-licensed German text dataset for LLM training: 154B tokens with clear usage rights for research and commercial use. huggingface.co/datasets/cora…

coral-nlp/german-commons · Datasets at Hugging Face

Webis Group @webis_de

27 Oct 2025

The data spans 7 text domains:
🌐 Web: Wikis, GitHub, social media
💬 Political: Parliamentary proc., speeches
⚖️ Legal: Court decisions, law
📰 News: Newspaper archives
🏦 Economics: Public tenders
📚 Cultural: Heritage collections
🔬 Scientific: Papers, books, journals

Webis Group @webis_de

27 Oct 2025

For full technical details and the compliance datasheet, see our preprint @ arxiv.org/abs/2510.13996
As for German-specific models trained on this data... stay tuned 👀

Webis Group @webis_de

18 Jul 2025

Come join us at the poster session at ICTIR 2025 to discuss:
- Axioms for Retrieval-Augmented Generation webis.de/publications.html#m…
- Learning Effective Representations for Retrieval Using Self-Distillation with Adaptive Relevance Margins webis.de/publications.html#g…

Webis Group @webis_de

18 Jul 2025

Honored to win the ICTIR Best Paper Honorable Mention Award for "Axioms for Retrieval-Augmented Generation"! Our new axioms are integrated with ir_axioms: github.com/webis-de/ir_axiom… Nice to see axiomatic IR gaining momentum.

Webis Group @webis_de

18 Jul 2025

Congratulations to the authors @H1iReimer, @maik_froebe, @bennostein, @martinpotthast, @matthias_hagen from @UniJena, @bauhaus_uni, @uni_kassel, @Hessian_AI, @Sca_DS!

Webis Group @webis_de

18 Jul 2025

Thrilled to announce that @MattiWiegmann has successfully defended his PhD! 🎉🧑‍🎓 Huge congratulations on this incredible achievement! #PhDDefense #AcademicMilestone

Webis Group @webis_de

16 Jul 2025

Happy to share that our paper "The Viability of Crowdsourcing for RAG Evaluation" received the Best Paper Honourable Mention at #SIGIR2025! Very grateful to the community for recognizing our work on improving RAG evaluation. 📄 webis.de/publications.html#g…

Webis Group @webis_de

16 Jul 2025

Congrats to the authors @LukasGienapp, Tim Hagen, @maik_froebe, @matthias_hagen, @bennostein, @martinpotthast, and @hscells from @uni_kassel, @Hessian_AI, @Sca_DS, @uni_tue, @UniJena & @bauhaus_uni

Webis Group retweeted

Maik Fröbe @maik_froebe

27 Jun 2025

Do not forget to participate in the #TREC2025 Tip-of-the-Tongue (ToT) Track :) The corpus and baselines (with run files) are now available and easily accessible via the ir_datasets API and the HuggingFace Datasets API. More details are available at: trec-tot.github.io/guideline…

Webis Group @webis_de

22 Jun 2025

Our paper on self-distillation for training bi-encoders got accepted at #ICTIR2025! By exploiting pretrained encoder capabilities, our approach eliminates expensive teacher models and batch sampling while maintaining the same effectiveness.

Webis Group @webis_de

22 Jun 2025

Results on BEIR demonstrate that our method matches teacher distillation effectiveness, while using only 13.5% of the data and achieving 3-15x training speedup. This makes effective bi-encoder training more accessible, especially for low-resource settings.

Webis Group @webis_de

22 Jun 2025

Credit & thanks to the author team @LukasGienapp @DeckersNiklas @martinpotthast @hscells
📄 Preprint: arxiv.org/abs/2407.21515
💻 Code: github.com/webis-de/adaptive…

Learning Effective Representations for Retrieval Using...

Representation-based retrieval models, so-called bi-encoders, estimate the relevance of a document to a query by calculating the similarity of their respective embeddings. Current state-of-the-art...
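To make the bi-encoder idea in this abstract concrete, here is a minimal Python sketch of the scoring step only: the query and each document are represented by one embedding each, and relevance is estimated as the similarity between them. The embedding values, the document IDs, the function names, and the choice of cosine similarity are all assumptions for illustration. This is not the paper's self-distillation training method, and real embeddings would come from a trained encoder.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_documents(query_embedding, doc_embeddings):
    """Return (doc_id, score) pairs sorted by descending similarity."""
    scores = {
        doc_id: cosine_similarity(query_embedding, embedding)
        for doc_id, embedding in doc_embeddings.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical, hand-written embeddings; a real system would obtain
# them from an encoder model applied to the query and document text.
query = [1.0, 0.0, 1.0]
docs = {
    "d1": [1.0, 0.0, 1.0],  # same direction as the query
    "d2": [0.0, 1.0, 0.0],  # orthogonal to the query
    "d3": [1.0, 1.0, 0.0],  # partially overlapping
}

ranking = rank_documents(query, docs)
for doc_id, score in ranking:
    print(f"{doc_id}: {score:.3f}")
```

Because documents are embedded independently of the query, their embeddings can be computed once and reused across queries, which is what makes this family of models attractive for first-stage retrieval.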

Webis Group retweeted

Ferdinand Schlatt @fschlatt1

9 Apr 2025

Replying to @maik_froebe @hscells @ShengyaoZhuang @bevan_koopman @guidozuc @bennostein @martinpotthast @matthias_hagen

Short: Rank-DistiLLM: Closing the Effectiveness Gap Between Cross-Encoders and LLMs for Passage Re-ranking webis.de/publications.html#s…
Full: Set-Encoder: Permutation-Invariant Inter-Passage Attention for Listwise Passage Re-Ranking with Cross-Encoders webis.de/publications.html#s…

Webis Group retweeted

Ferdinand Schlatt @fschlatt1

9 Apr 2025

What an honor to receive both the best short paper award and the best paper honourable mention award at #ECIR2025. Thank you to all the co-authors @maik_froebe @hscells @ShengyaoZhuang @bevan_koopman @guidozuc @bennostein @martinpotthast @matthias_hagen 🥳

Webis Group retweeted

Lorella Viola @ViolaLorella

8 Apr 2025

#ECIR2025 crowd day#2 😍 Cultural #LunchBreak #Campanile #sanfrediano #lucca

Webis Group @webis_de

7 Apr 2025

📢 Our paper "The Viability of Crowdsourcing for RAG Evaluation" has been accepted to #SIGIR2025! We compared how well humans and LLMs write and judge RAG responses, assembling 1800 responses across 3 styles and 47K pairwise judgments in 7 quality dimensions. 🧵➡️

Webis Group @webis_de

7 Apr 2025

🧵 3/4 This fundamentally challenges previous assumptions about RAG evaluation and system design. But we also show how crowdsourcing offers a viable and scalable alternative! Check out the paper for more.
📝 Preprint @ downloads.webis.de/publicati…
⚙️ Code/Data is openly available.

Webis Group @webis_de

7 Apr 2025

🧵 4/4 Credit and thanks to the author team @LukasGienapp, Tim Hagen, @maik_froebe, @matthias_hagen, @bennostein, @martinpotthast, and @hscells – you can also catch some of them at #ECIR2025 currently if you want to chat about RAG!
