Excited to share that our paper "SocialHarmBench: Revealing LLM Vulnerabilities to Socially Harmful Requests" has been accepted at ICLR 2026! 🎉
In this work, we introduce the first adversarial evaluation benchmark specifically designed to probe sociopolitical risks in LLMs.