🥉 “Say No to Mass Destruction: Benchmarking Refusals to Answer Dangerous Questions” by Alex Pino, Carl Vinas, JD Dantes, Zmavli Caimle, and Kyle Reynoso won 3rd place in Apart’s AI Security Evals Hackathon. It showed that some models presume high-risk questions to be “safe.”