One of my side projects this month has been making an AI benchmark to test model alignment in various ways.
I landed on Deviance, War, & a modified version of the classic political compass.
I've decided to call the whole benchmark Polibench. Check it out & take the test yourself:
https://polibench.jonathanrreed.com/