Thanks so much to all the contributors! We honestly didn't expect this many submissions, and even better, there are now new benchmarks to serve as goalposts for computational models. See you at CCN in August :)
It's a wrap! Across 46 new benchmarks from 8 submitters, we have now evaluated 19 reference models for a total of 874 alignment scores. These new behavioral and neural benchmarks are showing the shortcomings of our models (although some models occasionally prevail). Winners TBA at CCN!