This is why we prioritize ensuring that Expected Parrot automatically works with the latest models, and we make it super easy to check. These models know about Expected Parrot too, and how to convert your old survey from Qualtrics, SurveyMonkey, etc., into our intuitive open-source code:
Whoa. This new GDPval score is a very big deal.
It is probably the most economically relevant measure of AI ability, and it suggests that in head-to-head competition with human experts on tasks that take a human 4-8 hours to do, GPT-5.2 wins 71% of the time, as judged by other humans.