I will be talking about evaluating AI systems in my talk tomorrow for
@pydatalondon, "Death by RMSE: A Cautionary Tale of Metrics Gone Wild"
"You've trained the model. The eval metrics look great. But somehow, it doesn't change anything - the KPIs are static, the business impact isn't there, and you're left wondering: DID WE OPTIMISE FOR THE WRONG THING?"
meetup.com/pydata-london-mee…
Quite pleased to see the emphasis on evaluating and not just delivering AI pilots, and comparing that to what *actually happens now* - we should be working to understand exactly where and how AI works in government, not whether or not it's perfect.