Such a fun project to work on with Lechen and Jiarui! TLDR: personalization is harder than most benchmarks indicate, mostly because people are people and models are not people. Check out Lechen's post and paper for more!
1/ When chatbots remember something about you in “personalized” answers, does it ever feel uncomfortable, offensive, or just unnecessary?
🚨 Our paper argues these misalignments reflect a deeper issue in personalization: over-reliance on synthetic users and LLM judges.
🧵👇