pwn2own has always been a great datapoint for how hard it is to find vulns, which exploit mitigations are working, what new exploitation techniques are emerging, and now how well AI works for offensive security.
this year's pwn2own isn't just interesting because there will be lots of entries from humans working with AI.
it is also interesting because:
a) anthropic burned a ton of tokens on firefox, basically running claude in a loop for a month until it found something, which probably exhausted whatever claude can one-shot.
b) if someone submits a full chain without much use of AI, it tells you one-shotting plateaus and these models are a bit more like fuzzers than seasoned security researchers.
c) even if they used an llm to find the bug, this tells us scaffolding/harness design, prompting, and the operator matter a lot.