Hey, I said it in the original article, but I'll say it here again. This was me.
I actually had pretty bad data leakage. That's why I accidentally got 85% (oops).
I have another video where I explain what happened and how I tried to fix it (it's on YouTube).
a student took the Elo rating system from chess
ran it through 95,491 tennis matches over 43 years, and trained an XGBoost model that predicts winners with 85% accuracy
he tested it on the Australian Open 2025 completely outside the training data
99 out of 116 matches correct (about 85.3%)
called every single Sinner win through the entire tournament
the champion, before the first ball was hit
no team, no funding, a laptop and free CSVs from the internet
this is the best breakdown of a real sports prediction model I've seen
study it or feed it to your AI agent
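
For anyone who wants to see what "Elo from chess applied to tennis" looks like, here is a minimal sketch. It is not the author's code. The K-factor of 32, the starting rating of 1500, and the function names are all assumptions for illustration. It records each player's rating before the match result is applied, because feeding post-match ratings into a model is one common way leakage creeps into this kind of setup (not necessarily the leakage mentioned above).

```python
def expected_score(rating_a, rating_b):
    """Standard Elo win probability for player A against player B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(winner_rating, loser_rating, k=32.0):
    """Return (new_winner_rating, new_loser_rating). k=32 is an assumed K-factor."""
    delta = k * (1.0 - expected_score(winner_rating, loser_rating))
    return winner_rating + delta, loser_rating - delta


def pre_match_elo_features(matches, start=1500.0, k=32.0):
    """Walk matches in chronological order and record PRE-match ratings.

    matches: iterable of (winner_name, loser_name), oldest first.
    Returns (rows, ratings), where each row is
    (winner, loser, winner_rating_before, loser_rating_before).
    """
    ratings = {}
    rows = []
    for winner, loser in matches:
        rw = ratings.get(winner, start)  # hypothetical default for new players
        rl = ratings.get(loser, start)
        # Store the features first, so the match outcome cannot leak into them.
        rows.append((winner, loser, rw, rl))
        ratings[winner], ratings[loser] = update_elo(rw, rl, k)
    return rows, ratings


if __name__ == "__main__":
    demo = [("A", "B"), ("A", "C"), ("B", "C")]  # made-up matches
    rows, ratings = pre_match_elo_features(demo)
    for row in rows:
        print(row)
    print(ratings)
```

The rows would then be the input features for whatever classifier you train, with the train/test split done by date rather than at random.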