(Principal) Software (Data) engineer. Backend, AI & BigData at @Teads and a bit of OpenSource on data / mobile projects. blog: smallbigdata.substack.com
1/7 🚢 Navigating the complex world of reference data management
Ever struggled with maintaining consistent reference data across multiple environments? You're not alone.
I just spent 2 hours adding custom bucketing logic to optimize a BQ job. It works well, but I'm a bit sad that it takes so much overhead to work around BigQuery's bad performance on legit skews 😅
Faster (a bit more expensive) but worth it. Query time matters.
I wonder if it's worth writing an article on it? I didn't find any BigQuery-related bucketing article, and it's a bit tedious to troubleshoot and write for a first-timer.
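For anyone wondering what "bucketing" for skew can look like, here is a minimal Python sketch of the general idea (often called key salting). It is not the actual logic from the BQ job above: the function name `salted_sum`, the `num_buckets` parameter, and the round-robin bucket assignment are all assumptions made for illustration. The idea is to split a hot key into several (key, bucket) groups, aggregate each group separately, then merge the partial results per key.

```python
from collections import defaultdict


def salted_sum(rows, num_buckets=8):
    """Two-stage sum over (key, value) rows (hypothetical helper).

    Stage 1 spreads each key over up to `num_buckets` sub-groups so that
    no single group has to process all rows of a hot key.
    Stage 2 merges the partial sums back into one total per key.
    """
    # Stage 1: partial sums per (key, bucket).
    # Round-robin on the row index stands in for a hash of a row id.
    partial = defaultdict(int)
    for i, (key, value) in enumerate(rows):
        bucket = i % num_buckets
        partial[(key, bucket)] += value

    # Stage 2: merge the partial sums per key.
    totals = defaultdict(int)
    for (key, _bucket), subtotal in partial.items():
        totals[key] += subtotal
    return dict(totals), dict(partial)


# A skewed input: one hot key with 1000 rows, one cold key with 3 rows.
rows = [("hot", 1)] * 1000 + [("cold", 2)] * 3
totals, partial = salted_sum(rows)
print(totals)
```

The final totals are the same as a plain group-by, but the hot key's work is split across several smaller groups. The trade-off is the extra aggregation stage, which fits the "faster, a bit more expensive" result mentioned above.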