Data Lad (@DataLad87) | Aguea

Data Lad

@DataLad87

3h

Before you analyze anything, you have to get the data in, and it never arrives in the format you'd choose. New article on importing data in Python: CSV and Excel, pickle, SAS, HDF5, MATLAB, SQL databases, plus pulling data from the web with requests, BeautifulSoup, and JSON APIs. #Python #DataScience #pandas datalad.co.uk/importing-data…

Importing Data in Python

A practical guide to importing data in Python with pandas and NumPy: flat files, Excel, databases, APIs, web scraping, and reading JSON from any source.

Data Lad

@DataLad87

20h

Cookie loss, ITP, and consent refusals quietly undercount your conversions. Google's bidding algorithms then make worse decisions on incomplete data. New article on #enhancedconversions in GTM: how hashed first-party data recovers lost conversions, fed cleanly through the dataLayer and gated behind consent. #GoogleAds #GTM #GA4 datalad.co.uk/enhanced-conve…

Enhanced Conversions in GTM: Recovering Lost Conversions with First-Party Data and the dataLayer

Recover lost conversions with enhanced conversions in GTM: how hashed first-party data works, the dataLayer setup, and why it beats scraping the page.

Data Lad

@DataLad87

23h

Most teaching datasets are too clean, so the hardest part of the job never gets practised. I wrote up how I built a simulated churn dataset: planted duplicates, three kinds of missing data, dirty country labels, and a leakage trap that fakes a 0.90 AUC. You can download it for free and practise on it. #DataScience #MachineLearning #Python datalad.co.uk/inside-the-bri…

Inside the Brightcart Club Dataset: How I Built a Deliberately Messy Churn File

A look inside the simulated Brightcart Club churn dataset: why every imperfection, from MNAR missing data to the leakage trap, was planted on purpose.

Data Lad

@DataLad87

Jun 13

Iframe checkout? Your purchase events are landing in a sealed room your GTM can't see into. New article on fixing it with window.postMessage: send events from the iframe, validate the origin on the parent, and fire clean GA4 ecommerce events. #GA4 #GoogleTagManager #Analytics datalad.co.uk/iframe-trackin…

Iframe Tracking with GTM: Pushing Checkout Data to the Parent dataLayer

Learn how to track an iframed checkout in GTM using window.postMessage: send events from the iframe, validate them on the parent, and fire GA4 events.

Data Lad

@DataLad87

Jun 13

Most analytics can tell you a sale happened. Enhanced ecommerce tells you the story behind it: what got viewed, clicked, added, abandoned, and finally bought. New guide on implementing GA4 ecommerce tracking in GTM: the dataLayer contract, standard events, and testing the full funnel. #GA4 #GoogleTagManager #Ecommerce datalad.co.uk/gtm-enhanced-e…

GTM Enhanced Ecommerce Implementation for GA4

Learn how to implement GA4 ecommerce tracking with GTM: the dataLayer structure, purchase and add_to_cart events, Custom Event triggers, and testing the full funnel.

Data Lad

@DataLad87

Jun 12

If you're learning data science and want a project that goes beyond "fit a model on clean data," I built a full churn prediction code-along: deliberately messy dataset, a hidden leakage trap, three missingness mechanisms, and a logistic regression that beats a random forest. Everything is explained line by line, and the notebook plus data are free to download. The fun part: the "obvious" best feature is a trap, and spotting why is half the lesson. Happy to answer questions if anyone works through it. datalad.co.uk/churn-a-comple…

Code Along: A Complete Churn Analysis in One Notebook

Explore data science with Datalad's Code Along! Learn to predict customer churn in an e-commerce setting through hands-on analysis.

Data Lad

@DataLad87

Jun 12

"Can you add our tags to your site?" doesn't have to be a risk conversation. New article on GTM Zones: link partner containers, scope them with URL conditions, whitelist exactly what can fire, and audit the rest. Tag governance done properly. #GTM #GoogleTagManager #MarTech datalad.co.uk/gtm-zones-mana…

GTM Zones: Managing Tags Across Teams, Partners and Multiple Sites

Learn how GTM Zones work in Tag Manager 360: linking containers, setting URL and hostname conditions, whitelisting partner tags, and auditing zone security.

Data Lad

@DataLad87

Jun 12

A decade on, XGBoost is still the king of tabular data. New practical guide: fit and predict, DMatrix, cross-validation with early stopping, hyperparameter tuning, and building sklearn pipelines that don't leak. #XGBoost #MachineLearning #Python datalad.co.uk/xgboost-a-prac…

XGBoost: A Practical Guide to Extreme Gradient Boosting

XGBoost step by step: boosting rounds, early stopping, eta and max_depth tuning, encoding categorical data, and building honest cross-validated pipelines.

Data Lad

@DataLad87

Jun 12

A class is a cookie cutter. Instances are the cookies. Once that clicks, Python OOP stops being intimidating. New article covering classes, self, __init__, inheritance with super(), dunder methods, and custom exceptions that fail fast. #Python #OOP #100DaysOfCode datalad.co.uk/object-oriente…

Object-Oriented Programming in Python

Learn Python OOP from the ground up: classes, __init__, inheritance, class methods, dunder methods like __eq__ and __repr__, and custom exceptions with examples.

Data Lad

@DataLad87

Jun 12

The difference between [] and () in Python can be the difference between a script that streams 100 GB on a laptop and one that crashes. New article on iterators, comprehensions, and generators: enumerate, zip, yield, and reading files too big for memory in chunks. #Python #DataScience datalad.co.uk/python-iterato…

Python Iterators, Comprehensions and Generators

Master Python's lazy tools: iterators, enumerate and zip, comprehensions, generator functions with yield, and pandas chunksize for streaming files larger than RAM.

Data Lad

@DataLad87

Jun 11

Most tutorials hand you clean data. This one doesn't. A complete churn analysis in one notebook: messy labels, three kinds of missing data, a leakage trap that fakes 0.90 AUC, and a twist: logistic regression beats the random forest. #DataScience #Python #MachineLearning Free notebook and dataset: datalad.co.uk/churn-a-comple…

Code Along: A Complete Churn Analysis in One Notebook

Explore data science with Datalad's Code Along! Learn to predict customer churn in an e-commerce setting through hands-on analysis.

Data Lad

@DataLad87

Jun 11

Run a large language model on your own laptop. No API keys, no per-token costs, full data privacy. New article on Llama 3 with llama-cpp-python: decoding parameters, prompt engineering, guaranteed-valid JSON output, and building a chatbot that remembers the conversation. #Llama3 #LLM #Python datalad.co.uk/working-with-l…

Working with Llama 3

Explore local inference with Llama 3 on your own laptop for privacy, cost savings, and complete control over data processing.

Data Lad

@DataLad87

Jun 10

Text, images, audio, and video in one workflow. New article on multi-modal models with Hugging Face: zero-shot classification with CLIP, voice conversion, ControlNet image editing, video generation, and scoring it all with CLIP score. #HuggingFace #AI #MachineLearning datalad.co.uk/multi-modal-mo…

Multi-Modal Models with Hugging Face

Explore multi-modal machine learning with Hugging Face: from models to pipelines handling text, images, audio, and video seamlessly.

Data Lad

@DataLad87

Jun 9

State-of-the-art language models in 3 lines of Python. New article covers the pipeline API, fine-tuning with the Trainer, and every evaluation metric you need: BLEU, ROUGE, perplexity, exact match, toxicity, and more. #LLM #AI #Python datalad.co.uk/introduction-t…

Introduction to LLMs in Python

Discover what makes a language model 'large' and learn how to use Hugging Face's Transformers for NLP tasks like summarization and translation.

Data Lad

@DataLad87

Jun 9

#HuggingFace puts state-of-the-art #AI into 3 lines of #Python. New article: run text classification, zero-shot labeling, summarization, and document QA using pipeline() and the transformers library. datalad.co.uk/working-with-h…

Data Lad

@DataLad87

Jun 8

A #pointestimate tells you where the parameter likely is. A #confidenceinterval tells you how much to trust that answer. Full coverage of single means, proportions, two-sample comparisons, paired samples, and sample size planning. datalad.co.uk/point-and-inte…

Point and Interval Estimation

Discover the importance of confidence intervals in statistics. Learn how they enhance point estimates by measuring uncertainty effectively.

Data Lad

@DataLad87

Jun 8

The #normaldistribution is not just a bell curve; it is the foundation of how #statistics reasons about populations, samples, and uncertainty. From expected values to the central limit theorem, a full walkthrough of the concepts that underpin statistical inference. datalad.co.uk/the-normal-dis…

The Normal Distribution

Explore the fundamentals of random variables, population mean, variance, normal distribution, and the central limit theorem in statistics.

Data Lad

@DataLad87

Jun 7

Full #datascience #roadmap with updated lessons here: datalad.co.uk/data-scientist… Follow me for more.

Data Scientist – Roadmap

Discover a comprehensive roadmap to becoming a data scientist, mastering mathematics, coding, and machine learning skills along the way.

Data Lad

@DataLad87

Jun 7

#Surveys ask. Observation watches. Knowing when to use which, and which survey format fits your #research context, is what separates a well-designed study from an expensive guess. Full breakdown of methods, criteria, and trade-offs: datalad.co.uk/survey-and-qua…

Survey and Quantitative Observation Techniques in Market Research

Discover the nuances of descriptive research design, survey methods, and observational techniques to better understand consumer behaviour.
