Many public discussions center on trends and statistics that turn out not to be real at all.
For over a decade, there was widespread public discourse about the causes of high and rising maternal mortality in the US.
But, as I've written about before, CDC analyses showed that the apparent rise from 2003 to 2017 was due to a change in measurement, when a pregnancy checkbox was added to death certificates, which flowed directly into maternal mortality counts in most cases. Rather than mortality rising, the rate had been stable. Many deaths had been previously missed, and many other countries were undercounting maternal deaths.
ourworldindata.org/rise-us-m…
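To make the mechanism concrete, here is a toy sketch of how a constant true rate can look like a rising one when detection improves gradually. Every number in it is a made-up assumption for illustration (the true rate, the share of deaths recorded before and after the checkbox, and a linear rollout across states); none of it is CDC data, and it ignores any overcounting the checkbox may also have introduced.

```python
# Illustrative only: all values below are hypothetical assumptions, not CDC data.
TRUE_RATE = 20.0         # assumed true deaths per 100,000 live births, held constant
DETECTION_BEFORE = 0.5   # assumed share of maternal deaths recorded without the checkbox
DETECTION_AFTER = 1.0    # assumed share recorded once the checkbox is in use
FIRST_YEAR, LAST_YEAR = 2003, 2017


def share_adopted(year: int) -> float:
    """Assumed share of births in states using the checkbox, rising linearly."""
    return (year - FIRST_YEAR) / (LAST_YEAR - FIRST_YEAR)


def measured_rate(year: int) -> float:
    """Recorded rate: the constant true rate scaled by that year's average detection."""
    adopted = share_adopted(year)
    detection = adopted * DETECTION_AFTER + (1 - adopted) * DETECTION_BEFORE
    return TRUE_RATE * detection


# The true rate never changes, but the recorded series climbs every year.
series = [(year, round(measured_rate(year), 1)) for year in range(FIRST_YEAR, LAST_YEAR + 1)]
for year, rate in series:
    print(year, rate)
```

Under these assumptions the recorded rate doubles over the period even though nothing about the underlying risk has changed, which is the general shape of the artefact described above.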
This isn't an isolated case.
- People often cite the IHME's estimate that childhood height has fallen in the UK over the past decade. But the estimate missed one of the key sources of data on height: a national dataset measuring the height and weight of almost all schoolchildren in the UK, which showed no decline (that data wasn't publicly available until an FOIA request). Instead, the IHME estimates were likely extrapolated from a global model and smaller, less reliable surveys.
neilobrien.co.uk/p/honey-we-…
- I often hear claims that disruptive science has declined over time, based on a highly influential paper in Nature. But the key results were affected by a coding bug, which would have shown a decline simply due to this artefact.
nature.com/articles/s41586-0…
arxiv.org/abs/2402.14583
- The idea that interstate migration in the US has collapsed has led to lots of concern about dynamism and unemployment. But it's recently been shown that much of the apparent decline was a statistical artefact of how the survey filled in missing responses, which caused it to systematically overcount non-movers. Correcting for this shows only a very slight decline over time.
link.springer.com/article/10…
- The dramatic rise in autism diagnoses, which has spurred lots of commentary about pesticide use and vaccines, actually reflects changes in how autism was defined. In the 1960s, autism described severely disabled, mostly nonverbal children: if a child was verbal or succeeding at school, they were excluded from the diagnosis by definition. The criteria then widened across successive editions of the DSM. Alongside this, it became much easier to get assessed: what once required a specialist with months-long waiting lists became something that could be done in a few appointments.
pubmed.ncbi.nlm.nih.gov/2592…
--
I think this is a persistent problem of people undervaluing data quality and measurement. It may sound dull or academic to care about these issues, but numbers and statistics are a big part of public discussions. They can be the premise of debates that go on for years, sometimes even decades, and they can mislead people about which social and policy interventions are needed.
So before spending time arguing about the causes and consequences of a trend or statistic and what should be done about it, it's worth digging into the data to see if it supports the premise at all.
I suspect there are many other discussions affected by this too. Are there others I've missed?