The Noise in my Head

Trying to find the signal. Since 1960.

Lies and Statistics July 16, 2008

Filed under: Random Thoughts, Science — mfmosman @ 8:40 am

It’s hard to imagine a blog post that could be less interesting to most of those who know me than a post about statistics.  The old joke goes: a statistician is someone who is interested in numbers, but doesn’t have the personality to become an accountant.  But, let us admit, I am fundamentally and deeply a geek; and whether or not statistics interest you, they interest me.  And this blog is a collection of things that currently interest me, so there you are.

One thing in particular interests me about statistics: how the fact that almost everyone is bad at statistics can contribute to real misunderstanding, even producing tragic results.  I listened to a talk recently by Oxford mathematician Peter Donnelly that sensitized me to the topic, so I thought I’d share.

Take this example from Donnelly’s speech: Let’s say that there is a fairly rare disease (we’ll call it Mosman’s disease, which would obviously be a disease in which the sufferer is suddenly afflicted with a deep and abiding oddness).  The good news is: we have a test for the disease, and it’s a good one.  It is right 99% of the time.  A person you know just took the test, and it came back positive — it indicates that they have the disease.  What are the chances that the person actually has the disease?

99%, right?

Nope.  The answer, really, is “it depends.”  And what it depends on, is: how rare is the disease?

Play out the example: let’s say Mosman’s disease afflicts one in every 10,000 people.  In a population of a million people, this means that 100 people actually have it.  If everyone in that population of a million takes the test, the test will produce a positive result for 99 of those 100.

But that’s only half the story, isn’t it?  What’s really interesting here is what else happens: in addition to the 100 people who have the disease, we tested another 999,900 people who don’t, and the test gets it wrong one percent of the time on that remaining population too.  So, one percent of those 999,900 people will get a positive result and not have the disease at all.  And because 999,900 is a lot of people, it turns out that these “false positives” can matter a lot.  In case you can’t do the math in your head, 9,999 people out of the million we started with will test positive for the disease without having it at all (one percent of 999,900 = 9,999).

In total, then, we’d have 10,098 people with a positive result (99 + 9,999), but only 99 of them actually have the disease.  This means that a positive result on the test gives you only about a 0.98% chance of actually having the disease (99 out of 10,098).  That is still much higher than the general population’s 0.01% chance of having it, but I think we’d agree that it’s not very likely.
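For my fellow geeks, here is a rough sketch of that arithmetic in Python, so you can play with how rare the disease is and watch the answer move.  The function name and its parameters are my own invention, and I am assuming the test is right 99% of the time for sick and healthy people alike, which is how the example above treats it.  Only the first row (1 in 10,000) comes from the example; the other two rarities are made up for comparison.

```python
from fractions import Fraction


def positive_test_breakdown(population, prevalence, accuracy):
    """Return (true positives, false positives, chance a positive is real).

    Hypothetical helper: assumes the same accuracy for sick and healthy people.
    """
    sick = population * prevalence
    healthy = population - sick
    true_positives = sick * accuracy            # sick people the test catches
    false_positives = healthy * (1 - accuracy)  # healthy people it flags anyway
    chance_really_sick = true_positives / (true_positives + false_positives)
    return true_positives, false_positives, chance_really_sick


ACCURACY = Fraction(99, 100)  # "right 99% of the time"

# First row is the example from the post; the other rarities are assumptions.
for one_in in (10_000, 1_000, 100):
    tp, fp, chance = positive_test_breakdown(1_000_000, Fraction(1, one_in), ACCURACY)
    print(f"1 in {one_in:>6,}: {float(tp):>7,.0f} true positives, "
          f"{float(fp):>6,.0f} false positives, "
          f"chance of really being sick = {float(chance):.2%}")
```

The first row reproduces the numbers above: 99 true positives, 9,999 false positives, and a 0.98% chance.  Make the disease as common as 1 in 100 and the very same test gives you a coin flip: a positive result means a 50% chance of being sick.  I used exact fractions rather than ordinary decimals so that rounding noise doesn’t muddy the counts.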

This kind of statistical/logical error, which almost all of us make all the time, can have tragic consequences in courts of law (apropos of my family): Donnelly points to a woman in Britain who was convicted of killing her two children, both of whom died of SIDS, after a pediatrician suggested (a) that the likelihood of two children dying of SIDS in the same household was 1 in 73 million; and (b) that this meant that the likelihood of the woman being innocent was 1 in 73 million.  Neither claim was statistically accurate; in fact, both were horribly wrong.  Only after a newspaperman with a little background in statistics challenged the pediatrician’s math did the woman get off (after spending years in prison).

I’m just sayin’ this: that all of my lawyer/judge family members should review their math notes from college.