Wednesday, January 12, 2011

A Cup of Statistics

As this hoodie (available on Cafe Press) suggests, we should always be aware of risk. Relative statistics do little to help us evaluate the risks we take.

Here is the text of the speech I delivered today at Toastmasters. It is a bit of a ramble because I procrastinated and didn’t have time to apply sound instructional design to it. I also tried a physical ending, which confused at least one member of my audience.

The speech ran a little long, so I’ve marked some of the text for removal if I ever deliver it again.

Mark Twain may have said it best. “There are three types of liars: liars, damned liars, and statisticians.” In his 1952 classic, Elementary Statistical Analysis, Harry Hartkemeier says Twain implies that statisticians “have reached the superlative.” There is no better liar than a statistician.

But as much as we distrust statisticians, we all love to bandy numbers to support our points. I’ve heard that as much as 67.8% of all statistics are made up on the spot.

We love numbers, but we believe them only when they support our views. Why is that?

Have you ever deliberately falsified the numbers to support your claims?

Do you know someone who has?

Yes, we’ve all read about sloppy scientists who have [falsified the data]. We may know someone who has. We may even have done so occasionally ourselves, as with my reference to the percentage of impromptu statistics. Actually, less than half of all statistics are made up.

So why don’t we trust statistics?

  • They don’t seem to yield any useful information.
  • They are often contradictory.
  • They’re almost always confusing.
And speaking of statistics made up on the spot… Photo source: Dirt & Seeds

Like many on the right, I blame the liberal media. And like many on the left, I blame Fox News.

The media has become fixated on numbers. Statistics make sloppy reporting sound more credible. And in its drive for an audience, it focuses on one type of statistic above all others: relative statistics.

Now you won’t be able to find a textbook on relative statistics. It isn’t a real branch of the actual science, which comprises:

  • Descriptive statistics, which summarizes data by describing what was observed in a sample
  • Inferential statistics, which uses patterns in the sample data to draw inferences about the larger population the sample came from

Inferential statistics is most often used in scientific research looking for a correlation between two variables.

And that’s where we begin to get into trouble. The first problem is that people unfamiliar with the scientific method confuse correlation and cause, even though the first thing you learn in Intro to Statistics is that correlation does not equate to cause. Correlation does not equate to cause.

Early in the century, two separately reported studies told us that people who drink more than four cups of coffee every day have:

  • A 40% increased risk of colon cancer
  • A 40% decreased risk of heart disease

Sounds like a wash, right?

But increased and decreased risk are relative statistics. Even saying that something increases your risk of dying by 100% doesn’t give you enough information to make an informed decision.

  • If choosing one behavior over another increases your risk from 1/1B to 2/1B, that is a 100% increase—but not much of a risk.
  • If it increases your risk from 1/1000 to 1/500, it’s still a 100% increase. And it poses a much more immediate risk.
  • But even in the second case, choosing one behavior or the other won’t ensure that you live or die.
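The arithmetic behind those two cases can be sketched in a few lines of Python. This is only an illustration: the function name risk_change and the "extra cases per million people" framing are my own assumptions for the example, not part of either coffee study.

```python
from fractions import Fraction


def risk_change(baseline, new):
    """Return (relative increase in percent, absolute increase) for two risks."""
    relative_pct = (new - baseline) / baseline * 100
    absolute = new - baseline
    return relative_pct, absolute


# The two cases from the list above: 1 in a billion -> 2 in a billion,
# and 1 in 1000 -> 1 in 500. The labels are hypothetical.
cases = {
    "tiny baseline": (Fraction(1, 10**9), Fraction(2, 10**9)),
    "larger baseline": (Fraction(1, 1000), Fraction(1, 500)),
}

for name, (baseline, new) in cases.items():
    relative_pct, absolute = risk_change(baseline, new)
    # Express the absolute change as extra cases per million people.
    extra_per_million = float(absolute * 10**6)
    print(f"{name}: +{float(relative_pct):.0f}% relative, "
          f"{extra_per_million:g} extra cases per million people")
```

Both cases come out as a 100% relative increase, but the absolute increase is 0.001 extra cases per million people in the first and 1000 per million in the second. The relative number alone hides that difference.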

Another way relative statistics cause problems is that the media may report only one side of the equation. One report on a study in Archives of Pediatrics & Adolescent Medicine says that women who take antibiotics during pregnancy are significantly more likely to deliver babies with birth defects than those who don’t.

  • Mothers of children with a fatal skull and brain malformation were three times more likely to have taken a sulfa drug than women whose children did not have the defect.
  • Cleft lips or palates were twice as likely in kids whose mothers had taken nitrofurantoins, which were also linked to congenital heart defects, eye defects, and being born missing one or both eyes.
  • Penicillin was associated with a higher risk of a kind of limb malformation.

The report fails to mention what these women were being treated for. We have no information about whether the conditions the women had could have similarly damaged or even killed the fetuses. We do know that if the disease had killed the mother, the baby would not have had a birth defect.

The media loves to report relative risk because phrases like “100% increased risk” or “three times more likely” or “half as likely” get your heart going. They scare you. They make you want to pay attention to what the reporter says.

And about coffee: it turns out coffee may reduce the risk of colon cancer by speeding the passage of carcinogens through your gut. WebMD says the decreased risk of heart disease ranges from 19% to 91%. No descriptive statistics there.

But it does explain the mechanism of the reduction: antioxidants and anti-inflammatory compounds in the brew.

So the next time you hear about relative risk, know that the information is probably reliable but useless. Hearing the numbers should start your research, not make up your mind. It’s up to you to find out what the numbers really mean.

Oh, and now they say coffee also reduces your risk of Type 2 Diabetes. Who cares if the statistics are only relative? I’m going for another cup!

Not one of my best efforts. Maybe I’d give it a B.