7 ways to separate errors from statistics

Statistical Modeling, Causal Inference, and Social Science 2013-05-02

Betsey Stevenson and Justin Wolfers have been inspired by the recent Reinhart and Rogoff debacle to list “six ways to separate lies from statistics” in economics research:

1. “Focus on how robust a finding is, meaning that different ways of looking at the evidence point to the same conclusion.”

2. Don’t confuse statistical with practical significance.

3. “Be wary of scholars using high-powered statistical techniques as a bludgeon to silence critics who are not specialists.”

4. “Don’t fall into the trap of thinking about an empirical finding as ‘right’ or ‘wrong.’ At best, data provide an imperfect guide.”

5. “Don’t mistake correlation for causation.”

6. “Always ask ‘so what?’”
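
Point 2 is easy to see with made-up numbers. Here is a minimal sketch in Python; the function names and all of the inputs are hypothetical choices for illustration, not anything from Stevenson and Wolfers. It assumes two equal-sized groups with a known common standard deviation, so a simple z-test applies.

```python
import math


def two_sample_z(mean_diff, sd, n_per_group):
    """Two-sided z-test for a difference in means.

    Assumes equal group sizes and a known, common standard deviation.
    Returns the z statistic and the two-sided p-value.
    """
    se = sd * math.sqrt(2.0 / n_per_group)  # standard error of the difference
    z = mean_diff / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # P(|Z| > |z|) for a standard normal
    return z, p


def standardized_effect(mean_diff, sd):
    """Difference in means expressed in standard-deviation units (Cohen's d)."""
    return mean_diff / sd


# Hypothetical inputs: a difference of 0.01 standard deviations,
# measured on one million observations per group.
z, p = two_sample_z(mean_diff=0.01, sd=1.0, n_per_group=1_000_000)
d = standardized_effect(mean_diff=0.01, sd=1.0)

print(f"z = {z:.2f}, p = {p:.1e}, standardized effect = {d:.2f}")
```

With these inputs the z statistic is about 7, so the difference is overwhelmingly "statistically significant," yet the effect is only one hundredth of a standard deviation. Whether a difference that small matters is a substantive question that the p-value cannot answer.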

I like all these points, especially #4, which I think doesn’t get said enough. As I wrote a few months ago, high-profile social science research aims for proof, not for understanding—and that’s a problem.

My addition to the list

If you compare my title above to that of Stevenson and Wolfers, you’ll find two differences. First, I changed “lies” to “errors.” I have no idea who’s lying, and I’m much more comfortable talking about errors. Second, I think they missed an even better, more general way to find mistakes:

7. Make your data and analysis public.

This is the best approach, because now you can have lots of strangers checking your work for free! This advice is also particularly appropriate for Reinhart and Rogoff because, according to various reports (see here and here), it was years before they made their data available to outsiders. Nearly three years ago (!), Dean Baker wrote a column entitled “It Would Be Helpful if Rogoff and Reinhart Made Their Data Available.”

Perhaps “the risk of forced disclosure” (as Keith O’Rourke puts it) will motivate researchers to be more careful in the future.

Your additions?

I told Wolfers I was going to link to his list and add my own #7. He replied that we’re probably missing #8, 9, and 10. In the comments, feel free to add your favorite ways to separate errors from statistics. Phil already gave some here.