What is the relevance of “bad science” to our understanding of “good science”?
Statistical Modeling, Causal Inference, and Social Science 2021-02-23
We spend some time talking about junk science, or possible junk science. Most recently it was that book about sleep, but we have lots of other examples: himmicanes, air rage, ages ending in 9, Pizzagate, Weggy, the disgraced primatologist, regression discontinuity disasters, beauty and sex ratio, the critical positivity ratio, slaves and serfs, gremlins, and lots more that I don't happen to recall at this moment.
Why do I keep writing about this? Why do we care?
Here’s a quick reminder of several reasons that we care:
1. Some junk science is consequential. For example, Weggy was advising a congressional committee when he was making stuff up about research, and it seems that the Gremlins dude is, as the saying goes, active in the environmental economics movement.
2. The crowd-out, or Gresham, effect. Junk science fills the journals, crowding out careful science. Junk science appears in PNAS and gets promoted by science celebrities and science journalists. The prominent path to success offered by junk science motivates young scientists to pursue work in that direction. Etc. There must be lots of people doing junk science who think they're doing the good stuff, who follow all ethical principles and avoid so-called questionable research practices, yet whose empirical work amounts to nothing more than finding patterns in noise. Remember, honesty and transparency are not enough.
3. There's no sharp dividing line between junk science and careful science, or between junk scientists and careful scientists. Some researchers, such as Kanazawa and Wansink, are purists and seem to do only junk science (which in their case is open-ended theory plus noisy experiments with inconclusive results); other people have mixed careers; and others of us try our best to do careful science but can still fall prey to errors of statistics and data collection. Recall the 50 shades of gray.
In short, we care about junk science because of its own malign consequences; because people are doing junk science without even realizing it, people who think that if they increase N and don't "p-hack," they're doing things right; and because even those of us who are aware of the perils of junk science can still mess up.
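To make that "increase N and don't p-hack" point concrete, here's a minimal simulation, with made-up numbers rather than anything from the studies above: a tiny true effect, a noisy outcome measure, a respectable sample size, and a single honest test. No questionable research practices anywhere, yet the estimates that reach statistical significance are badly exaggerated and occasionally have the wrong sign (the type M and type S errors we've discussed many times on this blog).

```python
# Hypothetical numbers, chosen only to illustrate the point:
# a small true effect measured with lots of noise, N that sounds large,
# and one pre-registered comparison -- no p-hacking at all.
import numpy as np

rng = np.random.default_rng(2021)

true_effect = 0.1   # small real difference between groups (assumed)
noise_sd = 10.0     # noisy outcome measure (assumed)
n_per_group = 3000  # a sample size that sounds respectable
n_sims = 10_000

signif_estimates = []
for _ in range(n_sims):
    treat = rng.normal(true_effect, noise_sd, n_per_group)
    control = rng.normal(0.0, noise_sd, n_per_group)
    diff = treat.mean() - control.mean()
    se = np.sqrt(treat.var(ddof=1) / n_per_group + control.var(ddof=1) / n_per_group)
    if abs(diff / se) > 1.96:          # the single, honest test
        signif_estimates.append(diff)

signif_estimates = np.array(signif_estimates)
print(f"power: {len(signif_estimates) / n_sims:.1%}")
print(f"mean |estimate| when significant: {np.abs(signif_estimates).mean():.2f} "
      f"(true effect is {true_effect})")
print(f"significant results with the wrong sign: {(signif_estimates < 0).mean():.1%}")
```

Under these (entirely hypothetical) numbers the test rejects only a few percent of the time, and when it does, the estimated effect is several times larger than the truth. Honesty and transparency are not enough when the measurements are too noisy for the question being asked.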
A TED-talkin' sleep researcher misrepresenting the literature or just plain making things up; a controversial sociologist drawing sexist conclusions from surveys of N=3000 where N=300,000 would be needed; a disgraced primatologist who wouldn't share his data; a celebrity researcher in eating behavior who published purportedly empirical papers corresponding to no possible empirical data; an Excel error that may have influenced national economic policy; an iffy study that claimed to find that North Korea was more democratic than North Carolina; a claim, unsupported by data, that subliminal smiley faces could massively shift attitudes on immigration; various noise-shuffling statistical methods that just won't go away—all of these, and more, represent different extremes of junk science.
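On the "N=3000 where N=300,000 would be needed" point: that's just a power calculation. Here's a rough sketch, assuming for illustration (these numbers are mine, not from the survey in question) that we want 80% power to detect a half-percentage-point difference in a proportion near 50%, which is the order of magnitude of the sex-ratio differences that have come up in this context.

```python
# Back-of-the-envelope sample size for a two-sided 5% test with 80% power,
# comparing two proportions near 0.5 that differ by half a percentage point.
# All inputs are assumptions for illustration.
import math

p = 0.5            # baseline proportion (assumed)
delta = 0.005      # difference to detect: 0.5 percentage points (assumed)
z_alpha = 1.96     # two-sided 5% test
z_beta = 0.84      # 80% power

n_per_group = 2 * (z_alpha + z_beta) ** 2 * p * (1 - p) / delta ** 2
print(f"needed per group: {n_per_group:,.0f}, total: {2 * n_per_group:,.0f}")
```

The answer comes out to roughly 150,000 per group, about 300,000 people in all. With a few thousand respondents, the power to detect an effect of that size is barely above the 5% false-positive rate, so any "significant" finding is essentially guaranteed to be a pattern in noise.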
None of us do all these things, and many of us try to do none of these things—but I think that most of us do some of these things much of the time. We’re sloppy with the literature, making claims that support our stories without checking carefully; we draw premature conclusions from so-called statistically significant patterns in noisy data; we keep sloppy workflows and can’t reconstruct our analyses; we process data without being clear on what’s being measured; we draw conclusions from elaborate models that we don’t fully understand.
The lesson to take away from extreme cases of scientific and scholarly malpractice is not, "Hey, these dudes are horrible. Me and my friends aren't like that!", but rather, "Hey, these are extreme versions of things that me and my friends might do. So let's look more carefully at our own practices!"
P.S. Above cat picture courtesy of Zad Chow.