How junk science persists in academia, news media, and social media: Resistance to the resistance

Statistical Modeling, Causal Inference, and Social Science 2022-04-22

In comments on our recent post, The so-called “lucky golf ball”: The Association for Psychological Science promotes junk science while ignoring the careful, serious work of replication, Jim asked why so many of these ridiculous and unreplicated results kept coming up in the field of psychology. I shared my hypothesis: one reason that psychology has all these crappy claims that stay around even after failed replication is that this sort of research does not have active opposition.

The crappy research in question was promoted by the APS recently, but it had been published in the society’s journal in 2010, back when junk science was the norm.

But, even back then, there were people within the field calling out these problems. Two prominent critics were Uri Simonsohn and Greg Francis, psychology researchers who disagreed on the details but, from my perspective, were making similar points.

The good news is that Greg Francis showed up in a recent comment thread with this story, commenting on my remark that a lot of bad research in psychology stayed afloat because it did not have active opposition. Here’s Francis:

I would say there is active opposition to the (admittedly small) opposition. In 2012, I submitted a commentary to Psychological Science on the Damisch et al. (2010) paper when I realized the results seemed “too good to be true.” The editor rejected the commentary based on feedback from “a very distinguished professor of experimental design and statistical methods” who wrote (among other things), “I would not be at all surprised if there is publication bias involved. If I had run a study on superstition and the results were null, I would not likely submit it for publication.”

On the face of it, this is kind of amazing: flat-out admitting the problem but not wanting to do anything about it! As Francis says, it’s opposition to opposition.

More generally, there are people in academia who take an anti-anti-junk science stand. They’re not exactly in favor of junk science—if you pressed them on it they would accept that open data is better than not, that non-replication tells us something, that accurate measurement is a good idea, etc.—but what really bugs them is when people are anti-junk science.

We’ve seen this over the years, with prominent academics dissing research critics as being “second-string,” “terrorists,” “Stasi,” etc etc etc.

Regarding lack of opposition, commenter Jack wrote:

Keep in mind that most of [Brian “Pizzagate”] Wansink’s papers are in obscure, minor journals that are rarely read and cited. He gets by on quantity, volume. Also . . . Wansink’s papers are usually on “harmless” topics, uncontroversial and very specific. It would be different if the papers were on “big topics”, topics that everyone has a strong opinion about, and on which many people work using the same data or similar data. For example if you use CRSP financial securities data and make a dubious claim about what explains expected returns, you can be sure 1000 researchers will call you out on it.

It’s like the difference between fishing in a small pond no one but you knows about, vs fishing in the Grand Banks in the Atlantic.

To which I replied:

Ahhhh, but here’s the paradox: In a scientific context, Wansink’s work is obscure. Yet in the news media, he was not obscure at all, being featured in the New York Times, the New Yorker, NPR, etc., as well as Marginal Revolution, Freakonomics, etc. And he was respected enough to have received millions of dollars in government funding and to have been appointed to a government post. And he was a superstar at Cornell, a well-regarded university. His work had real policy impact. So not such a small pond at all.

So, lots of publicity and influence but not much opposition. Lots of people were happy to promote Wansink’s junk science within the academic fields of psychology and business management and within the news media, but there was not much resistance. And similarly with other purveyors of junk science.

P.S. Greg Francis in his comment adds:

I had not noticed at the time, but I later realized that the means reported in Experiment 4 of the Damisch et al. paper fail a GRIM test. The measure of performance is the number of correctly identified words, so the sum of scores across participants in each condition must be an integer value. Damisch et al. do not report their sample sizes for each of the two conditions, just the total sample size (n1+n2=29). The reported mean for participants with the lucky charm (M1=45.84) and the mean for participants without their lucky charm (M2=30.56) cannot simultaneously be produced by any combination of n1 and n2 sample sizes that together add up to a total sample size of 29. For example, if n1=14, then the sum of scores would be n1*M1=641.76, which presumably rounds up to 642 (it has to be an integer because it is a count of correctly identified words). But 642/14=45.857, which does not match the reported 45.84. Rounding down does not help either, because 641/14=45.786, which does not match the reported mean. There is no way to get M1=45.84 from n1=14 participants. For other combinations of n1 and n2, you can get one of the means to make sense, but never both simultaneously.
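
If you want to make that arithmetic concrete, here’s a minimal Python sketch of the kind of check Francis is describing: for every possible split of the 29 participants into the two conditions, ask whether any integer total of word counts could round to each reported mean. The function name and the rounding tolerance are my own choices, not anything from the paper or from Francis; the inputs (N=29, M1=45.84, M2=30.56) are just the values quoted above.

```python
# A sketch of a GRIM-style consistency check on the reported means,
# assuming total N = 29, M1 = 45.84 (lucky charm), M2 = 30.56 (no charm),
# and scores that are integer counts of correctly identified words.
import math

def mean_is_possible(reported_mean, n, decimals=2):
    """True if some integer total over n participants could round to reported_mean."""
    # A mean reported to `decimals` places could come from any true value within
    # half a unit of the last reported digit; the extra slack covers float error
    # and either rounding convention at the boundary.
    tol = 0.5 * 10 ** (-decimals) + 1e-9
    target = n * reported_mean
    for total in range(math.floor(target) - 1, math.ceil(target) + 2):
        if abs(total / n - reported_mean) <= tol:
            return True
    return False

total_n = 29
m_lucky, m_no_lucky = 45.84, 30.56

consistent_splits = [
    (n1, total_n - n1)
    for n1 in range(1, total_n)
    if mean_is_possible(m_lucky, n1) and mean_is_possible(m_no_lucky, total_n - n1)
]
print(consistent_splits)  # prints []: no split of the 29 participants fits both means
```

Running this, the list of consistent splits comes out empty, which is just Francis’s point: M1=45.84 is achievable only for a couple of values of n1, and for those values the corresponding n2 cannot produce M2=30.56.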

I’m reminded of Clarke’s Law: Any sufficiently crappy research is indistinguishable from fraud. I don’t know if the numbers in the article in question were made up, or rounded and unrounded too many times, or mistyped, or maybe Francis messed up in his calculations—I’m guessing the most likely possibility is that the authors messed up in some small way in their analysis, including certain data in some comparisons but not others—but it really doesn’t matter, except for historical reasons, to help understand how things went so wrong for so long in that field.