The Error-Reversal Heuristic: How would you have reacted had the mistake gone in the opposite direction?

Statistical Modeling, Causal Inference, and Social Science 2025-04-22

Something we’ve seen with depressing regularity is that researchers do something sloppy—perhaps even deceitful or fraudulent, but oftentimes just sloppy—and then when the error is pointed out, they reply that the main conclusions of the study have not changed.

It often looks ridiculous, and when we post on these things we put them in the Zombies category. But, sometimes, sure, it must be the case: someone claims some big result, an error is found, but it still seems plausible that they got the direction right, so maybe the magnitude doesn’t matter?

How to think about this?

My suggestion: try the Error-Reversal Heuristic. Imagine how the promoter of the idea would’ve reacted had the mistake gone in the opposite direction.

Here are some examples.

1. Published paper from an organization called Toxic-Free Future claims that a toxin is at 80% of the legal limit. They screwed up their calculation—it’s actually only 8%—and here’s their response: “it is important to note that this does not impact our results . . . and our recommendations remain the same.”

The Error-Reversal Heuristic: Suppose someone else had done a study and found that the level of exposure was “8% of the reference dose, thus, a potential concern,” but they’d done the calculation wrong, and the level was really 80% of the reference dose. Then I assume that the folks at Toxic-Free Future wouldn’t say that the recommendations remain the same, right? They’d say the exposure had been underestimated by a factor of 10 and that’s a big deal!

2. A paper published in Lancet (uh oh) claimed that hydroxychloroquine/chloroquine was killing people. It turns out the work was fraudulent, which perhaps should not surprise us, given the strong criticism by James “not the racist dude” Watson, who wrote at the time, “The big finding is that when controlling for age, sex, race, co-morbidities and disease severity, the mortality is double in the HCQ/CQ groups (16-24% versus 9% in controls). This is a huge effect size! Not many drugs are that good at killing people. . . . The most obvious confounder is disease severity . . . The authors say that they adjust for disease severity but actually they use just two binary variables: oxygen saturation and qSOFA score. The second one has actually been reported to be quite bad for stratifying disease severity in COVID. The biggest problem is that they include patients who received HCQ/CQ treatment up to 48 hours post admission. . . . This temporal aspect cannot be picked up by a single severity measurement. In short, seeing such huge effects really suggests that some very big confounders have not been properly adjusted for. . . .”

Five days after the problems with this paper came out, a press officer for Lancet wrote that “The results and conclusions reported in the study remain unchanged.”

Ummm . . . time for the Error-Reversal Heuristic: Suppose the results had originally been reported as kinda small, then it turned out a mistake had been made, and the actual effect of the drug was to double the mortality rate. How would the promoters have reacted? I’m pretty sure they’d say that such an effect is a big deal!

3. A published paper, “Attractive Names Sustain Increased Vegetable Intake in Schools” (guess who’s the author? Hint: “Pizzagate”) made big claims. It turned out that the data in the paper were incoherent, and a correction was written that was longer than the original paper. According to Retraction Watch: “Some of the changes include explaining the children studied were preschoolers (3-5 years old), not preteens (8-11), as originally claimed.” The author’s response to all of this? You got it: “These mistakes and omissions do not change the general conclusion of the paper.”

Time for the Error-Reversal Heuristic! What if things had gone the other way? Someone published a null result on the effects of attractive names on vegetable intake in schools, but it turned out that the data had been entirely garbled, and in fact the study was on preschoolers, not preteens. Would Mister Cornell Food Researcher then reply that the general conclusions did not change? Hell no! He would’ve said that any claims of a null finding were invalidated by the sloppiness of the study.

4. A notorious member of the National Academy of Sciences published a paper with t-statistics reported as 5.03 and 11.14. But those were in error! The actual t-statistics were 1.8 and 3.3. How did the author reply? You’ll never guess: this “does not change the conclusion of the paper.” As I wrote at the time:

This is both ridiculous and all too true. It’s ridiculous because one of the key claims is entirely based on a statistically significant p-value that is no longer there. But the claim is true because the real “conclusion of the paper” doesn’t depend on any of its details—all that matters is that there’s something, somewhere, that has p less than .05, because that’s enough to make publishable, promotable claims about “the pervasiveness and persistence of the elderly stereotype” or whatever else they want to publish that day.

When the authors protest that none of the errors really matter, it makes you realize that, in these projects, the data hardly matter at all.

But . . . let’s try the Error-Reversal Heuristic. Suppose the published t-statistics had been 1.8 and 3.3, but that had been an error, and they really were 5.03 and 11.14. How would the author have responded then? Probably something about how strong the evidence is, right?
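To see why the corrected numbers matter, here’s a minimal sketch of the two-sided p-values implied by each t-statistic. This is not a calculation from the paper in question; the degrees of freedom below are a hypothetical placeholder, since the paper’s sample size isn’t given here.

```python
# Minimal sketch: two-sided p-values implied by the reported t-statistics.
# df = 100 is a hypothetical assumption, not a number from the paper.
from scipy import stats

df = 100  # hypothetical degrees of freedom

for t in (5.03, 11.14, 1.8, 3.3):
    # two-sided p-value for a t-statistic with df degrees of freedom
    p = 2 * stats.t.sf(abs(t), df)
    print(f"t = {t:5.2f}  ->  two-sided p = {p:.4g}")
```

The exact p-values shift a bit with the degrees of freedom, but the qualitative point doesn’t: a t-statistic of 1.8 does not clear the conventional .05 threshold, while 5.03 clears it by a mile. That’s the sense in which the “statistically significant p-value” behind one of the key claims is no longer there.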

5. We’ve also seen papers where the result goes in the opposite direction of the pre-registration. Had it gone in the same direction as the pre-registration, it would have been hailed as a success, so when it goes in the opposite direction . . . maybe not so much of a success? There was the notorious case of the paper about ovulation and clothing with a finding that failed to replicate in a new study by the same authors. They refused to let go of the original, fatally flawed claim and instead argued that they’d discovered an interaction. And then there was the “gremlins” article that approached the Platonic ideal of having more errors than data points. The only thing that remained constant amid all the wreckage was . . . the conclusion.

6. And, most consequentially, there was the notorious “Excel error” paper, where fatal flaws were discovered and the authors dismissed this as an “academic kerfuffle,” which isn’t quite “the conclusions are unchanged,” but close enough. Again, imagine if someone had published a null result and then, once the data had been fixed, a big estimate in their preferred direction had shown up. I think they would’ve said this was a big deal.

I’m happy to retell the above stories as often as needed (recall Paul Alper’s horse); my point in this post is to give examples of the Error-Reversal Heuristic.

P.S. Sometimes people do it right. Here’s an example where fatal flaws were found in a published paper, and the authors concluded, “A reanalysis of the data leads to revised findings that do not replicate the results in the original paper.” So it is possible.