I fear that many people are drawing the wrong lessons from the Wansink saga, focusing on procedural issues such as “p-hacking” rather than scientifically more important concerns about empty theory and hopelessly noisy data. If your theory is weak and your data are noisy, all the preregistration in the world won’t save you.
Statistical Modeling, Causal Inference, and Social Science 2018-03-13
Someone pointed me to this news article by Tim Schwab, “Brian Wansink: Data Masseur, Media Villain, Emblem of a Thornier Problem.” Schwab writes:
If you look into the archives of your favorite journalism outlet, there’s a good chance you’ll find stories about Cornell’s “Food Psychology and Consumer Behavior” lab, led by marketing researcher Brian Wansink. For years, his click-bait findings on food consumption have received sensational media attention . . .
In the last year, however, Wansink has gone from media darling to media villain. Some of the same news outlets that, for years, uncritically reported his research findings are now breathlessly reporting on Wansink’s sensational scientific misdeeds. . . .
So far, that’s an accurate description.
Wansink’s work was taken at face value by major media. Concerns about Brian Wansink’s claims and research methods had been raised for years, but these concerns had been drowned out by the positive publicity—much of it coming directly from Wansink’s lab, which had its own publicity machine.
Then, a couple years ago, word got out that Wansink’s research wasn’t what it had been claimed to be. It started with some close looks at Wansink’s papers, which revealed lots of examples of iffy data manipulation: you couldn’t really believe what was written in the published papers, and it was not clear what had actually been done in the research. The story continued when outsiders Tim van der Zee, Jordan Anaya, and Nicholas Brown found over 150 errors in four of Wansink’s published papers, and Wansink followed up by acting as if there were no problem at all. After that, people found lots more inconsistencies in lots more of Wansink’s papers.
This all happened as of spring, 2017.
News moves slowly.
It took almost another year for all these problems to hit the news, via some investigative reporting by Stephanie Lee of Buzzfeed.
The investigative reporting was excellent, but really it shouldn’t’ve been needed. Errors had been found in dozens of Wansink’s papers, and he and his lab had demonstrated a consistent pattern of bobbing and weaving, not facing these problems but trying to drown them in happy talk.
So, again, Schwab’s summary above is accurate: Wansink was a big shot, loved by the news media, and then they finally caught on to what was happening, and he indeed “has gone from media darling to media villain.”
But then Schwab goes off the rails. It starts with a misunderstanding of what went wrong with Wansink’s research.
Here’s Schwab:
His misdeeds include self-plagiarism — publishing papers that contain passages he previously published — and very sloppy data reporting. His chief misdeed, however, concerns his apparent mining and massaging of data — essentially squeezing his studies until they showed results that were “statistically significant,” the almighty threshold for publication of scientific research.
No. As I wrote a couple weeks ago, I fear that many people are drawing the wrong lessons from the Wansink saga, focusing on procedural issues such as “p-hacking” rather than scientifically more important concerns about empty theory and hopelessly noisy data. If your theory is weak and your data are noisy, all the preregistration in the world won’t save you.
To speak of “apparent mining and massaging of data” is to understate the problem and to miss the point. Remember those 150 errors in those four papers, and how that was just the tip of the iceberg? The problem is not that data were “mined” or “massaged,” the problem is that the published articles are full of statements that are simply not true. In several of the cases, it’s not clear where the data are, or what the data ever were. There’s the study of elementary school children who were really preschoolers, the pizza data that don’t add up, the carrot data that don’t add up, the impossible age distribution of World War II veterans, the impossible distribution of comfort ratings, the suspicious distribution of last digits (see here for several of these examples).
Schwab continues:
And yet, not all scientists are sure his misdeeds are so unique. Some degree of data massaging is thought to be highly prevalent in science, and understandably so; it has long been tacitly encouraged by research institutions and academic journals.
No. Research institutions and academic journals do not, tacitly or otherwise, encourage people to report data that never happened. What is true is that research institutions and academic journals rarely check to see if data are reasonable or consistent. That’s why it is so helpful that van der Zee, Anaya, and Brown were able to run thousands of published papers through a computer program to check for certain obvious data errors, of which Wansink’s papers had many.
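To make concrete what that kind of automated screening looks for, here is a minimal sketch of a GRIM-style granularity check. This is an illustrative example, not the actual program those researchers used: it simply asks whether a mean, reported to a fixed number of decimal places, is arithmetically possible given the sample size when the underlying responses are integers.

```python
# Illustrative sketch only: a GRIM-style granularity check, not the actual
# tool used by van der Zee, Anaya, and Brown. It asks whether a reported
# mean is arithmetically possible given the sample size, assuming the
# underlying responses are integers.
import math

def grim_consistent(reported_mean: float, n: int, decimals: int = 2) -> bool:
    """True if some total of n integer scores rounds to reported_mean."""
    target = round(reported_mean, decimals)
    half_unit = 0.5 * 10 ** (-decimals)
    # Any consistent integer total must lie in the rounding window around n * mean.
    lo = math.floor((reported_mean - half_unit) * n)
    hi = math.ceil((reported_mean + half_unit) * n)
    return any(round(total / n, decimals) == target for total in range(lo, hi + 1))

if __name__ == "__main__":
    # With n = 17 integer responses, a reported mean of 3.44 is impossible:
    # 58/17 rounds to 3.41 and 59/17 rounds to 3.47, so nothing in between can occur.
    print(grim_consistent(3.44, 17))  # False -> flag the paper for closer scrutiny
    print(grim_consistent(3.47, 17))  # True  -> arithmetically consistent
```

Real screening tools handle multi-item scales, rounding conventions, and reported test statistics far more carefully; the point here is only that this kind of inconsistency is mechanically checkable at scale, which is what made the mass screening of published papers possible.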
Schwab writes:
I wonder if we’d all be a little less scandalized by Wansink’s story if we always approached science as something other than sacrosanct, if we subjected science to scrutiny at all times, not simply when prevailing opinion makes it fashionable.
That’s a good point. I think Schwab is going too easy on Wansink—I really do think it’s scandalous when a prominent researcher publishes dozens of papers that are purportedly empirical but are consistent with no possible data. But I agree with him that we should be subjecting science to scrutiny at all times.
P.S. In his article Schwab also mentions power-pose researcher Amy Cuddy. I won’t get into this except to say that I think he should also mention Dana Carney—she’s the person who actually led the power-pose study and she’s also the person who bravely subjected her own work to criticism—and Eva Ranehill, Magnus Johannesson, Susanne Leiberg, Sunhae Sul, Roberto Weber, and Anna Dreber, who did the careful replication study that led to the current skeptical view of the original power pose claims. I think that one of the big problems with science journalism is that researchers who make splashy claims get tons of publicity, while researchers who are more careful don’t get mentioned.
I think Schwab’s right that the whole Wansink story is unfortunate: First he got too much positive publicity, now he’s getting too much negative publicity. The negative publicity is deserved—at almost any time during the past several years, Wansink could’ve defused much of this story by simply sharing his data and being open about his research methods, but instead he repeatedly attempted to paper over the cracks—but it personalizes the story of scientific misconduct in a way that can be a distraction from larger issues of scientists being sloppy at best and dishonest at worst with their data.
I don’t know the solution here. On one hand, here Schwab and I are as part of the problem—we’re both using the Wansink story to say that Wansink is a distraction from the larger issues. On the other hand, if we don’t write about Wansink, we’re ceding the ground to him, and people like him, who unscrupulously seek and obtain publicity for what is, ultimately, pseudoscience. It would’ve been better if some quiet criticisms had been enough to get Brian Wansink and his employers to clean up their act, but it didn’t work that way. Schwab questions Stephanie Lee’s journalistic efforts that led to smoking-gun-style emails—but it seems like that’s what it took to get the larger world to listen.
Let’s follow Schwab’s goal of “subjecting science to scrutiny at all times”—and let’s celebrate the work of van der Zee, Anaya, Brown, and others who apply that scrutiny. And if it turns out that a professor at a prestigious university, who’s received millions of dollars from government and industry, has been getting massive publicity for purportedly empirical results that are not consistent with any possible data, then, yes, that’s worth reporting.