Fake data on the honeybee waggle dance, followed by the inevitable “It is important to note that the conclusions of our studies remain firm and sound.”
Statistical Modeling, Causal Inference, and Social Science 2024-11-07
I hadn’t thought about bee dancing for a long time when someone pointed me to this post by Laura Luebbert and Lior Pachter on a bit of data fraud in biology. Luebbert writes:
Four years ago, during the first year of my PhD . . . I was assigned two classic papers on the honeybee waggle dance: “Visually Mediated Odometry in Honeybees” (Srinivasan et al., JEB 1997) and “Honeybee Navigation: Nature and Calibration of the ‘Odometer’” (Srinivasan et al., Science 2000). Since I was not familiar with honeybee behavior, I decided to expand my literature review to other papers on the topic, including “Honeybee Navigation En Route to the Goal: Visual Flight Control and Odometry” (Srinivasan et al., JEB 1996) and “How honeybees make grazing landings on flat surfaces” (Srinivasan et al., Biological Cybernetics 2000). While reading these papers, I sensed something strange; I had the feeling that I was looking at the same data over and over again.
It turns out that she was seeing the same data over and over again, a situation that reminded me of the story of economist Bruno Frey, who published something close to the exact same paper in five different journals (motivating our update of Arrow’s Theorem).
This bee-dance thing was worse, though, because the identical data were claimed to be coming from different experiments!
Luebbert continues:
I was deeply concerned by these findings and presented them at the journal club meeting using animations and overlays to show that the data was indeed identical . . . I had imagined that the response to my presentation would be concern and advice on how to report my findings. Instead, both within and outside of Caltech, the response amounted to little more than a collective shrug.
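The post doesn’t spell out how those overlays were made, but the underlying check is simple enough to sketch. Here’s a minimal Python illustration (my own construction, not Luebbert and Pachter’s actual workflow; the toy data and the tolerance are assumptions): digitize the plotted points from each figure, then count how many points from one figure reappear in the other to within digitization error.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for (x, y) points digitized from two published figures;
# in practice these would come from a plot-digitizing tool.
paper1 = rng.uniform(0, 10, size=(40, 2))
paper2 = paper1 + rng.normal(0, 0.005, size=(40, 2))  # suspiciously close "new" data

def fraction_duplicated(a, b, tol=0.05):
    """Fraction of points in a that have a match in b within tolerance tol
    (tol is in the units of the digitized axes, covering digitization error)."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)  # (n, m) pairwise distances
    return np.mean(d.min(axis=1) < tol)

print(f"{100 * fraction_duplicated(paper1, paper2):.0f}% of paper 1's points reappear in paper 2")
```

Genuinely independent data should give a number near zero; figures recycled from the same experiment give numbers near 100 percent.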
That collective shrug upsets me but does not surprise me. After all, my own institution, Columbia University, which houses so much wonderful teaching and research, also never did anything about its cheating on the U.S. News ranking or its professor of surgery whose papers were flagged for suspect data. So, yeah, I guess it’s no surprise that the bee-dancing subfield of biology is no better than my employer in this respect.
Luebbert and Pachter put the details in this arXiv paper, “The miscalibration of the honeybee odometer.” Amusingly—or, I should say, horrifyingly—enough, their article was rejected by a different preprint server, bioRxiv, which told them that they should “reformat it as a research paper presenting new results along with appropriate methods used, rather than simply a critique of existing literature.” Also, Pachter writes: “we were told that our manuscript contained ‘content with ad hominem attacks,’ even though it was merely a factual report of the issues we observed with appropriate citations of the affected papers, with no attack on any people or specific persons.”
The use of the term “ad hominem” in scientific discussions
“Ad hominem,” like “disingenuous,” is an expression that people use when they have nothing to say. It’s an attack in the guise of a defense, a pseudo-sophisticated phrase that, in effect, is roughly equivalent to: “What you say is true, and I don’t want to face up to it, so I’m trying to sidetrack this open scientific discussion by making it personal.”
I looked at Luebbert and Pachter’s paper, and it indeed contains nothing even close to an ad hominem attack (from Merriam-Webster: “appealing to feelings or prejudices rather than intellect” or “marked by or being an attack on an opponent’s character rather than by an answer to the contentions made”). This is really disgraceful on the part of bioRxiv. I’d be interested in seeing the complete text of the message they sent to Luebbert and Pachter.
So, yeah, the usual story. Bad science, nobody cares. Indeed, it’s worse than not caring: it’s anti-caring, in the form of active efforts to suppress criticism. When there’s a problem, the reaction is to shoot the messenger.
It gets worse
Mandyam Srinivasan, the bee-dance researcher who had the duplicate data in his papers, made the mistake of responding (here and here) to Luebbert and Pachter’s post, and the response is a doozy.
Srinivasan characterizes the data issues as “typographical errors and minor oversights,” which is ridiculous if you actually look at the specifics of all the problems in those papers.
He also writes:
It is important to note that the conclusions of our studies remain firm and sound, and have been replicated independently in many subsequent studies from other reputable laboratories.
It’s funny how the conclusions never seem to be affected by revelations of major data problems. It kinda makes you wonder why these researchers bother gathering and analyzing data at all, given that their errors never affect their conclusions. This is consistent with the apparent attitude of some scientists that they already know the truth, with all this experimentation, analysis, and writing-up of results being a tiresome bit of paperwork between the theory and the professional acclaim.
The most annoying thing
I was particularly annoyed by this remark from Srinivasan:
I am surprised (and disappointed) with the unprofessional manner in which the authors of the arXiv document (Luebbert and Pachter) have conducted their commentary of some work in my laboratory. The authors never contacted me personally with their queries, nor did they seek clarification.
Grrrrr. Srinivasan published some papers. When you publish a paper, it is public. It’s out there for anyone to read, and for anyone to criticize. Correctness of a published paper is the author’s responsibility. There’s nothing wrong with contacting authors with your queries if you find problems with their paper, but there’s no requirement to do so. The same goes if you want to use a paper’s methods: you can ask the authors for clarification, but you don’t have to. If you only want your paper read by people who’ve contacted you personally, don’t publish it. And if you don’t want anyone to criticize it, don’t publish it.
Personally, I like when people find mistakes in my published work. I make mistakes all the time (see here and here)! If the people who find mistakes in my work, or who think they find mistakes, contact me personally, that’s great, but the most important thing is that the mistakes (or possibly the points of confusion) get out there. And then I can go back and issue corrections to my errors (for example here, here, here, and here), improve my work (for example here), and clarify points of confusion (for example here).
Hey, Fermat!
As part of his reply to Srinivasan’s comments, Pachter wrote:
Finally if there are many reputable laboratories that have independently replicated the results in the 10 papers we flagged, it would be great if you could post links to all their papers and relevant figures in a reply to this comment.
To which Srinivasan responded:
A full, detailed, point-by-point response to the Luebbert and Pachter document will be available soon.
In Srinivasan’s last reply, he did not seem to be able to find space in the comment window for any references to the claimed many subsequent independent replications from other [sic] reputable laboratories.
Let’s look at the replications.
I wrote the above post in July and scheduled it for November. I guess that Srinivasan’s full, detailed, point-by-point response should be available by then, so we can see what he said.
Actually, I’m guessing that biologists would be less interested in the point-by-point response and more interested in the claim that the conclusions “have been replicated independently in many subsequent studies from other reputable laboratories.” Talk is cheap; replications are real. And it’s my impression that in biology, results of interest really do get replicated.
It could well be that Srinivasan’s findings are real, and really have been replicated. Just cos there’s research misconduct, it doesn’t mean you’re not doing real science. Indeed, this is kinda the flip side of the “honesty and transparency are not enough” principle. Scientific research can be honest, transparent, and wrong (this happens when you study weak theory with noisy measurements); I guess it can also be error-ridden in its details while still being on the right track.
So I’ll be interested to see what Luebbert, Pachter, and researchers in the bee-dance subfield think after looking at the references that Srinivasan supplies to the many replications from reputable laboratories.