The data are on a 1-5 scale, the mean is 4.61, and the standard deviation is 1.64 . . . What’s so wrong about that??

Statistical Modeling, Causal Inference, and Social Science 2024-04-19

James Heathers reports on the article, “Contagion or restitution? When bad apples can motivate ethical behavior,” by Gino, Gu, and Zhong (2009):

There is some sentiment data reported in Experiment 3, which seems to be reported in whole units.

They also indicated how guilty they would feel about the behavior of the person who took all the money along with some unrelated emotional measures (1 = not at all, 5 = very much)… participants in the in-group selfish condition felt more guilty (M = 4.61, SD = 1.64) about the person’s selfish behavior than the participants in the out-group selfish condition (M = 3.26, SD = 1.54), t(80) = 3.82, p < .001.

If you have a 1 to 5 scale, it isn’t possible to have M = 4.61, SD = 1.64.

Huh? Really? Yeah!

Let’s work it out. If your measurements are on a 1-5 scale, the way to maximize their standard deviation for any given mean is to put the data all at 1 and 5. If the mean is 4.61, that would imply that (4.61 – 1)/(5 – 1) = 0.9025 of the data take on the value 5, and 1 – 0.9025 = 0.0975 take on the value 1. (Just to check, 0.0975*1 + 0.9025*5 = 4.61.)

For this extreme dataset, the standard deviation is sqrt(0.0975*(1 – 4.61)^2 + 0.9025*(5 – 4.61)^2) = 1.19. So, yeah, there’s no way to get a standard deviation of 1.64 from these data. Just not possible!
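The endpoint calculation above generalizes to any bounded scale. Here's a small sketch in R (the function name max_sd is just made up for illustration) that computes the maximum possible standard deviation for a given mean, by putting all the data at the two endpoints:

# Maximum possible SD for data on [lo, hi] with mean m:
# all the mass goes to the endpoints.
max_sd <- function(m, lo = 1, hi = 5) {
  p <- (m - lo) / (hi - lo)   # fraction of data at hi
  sqrt((1 - p) * (lo - m)^2 + p * (hi - m)^2)
}
max_sd(4.61)   # about 1.19, far below the reported 1.64

(This uses the population SD, dividing by n, matching the calculation above; more on the n-1 version below.)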

Just to make sure, we can check our calculation via simulation:

n <- 1e6
# all mass at the endpoints 1 and 5, with the fractions derived above
y <- sample(c(1,5), n, replace=TRUE, prob=c(0.0975, 0.9025))
print(c(mean(y), sd(y)))

Here's what we get:

[1] 4.610172 1.186317

Check.

OK, let's try one more thing. Maybe n is so small that there's some kinda 1/sqrt(n-1) thing in the denominator driving the result? I don't think so. The trouble is that, to get a mean of 4.61, you need enough data (in his post, Heathers guesses "n=41 (as 189/41 = 4.6098)") that the difference between 1/sqrt(n) and 1/sqrt(n-1) wouldn't be enough to take you from 1.19 all the way up to 1.64, or even close. Also, it's kinda implausible that all the observations would be 1's and 5's anyway.
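We can check that in one line of R. The sample SD (which R's sd() computes) exceeds the population SD by only a factor of sqrt(n/(n-1)):

# Does Bessel's correction rescue the number? With Heathers's guessed n = 41:
n <- 41
1.19 * sqrt(n / (n - 1))   # about 1.20 -- nowhere near 1.64

So even with the most favorable small-n correction, the reported SD is still impossible.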

So what happened?

It's always easier to figure out what didn't happen than to figure out what did happen.

Here are some speculations.

One possibility is a typo, but Heathers doubts that, because other calculations in the paper are consistent with the above-reported impossible numbers.

A related possibility is that this was a typo that was then propagated into the rest of the paper. For example, suppose the mean was actually 3.61, it was typed into the paper as 4.61, and then this typed-in number was used in later calculations. This would be bad workflow---you want all the computations to be done in a single script---but people use bad workflows all the time. I use bad workflow myself sometimes and end up with wrong numbers or wrongly labeled graphs.

Another possibility is that the mean and standard deviation were calculated from two different datasets. That might sound kind of weird, but it happens all the time, due to sloppiness or goofs in data processing. For example, you read in the data and calculate the mean and standard deviation for each variable; then you apply some data-exclusion rule, perhaps removing respondents with incomplete answers to some of the questions; then you do further statistical analysis, recalculating the mean and standard deviation along the way---but when you pull together your numbers, you take the mean from one place and the standard deviation from the other.
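Here's a toy sketch of that kind of goof in R, with entirely made-up simulated data (the variable names and the exclusion rule are hypothetical, just to show the shape of the mistake):

# Hypothetical workflow goof: summary stats from two different versions of the data
set.seed(1)
y_full <- sample(1:5, 100, replace = TRUE)   # raw responses
y_kept <- y_full[y_full > 1]                 # exclusion rule applied later
c(mean(y_full), sd(y_kept))                  # a mean and an SD that never coexisted

Neither number is "wrong" on its own; they just describe different datasets, which is exactly what makes this kind of error hard to catch by eyeballing.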

Yet another possibility is that someone involved in the data analysis or writeup was cheating in order to get a statistically significant and thus publishable result, for example changing 3.61 to 4.61 to get a big fat difference but not touching the standard deviation. This would be a great way to cheat, because if you get caught, you can just say that you made a typo!

In any case, it's a fun little statistics example. And it's worth checking your data, even if you have no suspicion of cheating. I've often had incoherent data in problems I've worked on. Lots of things can go wrong in data processing and analysis, and we have to check things in all sorts of ways.