Using trends in R-squared to measure progress in criminology??
Statistical Modeling, Causal Inference, and Social Science 2013-06-08
Torbjørn Skardhamar writes:
I am a sociologist/criminologist working at Statistics Norway. As I am not a trained statistician, I sometimes need to check basic statistical concepts. Recently, I came across an article that struck me as a bit strange, but first I needed to check my understanding of a very basic concept: the r-squared. In doing so, I realized that this was also an interesting case of research ethics. Given your interest in research ethics, I thought this might be interesting to you.
The article, by Weisburd and Piquero, is attached. They analyze reported results from all articles published in the highest-ranking criminological journal from 1968 through 2005 to determine whether there has been any progress in the field of criminology. Their approach is basically to calculate the average r-squared from linear models in published articles. For example, they state that “variance explained provides one way to assess the state of the science of criminology and its relevance for public policy, and how that science has changed over time” (page 455, final paragraph). They find that the “explained variance” is generally low, and even declining, so there has not been much progress, and they conclude: “That criminology is not developing models of crime with more explanatory power over time is troubling” (page 491, first sentence).
I needed to look in my old statistics textbooks to find out whether interpreting the r-squared statistic in this way makes much sense. I think it doesn’t, but I’m not entirely sure whether there are circumstances where it might be meaningful after all. Perhaps if the sole purpose is predicting a specific phenomenon? (But that is usually not the purpose at all.)
The research ethics issue is related to the statistical issue. While trying to find out whether I had misunderstood something about r-squared, I came across Gary King’s article (1986) “How Not to Lie with Statistics,” where he also discusses the direct interpretation of r-squared. If I got it right, King argues that r-squared is only meaningful for comparing models fitted to the same data with the same outcome variable. As far as I can understand, then, King’s argument implies that calculating the average r-squared across studies does not make much sense. I also found your blog posts with related arguments: http://andrewgelman.com/2007/08/rsquared_useful/ and http://andrewgelman.com/2012/10/r-squared-of-1/
Then I realized that Weisburd and Piquero actually cite King’s article (on page 464), but only in a side note saying that one can easily manipulate a model to get a higher r-squared. Since Weisburd and Piquero cite King’s article, we must assume they have read its main arguments too. But it appears that they simply ignore them. The ethical issue, then, arises when authors deliberately ignore highly relevant arguments that might undermine their own publication. In this case, it looks like Weisburd and Piquero put forward an argument that they know (or should know) does not hold. At the least, they should have discussed the counterarguments properly.
By the way, it should be noted that Weisburd and Piquero are much-cited criminologists with some impact on the criminological literature. So they are not making beginners’ mistakes.
My reply: I do think that R-squared can be a useful summary of a fitted model. But I question the premise that progress in criminology research would be characterized by an increase of explained variance. When I think of my own applied research, progress is typically not measured by pure prediction. It’s often important to work on problems where individual outcomes are difficult to predict.
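The comparability point attributed to King in the letter above can be illustrated with a small simulation. This is only a sketch under assumptions of my own choosing, not anything taken from Weisburd and Piquero or from King: two hypothetical "studies" share exactly the same true slope (2.0) and the same residual standard deviation (10.0), and differ only in how much the predictor varies in the sample. The function names, seeds, and parameter values are all made up for illustration.

```python
import random

def fit_line(x, y):
    """Least-squares fit of y on x; returns (slope, r_squared)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    # For a one-predictor regression, R-squared is the squared correlation.
    return sxy / sxx, sxy ** 2 / (sxx * syy)

def simulate_study(x_sd, seed, n=20000, true_slope=2.0, noise_sd=10.0):
    """Hypothetical study: same underlying model, predictor spread set by x_sd."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, x_sd) for _ in range(n)]
    y = [true_slope * a + rng.gauss(0.0, noise_sd) for a in x]
    return x, y

# Study A samples a narrow range of the predictor, study B a wide range.
slope_narrow, r2_narrow = fit_line(*simulate_study(x_sd=1.0, seed=1))
slope_wide, r2_wide = fit_line(*simulate_study(x_sd=10.0, seed=2))

print(f"narrow x: slope={slope_narrow:.2f}, R^2={r2_narrow:.3f}")
print(f"wide x:   slope={slope_wide:.2f}, R^2={r2_wide:.3f}")
```

Under these assumed settings the population R-squared is 4/104 (about 0.04) in the narrow study and 400/500 (0.8) in the wide one, even though both studies recover essentially the same slope. The model is no "better" in the second case; only the spread of the predictor differs. That is one reason an average of R-squared values taken across different datasets and outcomes is hard to interpret as a measure of progress.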