THE FIVE: Jeff Leek’s Challenge

Normal Deviate 2013-07-25

Jeff Leek, over at Simply Statistics, asks an interesting question: What are the 5 most influential statistics papers of 2000-2010?

I found this to be incredibly difficult to answer. Eventually, I came up with this list:

Donoho, David (2006). Compressed sensing. IEEE Transactions on Information Theory. 52, 1289-1306.

Greenshtein, Eitan and Ritov, Ya’Acov. (2004). Persistence in high-dimensional linear predictor selection and the virtue of overparametrization. Bernoulli, 10, 971-988.

Meinshausen, Nicolai and Buhlmann, Peter. (2006). High-dimensional graphs and variable selection with the lasso. The Annals of Statistics, 34, 1436-1462.

Efron, Bradley and Hastie, Trevor and Johnstone, Iain and Tibshirani, Robert. (2004). Least angle regression. The Annals of Statistics, 32, 407-499.

Hofmann, Thomas and Scholkopf, Bernhard and Smola, Alexander J. (2008). Kernel methods in machine learning. The Annals of Statistics, 36, 1171-1220.

These are all very good papers. These papers had a big impact on me. More precisely, they are representative of ideas that had an impact on me. It’s more like there are clusters of papers and these are prototypes from those clusters. I am not really happy with my list. I feel like I must be forgetting some really important papers. Perhaps I am just getting old and forgetful. Or maybe our field is not driven by specific papers.

What five would you select? (Please post them at Jeff’s blog too.)