“Is machine learning a subset of statistics?”

Statistical Modeling, Causal Inference, and Social Science 2013-03-15

Following up on our previous post, Andrew Wilson writes:

I agree we are in a really exciting time for statistics and machine learning. There has been a lot of talk lately comparing machine learning with statistics. I am curious whether you think there are many fundamental differences between the fields, or just superficial differences — different popular approximate inference methods, slightly different popular application areas, etc. Is machine learning a subset of statistics?

In the paper we discuss how we think machine learning is fundamentally about pattern discovery, and ultimately, fully automating the learning and decision making process. In other words, whatever a human does when he or she uses tools to analyze data can be written down algorithmically and automated on a computer. I am not sure if the ambitions are similar in statistics, and I don’t have any conventional statistics background, which makes it harder to tell. I think it’s an interesting discussion.

My reply:

I don’t know enough about machine learning to know what differences there are between the fields. One of my sayings is that theoretical statistics is another name for the theory of applied statistics. That is, statistics is all about modeling what we do, and modeling what we should be doing. As always in the social sciences, normative modeling has a descriptive flavor and descriptive modeling has a normative flavor: to the extent that we’re not doing what we say we should be doing, this suggests potential changes in our theory or in our practice. And much of my work over the years has been to give theoretical foundations for various areas of statistical practice that have typically been treated informally.

Thus, compared to other academic statisticians, I think I spend more time monitoring convergence of my iterative simulations, checking the fit of my models, and graphing data and fitted curves, but at the same time I do these things more formally than many statisticians have been trained to do. I think that some of the research we’ve been discussing lately on automatic model construction (done by people other than me, let me emphasize!) is important in that it is moving toward a better description, and thus also a better normative theory, of model building. To me, it’s a big step forward from that thing where “learning a model” is associated with taking a big multivariate dataset and trying to identify conditional independence structures. To me, all that stuff is static, and I’m much happier with a framework in which models are built out recursively in a language-like fashion.

That said, for now this is all a sideshow. We still have a ways to go in fitting models that we’ve already specified. Hence, Stan.

Are we at the stage of “fully automating the learning and decision making process”? I don’t think so. But the only way forward is to try, without getting too stuck in our current understanding at any time.