Excellent performance of machine learning algorithms in a major time series competition. And then what is the role of statistical modeling? Here’s the answer:
Statistical Modeling, Causal Inference, and Social Science 2022-02-02
Kevin Gray writes:
Perhaps you’ve seen this but, in case not, it may be of interest.
Here’s some commentary by Rob Hyndman and a few others.
Curtains for Statistics? I’m struggling to rationalize this.
The article that Gray is pointing to is “The M5 Accuracy competition: Results, findings and conclusions,” by Spyros Makridakis, Evangelos Spiliotis, and Vassilis Assimakopoulos. Here’s a key paragraph from its conclusions:
The exceptional performance of statistical methods versus ML [machine learning] ones found by Makridakis et al. (2018b), as well as in the early Kaggle competitions (Bojer & Meldgaard, 2020), first shifted towards ML and statistical methods in the M4 competition, and then to exclusively ML methods like in the Kaggle competitions which started in 2018 and the M5 described in this paper. It will be of great interest if ML methods continue to dominate statistical ones in future competitions, particularly for other types of data that are not exclusively related to hierarchical, retail sales applications.
I wouldn’t phrase it quite like this, as I consider the machine learning methods to also be statistical. They’re nonparametric, but I still consider them to be statistical forecasting methods.
I see several places where probability modeling is relevant here:
1. As noted above, machine learning predictions correspond to nonparametric statistical models, and the moment we try to quantify prediction error they become probability models.
2. Data and prediction problems typically have multilevel structure (data in different countries, different years, different product lines, etc.), which creates a need for partial pooling across levels. The alternative to partial pooling is sometimes-complete-pooling and sometimes-no-pooling, which is just a crude form of partial pooling. We’ve discussed this issue many times over the years, whether in the context of time-series cross-sectional studies in political science and economics, analysis of public opinion, or transportability in causal inference.
3. As we say in Regression and Other Stories, the three goals of statistics are generalization from sample to population, from treatment to control group, and from observed data to underlying constructs of interest. These require various versions of poststratification and latent-variable modeling.
4. Finally, the performance of different predictions is still being evaluated using traditional statistical approaches. This would start with averages but soon move to probability modeling (to convert variation in data to uncertainty in expected performance) and poststratification (to assess performance for a population of problems of interest).
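To make the partial pooling in point 2 concrete, here is a minimal sketch in Python. It assumes the simplest possible setup: a normal model for group means in which the within-group standard deviation `sigma_y`, the between-group standard deviation `sigma_alpha`, and the overall mean `mu` are treated as known. In a real analysis these would be estimated from the data. The function name `partial_pool` and all the numbers are hypothetical, chosen only for illustration.

```python
def partial_pool(group_means, group_sizes, sigma_y, sigma_alpha, mu):
    """Precision-weighted compromise between each group's own mean and mu.

    Assumes a normal model with known sigma_y (within-group sd),
    sigma_alpha (between-group sd), and mu (overall mean).
    """
    estimates = []
    for ybar, n in zip(group_means, group_sizes):
        w_data = n / sigma_y**2        # precision of the group's raw mean
        w_prior = 1 / sigma_alpha**2   # precision of the between-group distribution
        estimates.append((w_data * ybar + w_prior * mu) / (w_data + w_prior))
    return estimates


# Hypothetical data: three groups with very different sample sizes.
group_means = [10.0, 20.0, 30.0]
group_sizes = [2, 20, 200]
mu = 20.0

# Partial pooling: small groups are pulled strongly toward mu, large groups barely move.
pooled = partial_pool(group_means, group_sizes, sigma_y=5.0, sigma_alpha=2.0, mu=mu)

# The two extremes are limiting cases of the same formula.
no_pooling = partial_pool(group_means, group_sizes, sigma_y=5.0, sigma_alpha=1e6, mu=mu)
complete_pooling = partial_pool(group_means, group_sizes, sigma_y=5.0, sigma_alpha=1e-6, mu=mu)

print(pooled)
```

With these made-up inputs, the group with 2 observations moves from 10 to about 17.6, while the group with 200 observations stays near 29.7. Sending `sigma_alpha` to a very large or very small value recovers no pooling and complete pooling respectively, which is the sense in which choosing between those two extremes is just a crude form of partial pooling.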