Why waste time philosophizing?
Statistical Modeling, Causal Inference, and Social Science 2013-03-15
I’ll answer the above question after first sharing some background and history on the philosophy of Bayesian statistics, which appeared at the end of our rejoinder to the discussion to which I linked the other day:
When we were beginning our statistical educations, the word ‘Bayesian’ conveyed membership in an obscure cult. Statisticians who were outside the charmed circle could ignore the Bayesian subfield, while Bayesians themselves tended to be either apologetic or brazenly defiant. These two extremes manifested themselves in ever more elaborate proposals for non-informative priors, on the one hand, and declarations of the purity of subjective probability, on the other.
Much has changed in the past 30 years. ‘Bayesian’ is now often used in casual scientific parlance as a synonym for ‘rational’, the anti-Bayesians have mostly disappeared, and non-Bayesian statisticians feel the need to keep up with developments in Bayesian modelling and computation. Bayesians themselves feel more comfortable than ever constructing models based on prior information without feeling an obligation to be non-parametric or a need for priors to fully represent a subjective state of knowledge.
In short, Bayesian data analysis has become normalized. Our paper is an attempt to construct a philosophical framework that captures applied Bayesian inference as we see it, recognizing that Bayesian methods are highly assumption-driven (compared to other statistical methods) but that such assumptions allow more opportunities for a model to be checked, for its discrepancies with data to be explored.
We felt that a combination of the ideas of Popper, Kuhn, Lakatos, and Mayo covered much of what we were looking for – a philosophy that combined model building with constructive falsification – but we recognize that we are, at best, amateur philosophers. Thus we feel our main contribution is to consider Bayesian data analysis worth philosophizing about.
Bayesian methods have seen huge advances in the past few decades. It is time for Bayesian philosophy to catch up, and we see our paper as the beginning, not the end, of this process.
OK, now to the question of the day: Who cares? A natural response to all this is to declare it a waste of time. Every moment spent philosophizing is a moment not spent doing real research. Why philosophize? Why not just do?
The philosophy of statistics is interesting to me. But, beyond this, my reason for writing about it is that the philosophy of statistics can affect the practice of statistics. The connection is clearest to me in the area of model comparison and checking. Bayesians have been going around with models that don’t fit the data, not even looking for anything better, out of a belief that Bayesian models are (a) completely subjective and thus (b) to be trusted completely. This combination never made sense to me—I’d think that the more subjective a model is, the less you’d want to trust it—but it was central to the inferential philosophy that was dominant among Bayesians when I was starting out. I think this unfortunate philosophy restricted what people were actually doing in practice. It was making them worse data analysts and worse scientists. Conversely, the Popperian philosophy of falsification encouraged me in my efforts to bring model checking and exploratory data analysis into the fold of Bayesian inference (see, for example, here and here).
So, yes, conditional on not changing your methods, philosophy is at best a diversion from the real work of science. But I think philosophy is more important than that. Good philosophical work can free us from our ruts and point us toward new and better statistical methods.
And I’m not just talking about the past; this isn’t only about confusions that we’ve already dispelled.
For example, one open question now is: How can an artificial intelligence do statistics? In the old-fashioned view of Bayesian data analysis as inference-within-a-supermodel, it’s simple enough: an AI (or a brain) just runs a Stan-like program to learn from the data and make predictions as necessary. But in a modern view of Bayesian data analysis—iterating the steps of model-building, inference-within-a-model, and model-checking—it’s not quite clear how the AI works. It needs not just an inference engine, but also a way to construct new models and a way to check models. Currently, those steps are performed by humans, but the AI would have to do them itself, without the aid of a “homunculus” to come up with new models or check the fit of existing ones. This philosophical quandary points to new statistical methods, for example a language-like approach to recursively creating new models from a specified list of distributions and transformations, and an automatic approach to checking model fit, based on some way of constructing quantities of interest and evaluating their discrepancies from simulated replications.
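To make that last idea a bit more concrete, here is a minimal sketch in Python of the kind of automated check I have in mind, under assumptions chosen purely for illustration: a toy normal model with a flat prior standing in for the inference engine, heavy-tailed fake data, and a handful of generic test quantities. None of these particular choices come from the paper; the point is only the loop of simulating replications and comparing discrepancies.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical observed data: 50 points with heavier tails than the
# normal model below can capture, so at least one check should flag trouble.
y = rng.standard_t(df=3, size=50)
n = len(y)

# Toy "inference engine": y ~ Normal(mu, 1) with a flat prior on mu,
# so the posterior for mu is Normal(mean(y), 1/sqrt(n)).
mu_draws = rng.normal(y.mean(), 1 / np.sqrt(n), size=4000)

# Generic test quantities T(.), standing in for automatically
# constructed quantities of interest.
test_quantities = {
    "max |y|": lambda d: np.max(np.abs(d)),
    "sd": np.std,
    "prop |y| > 2": lambda d: np.mean(np.abs(d) > 2),
}

# Automatic check: simulate replicated datasets y_rep from the fitted
# model and compare T(y_rep) with T(y) via a posterior predictive p-value.
for name, T in test_quantities.items():
    T_rep = np.array([T(rng.normal(mu, 1.0, size=n)) for mu in mu_draws])
    p_value = np.mean(T_rep >= T(y))
    print(f"{name:14s}  T(y) = {T(y):6.2f}   p = {p_value:.3f}")
```

The hard part, of course, is what the sketch leaves out: constructing good test quantities automatically and deciding, without a human looking at the plots, when a discrepancy is bad enough to send the AI back to the model-building step.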
I don’t know how to do all this—it’s research!—but my point is that philosophy, even if not strictly necessary, can help, both in the negative sense of clearing away bad and confusing ideas and in the positive sense of suggesting new ways forward.