Some references and discussions on the foundations of probability—not the math so much as its connection to the real world, including the claim that “Pr(aliens exist on Neptune that can rap battle) = .137”

Statistical Modeling, Causal Inference, and Social Science 2024-10-07

Someone pointed me to this recent post by Scott Alexander on probability and uncertainty where he makes the case that the Bayesian or probabilistic conception of uncertainty, for all its flaws, is better than the alternative of throwing up your hands and giving up. I mostly agree, with my only qualification being that, as with any other statistical approach, we just have to be careful not to let quantification of a problem lead to overconfidence. There, though, I’d still blame the overconfidence, not the quantification. I similarly wouldn’t recommend abandoning the consumer price index just because it’s not perfect, and indeed it’s ultimately impossible to summarize a countryful of price changes by any single number. Even in cases where I think a quantification is extremely misleading and on net worse than nothing, as with the notorious Electoral Integrity Index, that can be fine, because a number can be assessed and criticized based on the process used to create it.

Jessica Hullman has a relevant post from 2020 on this general topic: Can we stop talking about how we’re better off without election forecasting?

Along similar lines, I recommend chapter 1 of Bayesian Data Analysis (mostly written in 1995) and my short post, What is probability? from 2018.

If you’re particularly interested in election forecasts, you can take a look at this and this from a couple months ago. Relatedly, from 2023: On the ethics of pollsters or journalists or political scientists betting on prediction markets.

For probability and betting, here are a few posts from the 2020 campaign season: Do we really believe the Democrats have an 88% chance of winning the presidential election?, So, what’s with that claim that Biden has a 96% chance of winning? (some thoughts with Josh Miller), More on that Fivethirtyeight prediction that Biden might only get 42% of the vote in Florida, Concerns with our Economist election forecast, and, finally, Comparing election outcomes to our forecast and to the previous election.

For a different real-world example of trying to informally summarize inconclusive information using a Bayesian approach, there’s this post from 2022: Thinking Bayesianly about the being-behind-at-halftime effect in basketball.

For a warning that you have to be careful if you start picking noisy numbers and using them to do Bayesian inference, I have posts from 2019: No, Bayes does not like Mayor Pete (Pitfalls of using implied betting market odds to estimate electability.) and 2023: Reverse-engineering the problematic tail behavior of the Fivethirtyeight presidential election forecast.

If you want to start thinking about non-Bayesian models of probability, I recommend my 2006 article, The boxer, the wrestler, and the coin flip: a paradox of robust Bayesian inference and belief functions or, at a more theoretical level, my 2021 Journal of Physics paper with Yuling Yao, Holes in Bayesian statistics.

My general picture of the foundations of probability is similar to my general picture of the foundations of mathematics, which is that you start from a clearly defined core and then work outward from there. In math, you start with the natural numbers, then you can define zero, rational numbers, irrational numbers (through the famous idea of “cutting” the set of rationals), then imaginary numbers, etc., with similar expansions of basic ideas of geometry, topology, etc. For probability, we start with physically-defined equally likely events such as ideal coin flips or die rolls, then from there we can define any continuous probability between 0 and 1 with desired accuracy using multiple coin flips (or die rolls if you prefer base 6 to binary), then we can define long-term frequencies using calibration, etc. As with mathematical models, the further you go from the core, the more you need to think about issues of measurement and applicability of the results. Probability, and also mathematics, is a framework for thinking about the world. It’s a tool, not an answer.
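To make the coin-flip construction above concrete, here is a minimal Python sketch of one way to build an event with any probability p between 0 and 1 out of nothing but fair coin flips. The idea is to read the flips as the binary digits of a uniform random number and compare them, digit by digit, to the binary digits of p. The function name bernoulli_from_fair_coin, the flip argument, and the max_flips cutoff are my own choices for illustration, not anything from the references above.

```python
import random
from fractions import Fraction


def bernoulli_from_fair_coin(p, flip, max_flips=64):
    """Return True with probability p, using only fair coin flips.

    `flip` is any zero-argument function returning 0 or 1 with equal
    probability (hypothetical interface, chosen for this sketch).
    The result is exact up to an error of at most 2**-max_flips.
    """
    p = Fraction(p)  # exact arithmetic for the binary digits of p
    for _ in range(max_flips):
        # Peel off the next binary digit of p.
        p *= 2
        digit = 1 if p >= 1 else 0
        p -= digit
        # The next coin flip is the next binary digit of a uniform number U.
        coin = flip()
        if coin != digit:
            # First digit where U and p differ decides whether U < p.
            return coin < digit
    # All digits matched up to the cutoff; treat as U >= p.
    return False


if __name__ == "__main__":
    rng = random.Random(2024)
    fair_flip = lambda: rng.getrandbits(1)
    n = 100_000
    hits = sum(bernoulli_from_fair_coin(0.137, fair_flip) for _ in range(n))
    print(f"target 0.137, simulated frequency {hits / n:.4f}")
```

On average the loop stops after about two flips, because each flip has a 50% chance of disagreeing with the corresponding digit of p. The same logic works with die rolls if you replace base 2 with base 6.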

One more thing. In his post, Alexander writes:

Probabilities about AI are more lightly held than probabilities about Mars or impeachment (which in turn are more lightly held than the weatherman’s probability about whether it will rain tomorrow, which in turn is more lightly held than probabilities about coin flips). But I think the best way to represent your lightly held opinion is with a probability.

I agree, and I want to add something in support of Alexander’s statement, which is that the idea of some probabilities as being “more lightly held” can itself be incorporated into the probabilistic or Bayesian framework. That is, we don’t need to qualify probabilities as being strongly or lightly held (for example, using a bold font to say that Pr(coin lands heads) = 0.5, while using a softer font to say that Pr(Harris is inaugurated president in January) = 0.5). The key idea, as I see it, is that a probability isn’t just a number; it’s part of a network of conditional statements. A probability can be best understood as part of a larger joint distribution.
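Here is a toy Python sketch of the joint-distribution point. Two events both have marginal probability 0.5, but they sit in different joint distributions with a piece of new information, so they respond differently when that information arrives. All the numbers are made up for illustration: I assume a hypothetical signal (say, a favorable new poll) that is four times as likely to appear if the candidate goes on to win as if she loses, and that has nothing to do with the coin.

```python
def posterior(prior, p_signal_given_yes, p_signal_given_no):
    """Pr(event | signal observed), computed by Bayes' rule."""
    numerator = prior * p_signal_given_yes
    denominator = numerator + (1 - prior) * p_signal_given_no
    return numerator / denominator


# Coin flip: the signal is independent of the outcome (assumed),
# so it is equally likely whether or not the coin lands heads.
coin_after_signal = posterior(0.5, 0.5, 0.5)

# Election: the signal is informative (hypothetical likelihoods 0.8 vs 0.2).
election_after_signal = posterior(0.5, 0.8, 0.2)

if __name__ == "__main__":
    print(f"Pr(heads | signal)      = {coin_after_signal:.2f}")
    print(f"Pr(candidate wins | signal) = {election_after_signal:.2f}")
```

Both probabilities start at 0.5. The coin's probability does not move, while the election probability does, and that difference lives entirely in the conditional statements, not in the number 0.5 itself. This is a sketch of the general idea under invented likelihoods, not a model of any actual election.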

I think Alexander will like that last post because it contains a serious discussion (and rebuttal) by Martha Smith of the reductio ad absurdum claim that “Pr(aliens exist on Neptune that can rap battle) = .137 is a valid ‘probability’ just because it satisfies mathematical axioms?” I’ve never met Alexander but I have the impression that he likes thinking about things such as “Pr(aliens exist on Neptune that can rap battle).” I mean that in a good way.