Survey Statistics: probability samples vs epsem samples vs SRS samples

Statistical Modeling, Causal Inference, and Social Science 2025-12-02

As I mentioned last week, in 2021 I taught Survey Research Methods at NYU. The textbook by Groves et al. (p. 6) defines a “probability sample” as one in which everyone has a known nonzero chance of being selected. So a non-probability sample has selection chances that are either unknown or zero (or both). I find this confusing: just because we don’t know something doesn’t mean it isn’t a probability...?

Groves et al. (p. 103) define the Equal Probability SElection Method (“epsem”) as sampling that assigns equal selection probabilities to all individuals. The most famous example of epsem is Simple Random Sampling (“SRS”), where every possible sample of size n has the same probability. It is easy to confuse these 3 concepts (probability sample, epsem, SRS), so I drew this Venn diagram. Technically, we could have non-probability samples that are epsem or SRS and we just don’t know it? I marked that as “strange coincidence”.
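To make the nesting concrete, here is a minimal Python sketch on a toy population. Everything in it is an assumption for illustration and not from Groves et al.: the population size N = 6, the sample size n = 2, the three example designs, and the helper names. A design is written as a table of possible samples and their probabilities, so the "known" part of the definition is taken for granted; the code only checks the "nonzero" and "equal" parts.

```python
from fractions import Fraction
from itertools import combinations
from math import comb

N, n = 6, 2          # toy population and sample size (illustrative assumption)
units = range(N)

def inclusion_probs(design):
    """Inclusion probability of each unit: total probability of the samples containing it."""
    return [sum((p for s, p in design.items() if i in s), Fraction(0)) for i in units]

def is_probability_sample(design):
    """Every unit has a nonzero chance of selection (probabilities are known by construction)."""
    return all(pi > 0 for pi in inclusion_probs(design))

def is_epsem(design):
    """Every unit has the same nonzero inclusion probability."""
    return is_probability_sample(design) and len(set(inclusion_probs(design))) == 1

def is_srs(design):
    """Every possible sample of size n has the same probability, 1 / C(N, n)."""
    target = Fraction(1, comb(N, n))
    return all(design.get(frozenset(c), Fraction(0)) == target
               for c in combinations(units, n))

# SRS: all 15 pairs equally likely.
srs = {frozenset(c): Fraction(1, comb(N, n)) for c in combinations(units, n)}

# Systematic sampling: random start k in {0, 1, 2}, then take units k and k + 3.
systematic = {frozenset({k, k + 3}): Fraction(1, 3) for k in range(3)}

# Disproportionate stratified sampling: one unit from stratum {0, 1}, one from {2, 3, 4, 5}.
stratified = {frozenset({a, b}): Fraction(1, 2) * Fraction(1, 4)
              for a in (0, 1) for b in (2, 3, 4, 5)}

for name, design in [("srs", srs), ("systematic", systematic), ("stratified", stratified)]:
    print(name, is_probability_sample(design), is_epsem(design), is_srs(design))
```

In this sketch the systematic design is epsem (every unit has inclusion probability 1/3) but not SRS, because only 3 of the 15 possible pairs can ever be drawn. The stratified design is a probability sample but not epsem, since units in the small stratum are selected with probability 1/2 and units in the large stratum with probability 1/4.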

Meng (2018), “Statistical Paradises and Paradoxes” (p. 697), refers to “probabilistic sampling”:

Looking at Fuller (2011), I see a lot of theorems about stratified and Poisson sampling. I’m not sure where to look for general results about “probabilistic sampling”, or even how that is defined here. Is it the same as the definition in Groves et al. (p. 6)?