Code it! (patterns in data edition)

Statistical Modeling, Causal Inference, and Social Science 2024-11-16

Abigail Haddad writes:

I read your Iranian vote total post and I was thinking about how easy or difficult it is to find patterns in a list of numbers, assuming we’re not looking for any specific pattern. I agree there’s nothing special about being divisible by 3, but I don’t actually have a sense of how common it is for a list of numbers to have “something” in common. (And I know that’s not a well-defined problem, since it’s going to be really dependent on both what “something” includes and on the properties of the numbers—like, how long the list is, and the min and the max.)

Regardless, I wrote some code on this (or Claude-3.5 wrote some code on this, per my instructions). It’s totally silly. The patterns I included are pretty arbitrary. But if someone wanted to, they could put in all kinds of vote totals from places we believe with high probability do not have fraud, and then you could play with that. Or maybe there’s some other use of this.
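To make the idea concrete, here is a minimal sketch of that kind of simulation. It is not the code from the email: the pattern list, the function names (`patterns_found`, `simulate`), and the defaults for list length and number range are all assumptions chosen for illustration. It draws lists of uniformly random integers and estimates how often every number in a list shares at least one of a handful of arbitrary patterns.

```python
import random

# Hypothetical, deliberately arbitrary patterns. Each check returns True
# only if EVERY number in the list has the property.
PATTERNS = {
    "all divisible by 3": lambda nums: all(n % 3 == 0 for n in nums),
    "all divisible by 7": lambda nums: all(n % 7 == 0 for n in nums),
    "all even": lambda nums: all(n % 2 == 0 for n in nums),
    "all odd": lambda nums: all(n % 2 == 1 for n in nums),
    "same last digit": lambda nums: len({n % 10 for n in nums}) == 1,
}


def patterns_found(nums):
    """Return the names of the patterns shared by every number in nums."""
    return [name for name, check in PATTERNS.items() if check(nums)]


def simulate(list_len=4, lo=100_000, hi=10_000_000, n_sims=20_000, seed=0):
    """Estimate the share of random lists that show at least one pattern.

    list_len, lo, and hi are assumed values, not taken from any real
    election; uniform random integers stand in for "clean" vote totals.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        nums = [rng.randint(lo, hi) for _ in range(list_len)]
        if patterns_found(nums):
            hits += 1
    return hits / n_sims


if __name__ == "__main__":
    for k in (2, 3, 4, 6):
        print(f"list length {k}: share with some pattern = {simulate(list_len=k):.4f}")
```

The answer depends heavily on what goes into `PATTERNS` and on the length of the list, which is the point made above: the more patterns you are willing to count as "something," the more often a list of unremarkable numbers will show one.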

I’ve not looked at the code and I don’t speak Python, but I’m sharing it here as an example of the general principle that if you want to figure something out, you can make a lot more progress by coding something up than by looking for some statistical formula. Not that coding is perfect: I have bugs in my code all the time, and it’s absolutely necessary to test it using fake-data simulation. It’s just that code is rigorous. What you code is what you get.