Piranhas for “omics”?

Statistical Modeling, Causal Inference, and Social Science 2024-08-28

Don Vicendese writes:

I remember one of your comments from long ago, perhaps from your blog, in which you wondered whether, in outcomes that involve many contributory factors, these factors might cancel each other out to a certain extent. [That’s the piranha theorem; see original blog post from 2017 and recent research paper. — ed.]

I have often thought about this, especially in regard to exposome-type studies – add whatever prefix you like in front of the “ome.”

I recently read about the mathematical result obtained by Michel Talagrand regarding outcomes that are a function of many contributing stochastic factors. He won the 2024 Abel Prize, in part for this work. He was able to derive bounds on the resultant variation in the outcome which show that this variation is quite constrained. There does seem to be a lot of canceling going on.
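A toy simulation can give the flavor of this kind of constraint. It is not Talagrand's theorem itself, only a sketch under assumed conditions: the "factors" are independent Uniform(-1, 1) draws, and the "outcome" is their Euclidean norm (a convex, 1-Lipschitz function). The function name `norm_fluctuation` and all parameter values are made up for illustration.

```python
import math
import random
import statistics


def norm_fluctuation(n_factors, n_trials=1000, seed=0):
    """Return (mean, std) of the Euclidean norm of n_factors
    independent Uniform(-1, 1) draws, estimated over n_trials."""
    rng = random.Random(seed)
    norms = []
    for _ in range(n_trials):
        total = 0.0
        for _ in range(n_factors):
            x = rng.uniform(-1.0, 1.0)
            total += x * x
        norms.append(math.sqrt(total))
    return statistics.mean(norms), statistics.stdev(norms)


if __name__ == "__main__":
    # Assumed setup: the outcome grows with the number of factors,
    # but its spread should stay roughly constant.
    for n in (10, 100, 1000):
        mean, std = norm_fluctuation(n)
        print(f"n={n:5d}  mean={mean:7.3f}  std={std:.3f}")
```

In this setup the typical size of the outcome grows roughly like the square root of the number of factors, while its standard deviation stays of order one, so the relative variation shrinks as factors are added.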

I had previously also thought about how in very high dimensional spaces, the data seem to be clumped together with much of the space empty. These high dimensional spaces could be all sorts of data, not just omics. This may explain why our statistical methods may struggle in these very high dimensional settings – they just haven’t got the resolution to penetrate this type of highly compacted data.
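One commonly cited version of this phenomenon is distance concentration: for independent, identically distributed coordinates, the nearest and farthest points from a query become almost equally far away as the dimension grows. The sketch below assumes points drawn uniformly from the unit cube, which real omics data certainly are not; `relative_contrast` is a hypothetical helper name.

```python
import math
import random


def relative_contrast(dim, n_points=200, seed=0):
    """(farthest - nearest) / nearest distance from one random query
    point to n_points uniform points in the unit cube [0, 1]^dim."""
    rng = random.Random(seed)
    query = [rng.random() for _ in range(dim)]
    dists = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(dim)]
        dists.append(math.dist(query, point))
    nearest, farthest = min(dists), max(dists)
    return (farthest - nearest) / nearest


if __name__ == "__main__":
    # A small contrast means "near" and "far" neighbors are hard to tell apart.
    for dim in (2, 10, 100, 1000):
        print(f"dim={dim:5d}  contrast={relative_contrast(dim):.3f}")
```

When the contrast is close to zero, methods that rely on distinguishing near from far neighbors have little to work with, which is one way to read the "resolution" concern above.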

Given all that, I have also wondered whether, when we analyse outcomes, isolate a few contributory factors, and find an association, this association might vanish if we were able to consider a sufficiently larger number of factors that may cancel each other out. In other words, associations we found in a lower dimensional setting may have been “lucky” artifacts of that setting.

I am just speculating here but at the same time this is somewhat worrying to me and I would be interested in your thoughts on the matter.

My response: I’m not sure! I had not heard of this result from Talagrand; a quick google turned up this lay summary, which indeed sounds similar to our piranha theorems (but much much more mathematically sophisticated, I’m sure). It would be cool if we could connect our piranha idea to some deep mathematics . . . . I’ve long felt that there are some deep ideas here, even if maybe I’m not the one to fully plumb their depths!

I also like Vicendese’s framing in terms of adding or removing factors. This seems like a good angle on the problem.
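Here is a minimal sketch of what adding factors does in the simplest possible case. It is a toy, not the theorems from the paper, and it leans on strong assumptions: the factors are independent standard normals, every factor has the same unit effect, and the outcome is exactly their sum with no extra noise. The helper names `pearson` and `factor_correlations` are made up for this example. The underlying fact is standard: when factors are mutually uncorrelated, their squared correlations with any outcome sum to at most 1, so with p equal contributors each correlation is about 1/sqrt(p).

```python
import math
import random


def pearson(xs, ys):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def factor_correlations(n_factors, n_samples=2000, seed=0):
    """Correlation of each factor with an outcome that is the plain
    sum of n_factors independent standard normal factors."""
    rng = random.Random(seed)
    factors = [
        [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
        for _ in range(n_factors)
    ]
    outcome = [sum(col[i] for col in factors) for i in range(n_samples)]
    return [pearson(col, outcome) for col in factors]


if __name__ == "__main__":
    for p in (1, 4, 16, 64):
        rs = factor_correlations(p)
        print(
            f"p={p:3d}  r_first={rs[0]:.3f}  "
            f"1/sqrt(p)={1 / math.sqrt(p):.3f}  "
            f"sum_r2={sum(r * r for r in rs):.3f}"
        )
```

The first factor's effect never changes here, yet its correlation with the outcome fades as more factors are added, because the total budget of squared correlation is fixed. Whether anything like this holds for correlated factors in real data is exactly the open question.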

Finally, I know nothing about “omics,” but that could well be a good area of application of these ideas, given that there literally are thousands of genes out there. To the extent that the goal of this piranha research is to help do better science (rather than just to give insight into the problems of certain ideas of bad science), it would make sense to study something real such as “omics” rather than silly things like the purported effect of time of the month on clothing choices, etc.