Kolmogorov Complexity and Causation
Computational Complexity 2018-08-02
I got an interesting email question.
Suppose I give you a set of points S of the form (x,y). He suggested ideally they would be pairs of a real numbers. Supposing there is a causal relationship between x and y of some kind, we want to know know if it is more likely that the x value causes the y value or the y value causes the x value. One plausible way to decide what the answer should be is by answering the question is the length of the shortest program which maps the x values to their y values shorter than the length of the shortest program which maps the y values to their x values.
So, my intuition says that this is clearly undecidable. I'm actually having a hard time thinking of a proof, so do you happen to know of one or if this problem might actually be decidable?
On a related note, since I'm already writing you about this question, do you happen to know about the complexity of any related questions which involve circuit size instead of program length?Let's use notation from Kolmogorov complexity, letting C(x|y) be the size of the smallest program that takes y as input and outputs x. Now suppose it is decidable to determine whether C(x|y) > C(y|x). Then find an x of length n such that for all y of length n/3, C(x|y)>C(y|x). Such x exist: For any random x, C(x|y)>= 2n/3 and C(y|x) <= n/3. Now I claim C(x)>=n/4 for the x you found. If not we have C(x|y)<=C(x)<n/4 but for some y, C(y|x)>=n/3 since there aren't enough shorter programs to cover all the y's. Since there is no computable procedure to find x such that C(x)>=n/4, there can't be decidable procedure to determine whether C(x|y) > C(y|x). But does this question relate to causality. Pick a random x from strings of length n and y at random from strings of length n/3. We have C(x|y) > C(y|x) even though there is no causality. Instead you could look at the information of y in x, how many bit of x does y help describe, defined by I(y|x) = C(x)-C(x|y). This measure correlation since I(y|x)=0 iff x and y are independent but symmetry of information gives I(y|x)=I(x|y) so no hope for causation. In short, Kolmogorov complexity won't give you much on causation--you can't avoid the controlled experiments. For your last question, there is a notion of Kolmogorov complexity that roughly corresponds to circuit size, KT(x|y) defined as the sum of the program size and running time minimized over all programs p that take y as an input and output x. I'm guessing it's hard to determine if KT(x|y) < KT(y|x) and you could probably show it under some assumption like secure psuedorandom generators. Also symmetry of information isn't believed to hold for KT complexity so maybe there is something there. Interesting questions.