indirect proofs: contrapositives vs. proofs by contradiction

Sebastian Pokutta's Blog 2018-03-12

Last week I read a rather interesting discussion on contrapositives vs. proofs by contradiction as part of Timothy Gowers’ Cambridge Math Tripos, mathoverflow, and Terry Tao’s blog. At first sight these two concepts, the contrapositive and the reductio ad absurdum (proof by contradiction) might appear to be very similar. Suppose we want to prove A \Rightarrow B for some statements A and B. Then this is equivalent to showing \neg B \Rightarrow \neg A (at least in classical logic). The latter is the contrapositive and often it is easier to go with the contrapositive. In the case of the indirect proof we do something similar, however there is a slight difference: we assume that A \wedge \neg B and deduce a contradiction. So what the big deal? The difference seems to be more of a formal character. However this is not true. In the first case we remain in the space of “true statements”, i.e., any deduction from \neg B is a consequence of A that we can use later “outside of the proof”. In the case of the proof by contradiction we move in a “contradictory space” (as A \wedge \neg B  is contradictory) and everything that we derive in this space is potentially garbage. Its sole purpose is to derive a contradiction however as we work in a contradictory system we cannot guarantee that the statement derived within the proof are true statement; in fact they are likely not be true as they should result in a final contradiction.

Interestingly a similar phenomenon is known for cutting-plane procedures or cutting-plane proof systems (both terms essentially mean the same thing; it is just a different perspective) . Let me give you an ultra-brief introduction of cutting-plane procedures. Given a polytope P \subseteq [0,1]^n we are often interested in the integral hull of that polytope which is defined to be P_I = \text{conv}(P \cap \{0,1\}^n). A cutting-plane procedure M is now a map that assigns to P a new polytope M(P) such that P_I \subseteq M(P) \subseteq P and M(P) hopefully provides a tighter approximation of P_I. So what the cutting-plane procedure does, is to derive new valid inequalities for P_I by examining P and usually the derivation is computationally bounded (otherwise we could just guess the integral hull); the exact technical details are not too important at this point.

Now any well-defined cutting-plane procedure M satisfies M(P \cap \{cx \le \delta\}) \subseteq M(P) \cap \{cx \le \delta\}. Or put differently, giving the cutting-plane procedure access to an additional inequality can potentially increase the strength of the procedure as compared to let it work on P and then intersect with the half-space cx \leq \delta afterwards. Now what does this have to do with indirect proofs and contrapositives? The connection arises from the following trivial insight: an inequality cx \leq \delta (with integral coefficients and right-hand side) is valid for P_I if and only if (P \cap \{cx \geq \delta + 1 \})_I = \emptyset. In particular a sufficient condition for the validity of cx \leq \delta for P_I is M(P \cap \{cx \geq \delta + 1 \}) = \emptyset. The key point is that M(P \cap \{cx \geq \delta + 1 \}) can be strictly contained in M(P) \cap \{cx \geq \delta + 1 \}. The first one is the indirect proof, whereas the second one is the contrapositive, as we verify the validity of cx \leq \delta by testing if  M(P) \cap \{cx \geq \delta + 1 \} = \emptyset. However we do not use the inequality cx \leq \delta in the cutting-plane procedure, i.e., the procedure has no a priori knowledge about what to prove, whereas in the case of indirect proofs, we add the negation of cx \leq \delta and the procedure can use this information.

So how much do can you gain? Suppose we have a graph G= (V,E) and we consider the associated fractional stable set polytope FSTAB(G) = \{ x \in [0,1]^V \mid x_u + x_v \leq 1 \ \forall\; (u,v) \in E\}. Typically (there are a few exceptions), for a classical cutting-plane procedure the derivation of clique inequalities is involved and we need \Omega(\log k) applications of the cutting-plane procedure to derive the clique inequalities for a clique C of size k, i.e., \sum_{v \in C} x_v \leq 1. However an indirect proof of the clique inequalities takes only a single application of the most basic cutting-plane operator: Consider

FSTAB(G) \cap \{\sum_{v \in C} x_v \geq 2\} =: Q

for a clique C. It is not hard to see that Q \cap \{x_v = 1\} = \emptyset for all v \in C. A basic derivation that any sensible cutting-plane operator M supports is to derive that x_i \leq 0, i.e., x_i \leq 0 is valid for M(Q) whenever x_i < 1 is valid for P. Therefore we obtain that M(Q) \subseteq \bigcap_{v \in C} \{ x_v = 0\}. On the other hand M(Q) \subseteq \{\sum_{v \in C} x_v \geq 2\} and so M(Q) = \emptyset holds and thus the indirect proof derived \sum_{v \in C} x_v \leq 1.

So what one can see from this example is that indirect proofs (at least in the context of cutting-plane proof systems) can derive strong valid inequalities in rather few rounds and outperform their direct counterpart drastically (constant number of rounds vs. log(n) rounds). However a priori knowledge of what we want to prove is needed in order to apply the indirect proof paradigm. This makes it hard to exploit the power of indirect proofs in cutting-plane algorithms. After all, you need to know the “derivation” before you did the actual “derivation”. Nonetheless, in some cases we can use indirect proofs by guessing good candidates for strong valid inequalities and then verify their validity using an indirect proof.

Check out the links for further reading:

  1. http://gowers.wordpress.com/2011/10/05/basic-logic-relationships-between-statements-converses-and-contrapositives/
  2. http://mathoverflow.net/questions/12342/reductio-ad-absurdum-or-the-contrapositive
  3. http://terrytao.wordpress.com/2009/11/05/the-no-self-defeating-object-argument/