indirect proofs: contrapositives vs. proofs by contradiction
Sebastian Pokutta's Blog 2018-03-12
Last week I read a rather interesting discussion on contrapositives vs. proofs by contradiction as part of Timothy Gowers’ Cambridge Math Tripos, on MathOverflow, and on Terry Tao’s blog. At first sight these two concepts, the contrapositive and the reductio ad absurdum (proof by contradiction), might appear to be very similar. Suppose we want to prove $A \Rightarrow B$ for some statements $A$ and $B$. Then this is equivalent to showing $\neg B \Rightarrow \neg A$ (at least in classical logic). The latter is the contrapositive and often it is easier to go with the contrapositive. In the case of the indirect proof we do something similar, however there is a slight difference: we assume that $A \wedge \neg B$ holds and deduce a contradiction. So what is the big deal? The difference seems to be more of a formal character. However this is not true. In the first case we remain in the space of “true statements”, i.e., any deduction from $\neg B$ is a consequence of $\neg B$ that we can use later “outside of the proof”. In the case of the proof by contradiction we move in a “contradictory space” (as $A \wedge \neg B$ is contradictory) and everything that we derive in this space is potentially garbage. Its sole purpose is to derive a contradiction; however, as we work in a contradictory system, we cannot guarantee that the statements derived within the proof are true statements. In fact they are likely not to be true, as they should result in a final contradiction.
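As a small sanity check of the classical equivalence used above, the following sketch enumerates all truth assignments and compares the three formulations: the direct implication, the contrapositive, and the statement that the assumption "$A$ and not $B$" is impossible. The function names are made up for illustration. Note that this only confirms that the three forms agree as statements; it says nothing about which intermediate deductions remain usable outside of a proof, which is the actual point of the discussion.

```python
from itertools import product


def implies(p, q):
    """Classical material implication p => q."""
    return (not p) or q


def truth_table():
    """Return rows (a, b, direct, contrapositive, by_contradiction)."""
    rows = []
    for a, b in product([False, True], repeat=2):
        direct = implies(a, b)                   # A => B
        contrapositive = implies(not b, not a)   # not B => not A
        by_contradiction = not (a and not b)     # (A and not B) is impossible
        rows.append((a, b, direct, contrapositive, by_contradiction))
    return rows


for row in truth_table():
    print(row)
```

In every row the last three entries coincide, which is exactly the classical equivalence.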
Interestingly a similar phenomenon is known for cutting-plane procedures or cutting-plane proof systems (both terms essentially mean the same thing; it is just a different perspective). Let me give you an ultra-brief introduction of cutting-plane procedures. Given a polytope $P \subseteq [0,1]^n$ we are often interested in the integral hull of that polytope, which is defined to be $P_I := \operatorname{conv}(P \cap \mathbb{Z}^n)$. A cutting-plane procedure $M$ is now a map that assigns to $P$ a new polytope $M(P)$ such that $P_I \subseteq M(P) \subseteq P$ and $M(P)$ hopefully provides a tighter approximation of $P_I$. So what the cutting-plane procedure does is to derive new valid inequalities for $P_I$ by examining $P$, and usually the derivation is computationally bounded (otherwise we could just guess the integral hull); the exact technical details are not too important at this point.
Now any well-defined cutting-plane procedure $M$ satisfies $M(P \cap \{cx \le \delta\}) \subseteq M(P) \cap \{cx \le \delta\}$. Or put differently, giving the cutting-plane procedure access to an additional inequality can potentially increase the strength of the procedure compared to letting it work on $P$ and then intersecting with the half-space $\{cx \le \delta\}$ afterwards. Now what does this have to do with indirect proofs and contrapositives? The connection arises from the following trivial insight: an inequality $cx \le \delta$ (with integral coefficients and right-hand side) is valid for $P_I$ if and only if $P \cap \{cx \ge \delta + 1\} \cap \mathbb{Z}^n = \emptyset$. In particular a sufficient condition for the validity of $cx \le \delta$ for $P_I$ is $M(P \cap \{cx \ge \delta + 1\}) = \emptyset$. The key point is that $M(P \cap \{cx \ge \delta + 1\})$ can be strictly contained in $M(P) \cap \{cx \ge \delta + 1\}$. The first one is the indirect proof, whereas the second one is the contrapositive, as we verify the validity of $cx \le \delta$ by testing if $M(P) \cap \{cx \ge \delta + 1\} = \emptyset$. However we do not use the inequality $cx \ge \delta + 1$ in the cutting-plane procedure, i.e., the procedure has no a priori knowledge about what to prove, whereas in the case of indirect proofs, we add the negation of $cx \le \delta$ and the procedure can use this information.
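One way to see why adding an inequality before applying the procedure can never hurt is the following short derivation. It is a sketch under two assumptions that are not spelled out in the text: the procedure $M$ is monotone (a smaller polytope yields a smaller output), and $M(Q) \subseteq Q$ holds for every polytope $Q$. Here $H$ is shorthand for an arbitrary half-space such as $\{cx \le \delta\}$ or $\{cx \ge \delta + 1\}$.

```latex
\begin{align*}
M(P \cap H) &\subseteq M(P)
  && \text{(monotonicity, since } P \cap H \subseteq P\text{)} \\
M(P \cap H) &\subseteq P \cap H \subseteq H
  && \text{(since } M(Q) \subseteq Q \text{ for every polytope } Q\text{)} \\
\Longrightarrow \quad M(P \cap H) &\subseteq M(P) \cap H .
\end{align*}
```

The derivation only gives containment; whether the containment is strict depends on the concrete procedure and polytope.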
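So how much can you gain? Suppose we have a graph $G = (V, E)$ and we consider the associated fractional stable set polytope $P = \{x \in [0,1]^V : x_u + x_v \le 1 \text{ for all } \{u,v\} \in E\}$. Typically (there are a few exceptions), for a classical cutting-plane procedure the derivation of clique inequalities is involved and we need roughly $\log k$ applications of the cutting-plane procedure to derive the clique inequality for a clique $C$ of size $k$, i.e., $\sum_{v \in C} x_v \le 1$. However an indirect proof of the clique inequalities takes only a single application of the most basic cutting-plane operator: Consider $Q := P \cap \{\sum_{v \in C} x_v \ge 2\}$ for a clique $C$. It is not hard to see that $x_v < 1$ holds on $Q$ for all $v \in C$ (if $x_v = 1$, the edge inequalities force $x_u = 0$ for every other $u \in C$, so the sum would only be $1$). A basic derivation that any sensible cutting-plane operator $M$ supports is to derive that $x_v \le 0$, i.e., $x_i \le 0$ is valid for $M(Q)$ whenever $x_i < 1$ is valid for $Q$. Therefore we obtain that $M(Q) \subseteq \{x : x_v = 0 \text{ for all } v \in C\}$. On the other hand $M(Q) \subseteq Q \subseteq \{\sum_{v \in C} x_v \ge 2\}$ and so $M(Q) = \emptyset$ holds and thus the indirect proof derived $\sum_{v \in C} x_v \le 1$.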
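The clique argument can be replayed numerically on a tiny instance. The sketch below takes the complete graph on four vertices and uses a deliberately minimal, hypothetical operator `basic_operator` that implements only the single rule described above (add $x_i \le 0$ whenever $x_i < 1$ holds on the input polytope). It is not meant to model any particular cutting-plane procedure from the literature. Polytopes are stored as lists of `(a, b)` pairs meaning $a \cdot x \le b$, and emptiness is tested by brute-force vertex enumeration with exact fractions, which is only sensible for toy sizes. All function names are made up for illustration.

```python
from fractions import Fraction
from itertools import combinations


def solve_square(rows, rhs):
    """Solve a square linear system exactly; return None if it is singular."""
    n = len(rows)
    aug = [[Fraction(v) for v in row] + [Fraction(b)] for row, b in zip(rows, rhs)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            return None
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [aug[r][n] for r in range(n)]


def vertices(constraints, n):
    """Brute-force vertex enumeration of {x : a.x <= b for all (a, b)}."""
    found = set()
    for basis in combinations(constraints, n):
        x = solve_square([a for a, _ in basis], [b for _, b in basis])
        if x is None:
            continue
        if all(sum(ai * xi for ai, xi in zip(a, x)) <= b for a, b in constraints):
            found.add(tuple(x))
    return found


def unit(i, n, sign=1):
    """Coefficient vector of sign * x_i."""
    return tuple(sign if j == i else 0 for j in range(n))


def clique_polytope(k):
    """Fractional stable set polytope of the complete graph K_k."""
    cons = [(unit(i, k), 1) for i in range(k)]        # x_i <= 1
    cons += [(unit(i, k, -1), 0) for i in range(k)]   # -x_i <= 0
    for u, v in combinations(range(k), 2):            # x_u + x_v <= 1
        cons.append((tuple(int(j in (u, v)) for j in range(k)), 1))
    return cons


def basic_operator(constraints, n):
    """Hypothetical minimal operator M: add x_i <= 0 whenever x_i < 1 on the polytope."""
    verts = vertices(constraints, n)
    new = list(constraints)
    if verts:  # a bounded polytope attains max x_i at a vertex
        for i in range(n):
            if max(v[i] for v in verts) < 1:
                new.append((unit(i, n), 0))
    return new


k = 4
P = clique_polytope(k)
negated = (tuple(-1 for _ in range(k)), -2)  # sum_v x_v >= 2, written as -sum <= -2

# Indirect proof: apply M to Q = P ∩ {sum >= 2}. The polytope is bounded,
# so "no vertices" is the same as "empty".
indirect_empty = not vertices(basic_operator(P + [negated], k), k)
# Contrapositive-style test: apply M to P first, intersect afterwards.
direct_empty = not vertices(basic_operator(P, k) + [negated], k)

print(indirect_empty, direct_empty)
```

On this instance the indirect test reports an empty set after one round, while the intersect-afterwards test still contains the point $(1/2, 1/2, 1/2, 1/2)$, because the rule derives nothing from $P$ alone (every $x_i$ reaches $1$ on $P$).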
So what one can see from this example is that indirect proofs (at least in the context of cutting-plane proof systems) can derive strong valid inequalities in rather few rounds and outperform their direct counterpart drastically (constant number of rounds vs. log(n) rounds). However, a priori knowledge of what we want to prove is needed in order to apply the indirect proof paradigm. This makes it hard to exploit the power of indirect proofs in cutting-plane algorithms. After all, you need to know the “derivation” before you do the actual “derivation”. Nonetheless, in some cases we can use indirect proofs by guessing good candidates for strong valid inequalities and then verifying their validity using an indirect proof.
Check out the links for further reading: