Mathematics of the impossible: Computational Complexity

Thoughts 2023-01-20

I am teaching and writing some notes on complexity. I hope they will become a book, so they are organized as such. The notes will be serialized on this blog, and you can find the latest version of the book in pdf here, which has a better rendering of tables, pictures, comic strips, etc.


0.1 Conventions, choices, and caveats

I write this section before the work is complete, so some of it may change.

This book covers basic results in complexity theory and can be used for a course on the subject. At the same time, it is perhaps sui generis in that it also tells a story of the quest for impossibility results, includes some personal reflections, and makes some non-standard choices about topics and technical details. Some of this is discussed next.

To test your understanding of the material…

this book is interspersed with mistakes, some subtle, some blatant, some not even mistakes but worrying glimpses into the author’s mind. Please send all bug reports and comments to (my five-letter last name)@ccs.neu.edu to be included in the list of heroes.

The c notation.

The mathematical symbol c has a special meaning in this text. Every occurrence of c denotes a real number >0, and different occurrences may denote different numbers. There exist choices for these numbers such that the claims in this book are (or are meant to be) correct. This replaces, is more compact than, and is less prone to abuse than the big-Oh notation (where sloppiness hides inside the brackets).

Example 0.1. “For all sufficiently large n” can be written as n\ge c.

“For every \epsilon and all sufficiently large n” can be written as n\ge c_{\epsilon }.

The following are correct statements:

“It is an open problem to show that some function in NP requires circuits of size cn.” At the time of this writing, one can replace this occurrence of c with 5. Note that such a claim will remain true if someone proves a 6n lower bound; one just needs to “recompile” the constants in this book.

“c>1+c”, e.g. assign 3 to the first occurrence and 1 to the second.

“100n^{15}<n^{c}”, for all large enough n. Assign c=16.

The following are not true:

“c<1/n for every n”. No matter what we assign c to, we can pick n large enough that 1/n<c, falsifying the claim. Note the assignment to c is absolute, independent of n.

More generally, when subscripted this notation indicates a function of the subscript. There exist choices for these functions such that the claims in this book are (or are meant to be) correct. Again, each occurrence can indicate a different function.

For the reader who prefers the big-Oh notation, a quick and dirty fix is to replace every occurrence of c in this book with O(1).

The alphabet of TMs.

I define TMs with a fixed alphabet. This choice slightly simplifies the exposition (one parameter instead of two), while being more in line with common experience: it is more common to increase the length of a program than its alphabet. This choice affects the proof of Theorem ??, but it isn’t clear that the details are any worse.

Partial vs. total functions (a.k.a. promise problems).

Recall that promise problems offer the most direct way of formulating natural computational problems. […] In spite of the foregoing opinions, we adopt the convention of focusing on standard decision and search problems. [2]

I define complexity w.r.t. partial functions, whereas most texts consider total functions; that is, we consider computing functions whose domain may be an arbitrary set of strings rather than all strings. This is sometimes called “promise problems.” It affects many things, for example the hierarchy for BPTime (Exercise ??).

References and names.

I keep references in the main text to a minimum, just enough to avoid a long list later of items “Result X is due to Y,” and relegate discussion to the bibliographic notes. I have also decided not to spell out the names of authors, a practice that is increasingly awkward: central results, such as the PCP theorem, are co-authored by five or more people.

Polynomial.

It is customary in complexity theory to bound quantities by a polynomial, as in polynomial time, when in fact the only term that matters is the leading one. This also lends itself to confusion, since polynomials with many terms are useful for many other things. I use power instead of polynomial, as in power time.

Random-access machines.

I use “rapid-access machine” instead of “random-access machine,” keeping the acronym RAM. “Random access” also leads to strange expressions like “randomized random-access” [1].

Reductions.

These are presented as implications: “A reduces to B in P” means that B ∈ P implies A ∈ P.

Randomness and circuits.

While randomness and circuits are everywhere in current research, and seem to be on everyone’s mind, they are sometimes still relegated to later chapters, almost as an afterthought. This book starts with them right away and attempts to weave them through the narrative.

Data structures.

Their study, especially negative results, squarely belongs to complexity theory. Yet data structures are strangely omitted from common textbooks. Results on data structures even tend to bypass the main venues for complexity theory, landing instead in more algorithmic venues! We hope this book helps to reverse this trend.

Algorithms & Complexity

…are of course two sides of the same coin. The rule of thumb I follow is to present algorithms that are surprising, that challenge our intuition of computation, and that ideally match lower bounds, even if they may not be immediately deployed.

Exercises and problems.

Exercises are interspersed within the narrative and serve as “concept checks.” They are not meant to be difficult or new, though some are. Problems are collected at the end and tend to be harder and more original, though some are not.

Summary of some terminological and notational choices.

Here it is:




Some other sources            | this book                                        | acronym
O(1), \Omega(1)               | c                                                |
Turing machine                | tape machine                                     | TM
random-access machine         | rapid-access machine                             | RAM
polynomial time               | power time                                       | P
mapping reduction (sometimes) | “A reduces to B in P” means B ∈ P implies A ∈ P  |
Extended Church-Turing thesis | Power-time computability thesis                  |
pairwise independent          | pairwise uniform                                 |
FP, promise-P                 | P                                                |
TM with any alphabet          | TM with fixed alphabet                           |
classes have total functions  | classes have partial functions                   |

Copyright 2022-present by Emanuele Viola

Chapter 1 A teaser

Consider a computer with three bits of memory. There’s also a clock, beating 1, 2, 3, … In one clock cycle the computer can read one bit of the input and update its memory arbitrarily based on the value of the bit and the current memory, or stop and return a value.

Let’s give a few examples of what such a computer can do.

First, it can compute the And function on n bits:

Computing And of (x_1, x_2, …, x_n):

For i = 1, 2, … until n:
    Read x_i
    If x_i = 0 return 0
Return 1
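To make the model concrete, here is a minimal Python sketch of one possible formalization (the step/output representation below is only for illustration, not a formal definition): a program is a list of steps, each naming which input bit to read and how to update the 3-bit memory, followed by an output map applied to the final memory. Early stopping is omitted for simplicity, with the answer folded into the output map.

def run(steps, output, x):
    # Execute the program (steps, output) on the list of input bits x.
    mem = 0                            # 3 bits of memory, initially all 0
    for i, update in steps:            # one clock cycle per step
        mem = update(x[i], mem) & 0b111
    return output(mem)

def and_program(n):
    # The And program above, without early stopping:
    # memory bit 0 records "saw a 0 so far".
    steps = [(i, lambda b, m: m | (1 - b)) for i in range(n)]
    output = lambda m: 0 if m & 1 else 1
    return steps, output

steps, out = and_program(4)
print(run(steps, out, [1, 1, 1, 1]))   # prints 1
print(run(steps, out, [1, 0, 1, 1]))   # prints 0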

We didn’t really use the memory. Let’s consider a slightly more complicated example. A word is a palindrome if it reads the same both ways, like racecar, non, anna, and so on. Similarly, examples of palindrome bit strings are 11, 0110, and so on.

Let’s show that the computer can decide quickly, in n steps, if a given string is a palindrome:

Deciding if (x_1, x_2, …, x_n) is a palindrome:

For i = 1, 2, … until i > n/2:
    Read x_i and write it in memory bit m
    Read x_{n+1-i}: if m ≠ x_{n+1-i} return 0
Return 1
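In the illustrative Python formalization above, this program can be sketched as follows (again only a sketch: memory bit 0 caches the bit x_i, memory bit 1 records whether a mismatch was ever seen).

def palindrome_program(n):
    # Palindrome check: bit 0 caches x_i, bit 1 records any mismatch.
    steps = []
    for i in range(n // 2):
        # store x_i (0-indexed) in memory bit 0, preserving bit 1
        steps.append((i, lambda b, m: (m & 0b010) | b))
        # read the mirror bit; set memory bit 1 on a mismatch
        steps.append((n - 1 - i, lambda b, m: m | ((b != (m & 1)) << 1)))
    output = lambda m: 0 if m & 0b010 else 1
    return steps, output

steps, out = palindrome_program(6)
print(run(steps, out, [0, 1, 1, 1, 1, 0]))   # prints 1
print(run(steps, out, [0, 1, 1, 0, 1, 0]))   # prints 0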

That was easy. Now consider the Majority function on n bits, which is 1 iff the sum of the input bits is >n/2, and 0 otherwise. Majority, like any other function on n bits, can be computed on such a computer in time exponential in n. To do that, you do a pass over the input and check if it’s all zeroes, using (a small modification of) the program for And given above. If it is, return 0. If it is not, you do another pass, now checking if it’s all zeroes except that the last bit is 1. If it is, return 0. You continue this way until you have exhausted all of the roughly 2^{n}/2 inputs on which Majority equals 0. If you never returned 0, you can now safely return 1.
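Continuing with the illustrative formalization, here is a sketch of this brute-force strategy (one pass per input y on which Majority is 0, with memory bit 0 tracking a mismatch within the current pass and memory bit 1 recording whether some pass matched completely):

from itertools import product

def majority_bruteforce_program(n):
    # One pass per input y with Majority(y) = 0, comparing x against y.
    steps = []
    for y in product([0, 1], repeat=n):
        if sum(y) > n / 2:
            continue                   # skip inputs on which Majority is 1
        # first step of a pass: reset the mismatch bit, compare x_0 to y_0
        steps.append((0, lambda b, m, y0=y[0]: (m & 0b010) | (b != y0)))
        for i in range(1, n):
            # set the mismatch bit if x_i differs from the hard-coded y_i
            steps.append((i, lambda b, m, yi=y[i]: m | (b != yi)))
        # end of pass (dummy read): if no mismatch, x = y; record it in bit 1
        steps.append((0, lambda b, m: m | (0 if m & 1 else 2)))
    # x has Majority 1 iff it matched none of the Majority-0 inputs
    output = lambda m: 0 if m & 0b010 else 1
    return steps, output

steps, out = majority_bruteforce_program(5)
print(run(steps, out, [1, 0, 1, 1, 0]))   # prints 1 (three 1s out of five)
print(run(steps, out, [1, 0, 0, 1, 0]))   # prints 0 (two 1s out of five)

The number of steps is about (n+1) · 2^{n}/2, exponential in n, matching the discussion above.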

As we said, this works for any function, but it’s terribly inefficient. Can we do better for Majority? Can we compute it in time that is just a power of n?

Exercise 1.1. Convince yourself that this is impossible. Hint: If you start counting bits, you’ll soon run out of memory.

If you solved the exercise, you are not alone.

And yet, we will see the following shocking result:

Shocking theorem:

Majority can be computed on such a computer in time n^{c}.

And this is not a trick tailored to Majority. Many other problems, apparently much more complicated, can also be solved in the same time.

But, there’s something possibly even more shocking.

Shocking situation:

It is consistent with our state of knowledge that every problem with a “textbook algorithm” can be solved in time n^{c} on such a computer! Nobody can disprove that. (Textbook algorithms include sorting, max-flow, dynamic-programming algorithms like longest common subsequence, graph problems, numerical problems, etc.)

The Shocking theorem gives some explanation for the Shocking situation: it will be hard to rule out efficient programs in this model, since they are so powerful and counterintuitive. In fact, we will see later that this can be formalized. Basically, we will show that the model is so strong that it can compute functions that provably escape the reach of current mathematics… if you believe certain things, such as that it is hard to factor numbers. Here we enter some of the mysticism that surrounds complexity theory, where different beliefs and conjectures are pitted against each other in a battle for ground truth.

References

[1]   Dana Angluin and Leslie G. Valiant. Fast probabilistic algorithms for Hamiltonian circuits and matchings. J. Comput. Syst. Sci., 18(2):155–193, 1979.

[2]   Oded Goldreich. Computational Complexity: A Conceptual Perspective. Cambridge University Press, 2008.