Mathematics of the impossible, Chapter 7, Space

Thoughts 2023-03-24

As mentioned in Chapter 2, Time is only one of several resources we with to study. Another important one is space, which is another name for the memory of the machine. Computing under space constraints may be less familiar to us than computing under time constraints, and many surprises lay ahead in this chapter that challenge our intuition of efficient computation.

If only space is under consideration, and one is OK with a constant-factor slackness, then TMs and RAMs are equivalent; much like $\text {P}$ is invariant under power changes in time. In a sense, changing the space by a constant factor is like changing the time by a power; from this point of view the equivalence is not surprising.

We shall consider both space bounds bigger than the input length and smaller. For the latter, we have to consider the input separately. The machine should be able to read the input, but not write on its cells. One way to formalize this is to consider 2TMs, where one tape holds the input and is read-only. The other is a standard read-write tape.

We also want to compute functions $f$ whose output is more than $1$ bit. One option is to equip the machine with yet another tape, which is write-only. We prefer to stick to two tapes and instead require that given $x,i$ the $i$ output bit of $f(x)$ is computable efficiently.

Definition 7.1. A function $f:X\subseteq \{0,1\}^* \to \{0,1\}^*$ is computable in $\text {Space}(s(n))$ if there is a 2TM which on input $(x,i)$ on the first tape, where $x\in X$ and $i\le |f(x)|$ , outputs the $i$ bit of $f(x)$ , and the machine never writes on the first tape and never uses more than $s(|x|)$ cells on the second.

We define:

$\begin{aligned} \text {L}:= & \bigcup _{d}\text {Space}(d\log n),\\ \text {PSpace}:= & \bigcup _{d}\text {Space}(n^{d}). \end{aligned}$

We investigate next some basic relationship between space and time.

Theorem 7.1. For every functions $t$ and $s$ :

$\text {\ensuremath {k}TM-Time}(t)\subseteq \text {Space}(c_{k}t)$ ,
$\text {Time}(t)\subseteq \text {Space}(ct\log (nt))$ ,
$\text {Space}(s)\subseteq 2\text {TM-Time}(c^{s(n)}\cdot n^{c})$ .

Proof. 1. A TM running in time $t$ can only move its heads at most $t$ tape cells. We can write all these contents in one tape. To simulate one step of the $k\text {TM}$ we do a pass on all the contents and update them accordingly.

2. A RAM running in time $t$ can only access $\le t$ memory cells, each containing at most $c\log nt$ bits; the factor $n$ is to take into account that the machine starts with word length $\ge \log n$ . We simulate this machine and for each Write operation we add a record on the tape with the memory cell index and the content, similar to Theorem ??. When the RAM reads from memory, we scan the records and read off the memory values from one of them. If the record isn’t found, then the simulation returns the value corresponding to the initial configuration.

We also allocate $c\log nt$ bits for the registers of the RAM. It can be shown that the operations among them can be performed in the desired space, since they only involve a logarithmic number of bits. A stronger result is proved later in Theorem 7.5.

3. On an input $x$ with $|x|=n$ , a $\text {Space}(s)$ machine can be in at most $n^{c}\cdot c^{s(n)}$ configurations. The first $n^{c}$ factor is for the head position on the input. The second factor is for the contents of the second tape. Since the machine ultimately stops, configurations cannot repeat, hence the same machine (with no modification) will stop in the desired time. QED

A non-trivial saving is given by the following theorem.

Theorem 7.2. [12] $k\text {-TM-Time}(t)\subseteq \text {Space}(c_{k}t/\log t)$ , for every function $t$ .

This result is not specific to MTMs but can be proved for other models such as RAMs.

From Theorem 7.1 and the next exercise we have

$\begin{aligned} \text {L}\subseteq \text {P}\subseteq \text {NP}\subseteq \text {PH}\subseteq \text {PSpace}\subseteq \text {Exp}. \end{aligned}$

Exercise 7.1. Prove $\text {PH}\subseteq \text {PSpace}$ .

Just like for Time, for space one has universal machines and a hierarchy theorem. The latter implies $\text {L}\ne \text {PSpace}$ . Hence, analogously to the situation for Time and NTime (section º5.1), we know that at least one of the inclusions above between L and PSpace is strict. Most people seem to think that all are, but nobody can prove that any specific one is.

7.1 Branching programs

Branching programs are the non-uniform counterpart of $\text {Space}$ , just like circuits are the non-uniform counterpart of $\text {Time}$ .

Definition 7.2. A (branching) program is a directed acyclic graph. A node can be labeled with an input variable, in which case it has two outgoing edges labeled $0$ and $1$ . Alternatively a node can be labeled with $0$ or $1$ , in which case it has no outgoing edges. One special node is the start node.

The space of the program with $S$ nodes is $\log S$ . A program computes a function $f:\zonzo$ by following the path from the starting node, following edge labels corresponding to the input, and outputting $b\in \{0,1\}$ as soon as it reaches a node labeled $b$ .

Theorem 7.3. Suppose a 2TM $M$ computes $f:\zonzo$ in space $s$ . Then $f$ has branching programs of space $c_{M}(s+\log n)$ . In particular, any $f\in \text {L}$ has branching programs of size $n^{c_{f}}$ .

Proof. Each node in the program corresponds to a configuration. QED

Definition 7.3. The branching program given by Theorem 7.3 is called the configuration graph of $M$ .

The strongest available impossibility result for branching programs is the following.

Theorem 7.4. [16] There are explicit functions (in $\text {P}$ ) that require branching programs of size $cn^{2}/\log n$ on inputs of length $n$ .

7.2 The power of L

Computing with severe space bounds, as in L, seems quite difficult. Also, it might be somewhat less familiar than, say, computing within a time bound. It turns out that L is a powerful class capable of amazing computational feats that challenge our intuition of efficient computation. Moreover, these computational feats hinge on deep mathematical techniques of wide applicability. We hinted at this in Chapter 1. We now give further examples. At the same time we develop our intuition of what is computable with little space.

To set the stage, we begin with a composition result. In the previous sections we used several times the simple result that the composition of two maps in P is also in P. This is useful as it allows us to break a complicated algorithm in small steps to be analyzed separately – which is a version of the divide et impera paradigm. A similar composition result holds and is useful for space, but the argument is somewhat less obvious.

Lemma 7.1. Let $f_{1}:\{0,1\}^* \to \{0,1\}^*$ be in $\text {Space}(s_{1})$ and satisfy $|f_{1}(x)|\le m(|x|)$ for a function $m$ . Suppose $f_{2}:\{0,1\}^* \to \{0,1\}$ is in $\text {Space}(s_{2})$ .

Then the composed function $g$ defined as $g(x)=f_{2}(f_{1}(x))$ is computable in space $c(s_{2}(m(n))+s_{1}(n)+\log nm(n))$ .

In particular, if $f_{1}$ and $f_{2}$ are in L then $g$ is in L, as long as $m\le n^{d}$ for a constant $d$ .

Exercise 7.2. Prove this.

7.2.1 Arithmetic

A first example of the power of L is given by its ability to perform basic arithmetic. Grade school algorithms use a lot of space, for example they employ space $\ge n$ to multiply two $n$ -bit integers.

Theorem 7.5. The following problems are in L:

Addition of two input integers.
Iterated addition: Addition of any number of input integers.
Multiplication of two input integers.
Iterated multiplication: Multiplication of any number of input integers.
Division of two integers.

Iterated multiplication is of particular interest because it can be used to compute “pseudorandom functions.” Such objects shed light on our ability to prove impossibility results via the “Natural Proofs” connection which we will see in Chapter ??.

Proof of 1. in Theorem 7.5. We are given as input $x,y\in \mathbb {N}$ and an index $i$ and need to compute bit $i$ of $x+y$ . Starting from the least significant bits, we add the bits of $x$ and $y$ , storing the carry of 1 bit in memory. Output bits are discarded until we reach bit $i$ , which is output. QED

Exercise 7.3. Prove 2. and 3. in Theorem 7.5.

Proving 4. and 5. is more involved and requires some of those deep mathematical tools of wide applicability we alluded to before. Division can be computed once we can compute iterated multiplication. In a nutshell, the idea is to use the expansion

$\begin{aligned} \frac {1}{x}=\sum _{i\ge 0}(-1)^{i}(x-1)^{i}. \end{aligned}$

We omit details about bounding the error. Instead, we point out that this requires computing powers $(x-1)^{i}$ which is an example of iterated multiplication (and in fact is no easier).

So for the rest of this section we focus on iterated multiplication. Our main tool for this the Chinese-remaindering representation of integers, abbreviated CRR.

Definition 7.4. We denote by $\mathbb {Z}_{m}$ the integers modulo $m$ equipped with addition and multiplication (modulo $m$ ).

Theorem 7.6. Let $p_{1},...,p_{\ell }$ be distinct primes and $m:=\prod _{i}p_{i}$ . Then $\mathbb {Z}_{m}$ is isomorphic to $\mathbb {Z}_{p_{1}}\times \ldots \times \mathbb {Z}_{p_{\ell }}$ .

The forward direction of the isomorphism is given by the map

$x \in \mathbb {Z}_{m} \rightarrow (x \mod p_{1},x \mod p_{2},...,x \mod p_{\ell }).$

For the converse direction, there exist integers $e_{1},...,e_{\ell }\leq (p')^{c}$ , depending only on the $p_{i}$ such that the converse direction is given by the map

$(x\mod p_{1},x\mod p_{2},...,x\mod p_{\ell }) \to x:=\sum _{i=1}^{\ell }e_{i}\cdot (x\mod p_{i}).$

Each integer $e_{i}$ is $0$ mod $p_{j}$ for $j\not =i$ and is $1$ mod $p_{i}$ .

Example 7.1. $\mathbb {Z}_{6}$ is isomorphic to $\mathbb {Z}_{2}\times \mathbb {Z}_{3}$ . The equation $2+3=5$ corresponds to $(0,2)+(1,0)=(1,2)$ . The equation $2\cdot 3=6$ corresponds to $(0,2)+(1,0)=(0,0)$ . Note how addition and multiplication in CRR are performed in each coordinate separately; how convenient.

To compute iterated multiplication the idea is to move to CRR, perform the multiplications there, and then move back to standard representation. A critical point is that each coordinate in the CRR has a representation of only $c\log n$ bits, which makes it easy to perform iterated multiplication one multiplication at the time, since we can afford to write down intermediate products.

The algorithm is as follows:

Computing the product of input integers $x_{1},\ldots ,x_{t}$ . [6]

Let $\ell :=n^{c}$ and compute the first $\ell$ prime numbers $p_{1},p_{2},\ldots ,p_{\ell }$ .
Convert the input into CRR: Compute $(x_{1}\mod p_{1},\ldots ,x_{1}\mod p_{\ell }),\ldots ,(x_{t}\mod p_{1},\ldots ,x_{t}\mod p_{\ell })$ .
Compute the multiplications in CRR: $(\Pi _{i=1}^{t}x_{i}\mod p_{1}),\ldots ,(\Pi _{i=1}^{t}x_{i}\mod p_{\ell })$ .
Convert back to standard representation.

Exercise 7.4. Prove the correctness of this algorithm.

Now we explain how steps 1, 2, and 3 can be implemented in L. Step 4 can be implemented in L too, but showing this is somewhat technical due to the computation of the numbers $e_{i}$ in Theorem 7.6. However these numbers only depend on the input length, and so we will be able to give a self-contained proof that iterated multiplication has branching programs of size $n^{c}$ .

Step 1

By Theorem ??, the primes $p_{i}$ have magnitude $\le n^{c}$ and so can be represented with $c\log n$ bits. We can enumerate over integers with $\le c\log n$ bits in L. For each integer $x$ we can test if it’s prime by again enumerating over all integers $y$ and $z$ with $\le c\log n$ bits and checking if $x=yz$ , say using the space-efficient algorithm for multiplication in Theorem 7.5. (The space required for this step would in fact be $c\log \log n$ .)

Step 2

We explain how given $y\in \{0,1\} ^{n}$ we can compute $(y\mod p_{1},\ldots ,y\mod p_{\ell })$ in L. If $y_{j}$ is bit $j$ of $y$ we have that

Step 3

This is a smaller version of the original problem: for each $j\leq \ell$ , we want to compute $(\Pi _{i=1}^{t}x_{i}\mod p_{j})$ from $x_{1}\mod p_{j},\ldots ,x_{t}\mod p_{j}$ . However, as mentioned earlier, each $(x_{i}\mod p_{j})$ is at most $n^{c}$ in magnitude and thus has a representation of $c\log n$ bits. Hence we can just perform one multiplication at the time in L.

Step 4

By Theorem 7.6, to convert back to standard representation from CRR we have to compute the map

$\begin{aligned} (y\mod p_{1},\ldots ,y\mod p_{\ell })\to \sum _{i=1}^{\ell }e_{i}\cdot (y\mod p_{i}). \end{aligned}$

Assuming we can compute the $e_{i}$ , this is just multiplication and iterated addition, which are in L by Theorem 7.5.

Putting the steps together

Combining the steps together we can compute iterated multiplication in L as long as we are given the integers $e_{i}$ in Theorem 7.6.

Theorem 7.7. Given integers $x_{1},x_{2},\ldots ,x_{t}$ , and given the integers $e_{1},e_{2},\ldots ,e_{\ell }$ as in Theorem 7.6, where $\ell =n^{c}$ , we can compute $\prod _{i}x_{i}$ in L.

In particular, because the $e_{i}$ only depend on the input length, but not on the $x_{i}$ they can be hardwired in a branching program.

Corollary 7.1. Iterated multiplication has branching programs of size $n^{c}$ .

Exercise 7.5. Show that given integers $x_{1},x_{2},\ldots ,x_{t}$ and $y_{1},y_{2},\ldots ,y_{t}$ one can decide if

$\begin{aligned} \prod _{i=1}^{t}x_{i}=\prod _{i=1}^{t}y_{i} \end{aligned}$

in L. You cannot use the fact that iterated multiplication is in L, a result which we stated but not completely proved.

Exercise 7.6. Show that the iterated multiplication of $d\times d$ matrices over the integers has branching programs of size $n^{c_{d}}$ .

7.2.2 Graphs

We now give another example of the power of L.

Definition 7.5. The undirected reachability problem: Given an undirected graph $G$ and two nodes $s$ and $t$ in $G$ determine if there is a path from $s$ to $t$ .

Standard time-efficient algorithms to solve this problem mark nodes in the graph. In logarithmic space we can keep track of a constant number of nodes, but it is not clear how we can avoid repeating old steps forever.

Theorem 7.8. [20] Undirected reachability is in L.

The idea behind this result is that a random walk on the graph will visit every node, and can be computed using small space, since we just need to keep track of the current node. Then, one can derandomize the random walk and obtain a deterministic walk, again computable in L.

7.2.3 Linear algebra

Our final example comes from linear algebra. Familiar methods for solving a linear system

$\begin{aligned} Ax=b \end{aligned}$

can be done requires a lot of space. For example using elimination we need to rewrite the matrix $A$ . Similarly, we cannot easily compute determinants using small space. However, a different method exists.

Theorem 7.9. [9] Solving a linear system is computable in $\text {Space}(c\log ^{2}n)$ .

7.3 Checkpoints

The checkpoint technique is a fundamental tool in the study of space-bounded computation. Let us illustrate it at a high level. Let us consider a graph $G$ , and let us write $u\leadsto ^{t}v$ if there is a path of length $\le t$ from $u$ to $v$ . The technique allows us to trade the length of the path with quantifiers. Specifically, for any parameter $b$ , we can break down paths from $u$ to $v$ in $b$ smaller paths that go through $b-1$ checkpoints. The length of the smaller paths needs be only $t/b$ (assuming that $b$ divides $t$ ). We can guess the breakpoints and verify each smaller path separately, at the price of introducing quantifiers but with the gain that the path length got reduced from $t$ to $t/b$ . The checkpoint technique is thus an instantiation of the general paradigm of guessing computation and verifying it locally, introduced in Chapter 5. One difference is that now we are only going to guess parts of the computation.

The checkpoint technique $u\leadsto ^{t}v\Leftrightarrow \exists p_{1},p_{2},\ldots ,p_{b-1}:\forall i\in \{0,1,b-1\}:p_{i}\leadsto ^{t/b}p_{i+1},$ where we denote $p_{0}:=u$ and $p_{b}:=v$ .

An important aspect of this technique is that it can be applied recursively: We can apply it again to the problems $p_{i}\leadsto ^{t/b}p_{i+1}$ . We need to introduce more quantifiers, but we can reduce the path length to $t/b^{2}$ , and so on. We will see several instantiations of this technique, for various settings of parameters, ranging from $b=2$ to $b=n^{c}$ .

We now utilize the checkpoint technique to show a simulation of small-space computation by small-depth alternating circuits.

Theorem 7.10. A function computable by a branching program with $S$ nodes is also computable by an alternating circuit of depth $c\log _{b}S$ and size $S^{b\log _{b}S+c}$ , for any $b\le S$ .

To illustrate the parameters, suppose $S=n^{a}$ , and let us pick $b:=n^{\epsilon }$ where $n\ge c_{a,\epsilon }$ . Then we have circuits of depth $d:=ca/\epsilon$ and size $\le S^{n^{\epsilon }d+c}=2^{n^{c\epsilon }}$ . In other words, we can have depth $d$ and size $2^{n^{c_{a}/d}}$ , for every $d$ . This in particular holds for every function in $\text {L}$ . We will later give explicit functions (also in $\text {P}$ ) which cannot be computed by circuits of depth $d$ and size $2^{n^{c/d}}$ , “just short” of ruling out $\text {L}$ . This state of affairs is worth emphasis:

(1) Every $f$ in $\text {L}$ has alternating circuits of depth $d$ and size $2^{n^{c_{f}/d}}$ . (2) We can prove that there are explicit functions (also in L) which cannot be computed by circuits of depth $d$ and size $2^{n^{c/d}}$ . (3) Improving the constant in the double exponent for a function in P would yield $\text {L}\ne \text {P}$ . In this sense, the result in (2) is the best possible short of proving a major separation.

Proof. We apply the checkpoint technique to the branching program, recursively, with parameter $b$ . For simplicity we first assume that $S$ is a power of $b$ . Each application of the technique reduces the path length by a factor $b$ . Hence with $\log _{b}S$ applications we can reduce the path length to $1$ .

In one application, we have an $\exists$ quantifier over $b-1$ nodes, corresponding to an Or gate with fan-in $S^{b-1}$ , and then a $\forall$ quantifier over $b$ smaller paths, corresponding to an And gate with fan-in $b$ . This gives a tree with $S^{b-1}\cdot b\le S^{b}$ leaves. Iterating, the number of leaves will be

$\begin{aligned} (S^{b})^{\log _{b}S}. \end{aligned}$

Each leaf can be connected to the input bit on which it depends. The size of the circuit is at most $c$ times the number of leaves.

If $S$ is not a power of $b$ we can view the branching program as having $S'\le bS$ nodes where $S'$ is a power of $b$ . QED

The following is a uniform version of Theorem 7.10, and the proof is similar. It shows that we can speed up space-bounded computation with alternations.

Theorem 7.11. [17]Any $f\in L$ is also in $\Sigma _{c_{f}/\epsilon }\text {Time}(n^{\epsilon })$ , for any $\epsilon >0$ .

Proof. Let $G$ be the configuration graph of $f$ . Note this graph has $n^{c_{f}}$ nodes. We need to decide if the start configuration reaches the accept configuration in this graph within $t:=n^{c_{f}}$ steps.

We apply to this graph the checkpoint technique recursively, with parameter $b:=n^{\epsilon /2}$ . Each application of the technique reduces the path length by a factor $b$ . Hence with $c_{f}/\epsilon$ applications we can reduce the path length to

$\begin{aligned} \frac {t}{b^{c_{f}/\epsilon }}=\frac {n^{c_{f}}}{n^{c_{f}/2}}\le 1. \end{aligned}$

Each quantifier ranges over $b\log n^{c_{f}}=c_{f}n^{\epsilon /2}\log n\le n^{\epsilon }$ bits for large enough $n$ .

There remains to check a path of length $1$ , i.e., an edge. The endpoints of this edge are two configurations $u$ and $v$ which depend on the quantified bits. The machine can compute the two endpoints in time $\log ^{c}m$ where $m$ is the total number of quantified bits, using rapid access. Once it has $u$ and $v$ it can check if $u$ leads to $v$ in one step by reading one bit from the input. Note $m\le n^{\epsilon }\cdot c_{f}/\epsilon$ , so $\log ^{c}m\le c_{f}n^{\epsilon }$ . QED

We note that by the generic simulation of alternating time by small-depth circuits in Lemma 6.2, the above theorem also gives a result similar to Theorem 7.10.

7.4 Reductions: L vs. P

Again, we can use reductions to related the space complexity of problems. In particular we can identify the problems in P which have space-efficient algorithms iff every problem in P does.

Definition 7.6. A problem $f$ is P-complete if $f\in \text {P}$ and $f\in \text {L}\Rightarrow \text {L}=\text {P}$ .

Definition 7.7. The circuit value problem: Given a circuit $C$ and an input $x$ , compute $C(x)$ .

Theorem 7.12. Circuit value is P-complete.

Proof. Circuit value is in P since we can evaluate one gate at the time. Now let $g\in \text {P}$ . We can reduce computing $g$ on input $x$ to a circuit value instance, as in the simulation of TMs by circuits in Theorem ??. The important point is that this reduction is computable in L. Indeed, given an index to a gate in the circuit, we can compute the type of the gate and index to its children via simple arithmetic, which is in L by Theorem 7.5, and some computation which only depends on the description of the P-time machine for $g$ . QED

Definition 7.8. The monotone circuit value problem: Given a circuit $C$ with no negations and an input $x$ , compute $C(x)$ .

Exercise 7.7. Prove that monotone circuit value is P-complete.

Recall from section 7.2.3 that finding solutions to linear systems

$\begin{aligned} Ax=b \end{aligned}$

has space-efficient algorithms. Interestingly, if we generalize equalities to inequalities the problem becomes P complete. This set of results illustrates a sense in which “linear algebra” is easier than “optimization.”

Definition 7.9. The linear inequalities problem: Given a $d\times d$ matrix $A$ of integers and a $d$ -dimensional vector, determine if the system $Ax\le b$ has a solution over the reals.

Theorem 7.13. Linear inequalities is P-complete.

Proof. The ellipsoid algorithm shows that Linear inequalities is in P, but we will not discuss this classic result.

Instead, we focus on showing how given a circuit $C$ and an input $x$ we can construct a set of inequalities that are satisfiable iff $C(x)=1$ .

We shall have as many variables $v_{i}$ as gates in the circuit, counting input gates as well.

For an input gate $g_{i}=x_{i}$ add equation $v_{i}=x_{i}$ .

For a Not gate $g_{i}=\text {Not}(g_{j})$ add equation $v_{i}=1-v_{j}$ .

For an And gate $g_{i}=\text {And}(g_{j},g_{k})$ add equations $0\le v_{i}\le 1,v_{i}\le v_{j},v_{i}\le v_{k},v_{j}+v_{k}-1\le v_{i}$ .

The case of Or is similar, or can be dispensed by writing an Or using Not and And.

Finally, if $g_{i}$ is the output gate add equation $v_{i}=1$ .

We claim that in every solution to $Av\le b$ the value of $v_{i}$ is the value in $\{0,1\}$ of gate $g_{i}$ on input $x$ . This can be proved by induction. For input and Not gates this is immediate. For an And gate, note that if $v_{j}=0$ then $v_{i}=0$ as well because of the equations $v_{i}\ge 0$ and $v_{i}\le v_{j}$ . The same holds if $v_{k}=0$ . If both $v_{j}$ and $v_{k}$ are 1 then $v_{i}$ is $1$ as well because of the equations $v_{i}\le 1$ and $v_{j}+v_{k}-1\le v_{i}$ . QED

7.4.1 Nondeterministic space

Because of the insight we gained from considering non-deterministic time-bounded computation in section º5.1, we are naturally interested in non-deterministic space-bounded computation. In fact, perhaps we will gain even more insight, because this notion will really challenge our understanding of computation.

For starters, let us define non-deterministic space-bounded computation. A naive approach is to define it using the quantifiers from section º6.4, leading to the class $\exists \cdot \text {L}$ . This is an ill-fated choice:

Exercise 7.8. Prove $\exists \cdot \text {L}=\exists \cdot \text {P}$ .

Instead, non-deterministic space is defined in terms of non-deterministic TMs.

Definition 7.10. A function $f:\{0,1\}^* \to \{0,1\}$ is computable in $\text {NSpace}(s(n))$ if there is a two-tape TM which on input $x$ never writes on the first tape and never uses more than $s(n)$ cells on the second, and moreover:

1. The machine is equipped with a special “Guess” state. Upon entering this state, a guess bit is written on the work tape under the head.

2. $f(x)=1$ iff there exists a choice for the guess bits that causes the machine to output $1$ .

We define

$\begin{aligned} \text {NL}:=\bigcup _{d}\text {NSpace}(d\log n). \end{aligned}$

How can we exploit this non-determinism? Recall from section 7.2.2 that reachability in undirected graphs is in L. It is unknown if the same holds for directed graphs. However, we can solve it in NL.

Definition 7.11. The directed reachability problem: Given a directed graph $G$ and two nodes $s$ and $t$ , decide if there is a path from $s$ to $t$ .

Theorem 7.14. Directed reachability $\text {is in NL}.$

Proof. The proof simply amounts to guessing a path in the graph. The algorithm is as follows:

“On input $G,s,t$ , let $v:=s$ .

For $i=0$ to $|G|$ :

If $v=t$ , accept.

Guess a neighbor $w$ of $v$ . Let $v:=w$ .

If you haven’t accepted, reject.”

The space needed is $|v|+|i|=c\log |G|$ . QED

We can define NL completeness similarly to NP and P completeness, and have the following result.

Theorem 7.15. Directed reachability is NL-complete. That is, it is in NL and it is in L iff $\text {L}=\text {NL}$ .

Exercise 7.9. Prove this.

7.4.2 Nspace vs. time

Recall by definition $\text {Space}(s(n))\subseteq \text {NSpace}(s(n))$ . We showed $\text {Space}(s(n))\subseteq \text {Time}(n^{c}c^{s(n)})$ in Theorem 7.1. We can strengthen the inclusion to show that it holds even for non-deterministic space.

Theorem 7.16. $\text {NSpace}(s(n))\subseteq \text {Time}(n^{c}c^{s(n)})$ .

Proof. On input x, we compute the configuration graph $G$ of $M$ on input $x$ . The nodes are the configurations, and there is an edge from $u$ to $v$ if the machine can go from $u$ to $v$ in one step. Then we solve reachability on this graph in power time, using say breadth-first-search. QED

The next theorem shows that non-deterministic space is not much more powerful than deterministic space: it buys at most a square. Contrast this with the P vs. NP question! The best determistic simulation of NP that we know is the trivial $\text {NP}\subseteq \text {Exp}$ . Thus the situation for space is entirely different.

How can this be possible? The high-level idea, which was used already in Lemma 7.1, can be cast as follows:

Unlike time, space can be reused.

Theorem 7.17. [23] $\text {NSpace}(s(n))\subseteq \text {Space}(cs^{2}(n))$ , for every $s(n)\ge \log n$ . In particular, $\text {NPSpace}=\text {PSpace}$ .

Proof. We use the checkpoint technique with parameter $b=2$ , and re-use the space to verify the smaller paths. Let $N$ be a non-deterministic TM computing a function in $\text {NSpace}(s(n))$ . We aim to construct a deterministic TM $M$ that on input $x$ returns

$\begin{aligned} \text {Reach}(C_{\text {start}},C_{\text {accept}},c^{s(n)}), \end{aligned}$

where $\text {Reach}(u,v,t)$ decides if $v$ is reachable from $u$ in $\le t$ steps in the configuration graph of $N$ on input $x$ , and $C_{\text {start}}$ is the start configuration, $C_{\text {accept}}$ is the accept configuration, and $c^{s(n)}$ is the number of configurations of $N$ .

The key point is how to implement Reach.

$\text {Computing\text { Reach}\ensuremath {(u,v,t)}}$

For all “middle” configurations $m$

$\text {If both \text {Reach}\ensuremath {(u,m,t/2)=1} and \text {Reach}\ensuremath {(m,v,t/2)=1} then Accept.}$

Reject

Let $S(t)$ denote the space needed for computing $\text {Reach}(u,v,t)$ . We have

$\begin{aligned} S(t)\le cs(n)+S(t/2). \end{aligned}$

This is because we can re-use the space for two calls to Reach. Therefore, the space for $\text {Reach}(C_{\text {start}},C_{\text {accept}},c^{s(n)})$ is

$\begin{aligned} \le cs(n)+cs(n)+\ldots +cs(n)\le cs^{2}(n). \end{aligned}$

QED

Recall that we do not know if $\text {Ntime}(t)$ is closed under complement. It is generally believed not to be the case, and we showed that if it is then the PH collapses Exercise 6.3.

What about space? Is $\text {NSpace}(s)$ closed under complement? Theorem 7.17 shows $\text {NSpace}(s(n))\subseteq \text {Space}(cs^{2}(n))$ . So if $f\in \text {NSpace}(s(n))$ then the complement function $1-f$ is in $\text {Space}(cs^{2}(n))$ (and in particular in $\text {Space}(cs^{2}(n))$ ). Hence, up to a quadratic loss in space, non-deterministic space is closed under complement.

Can we avoid squaring the space?

Yes! This is weird!

Theorem 7.18. The complement of Path is in NL.

Proof. We want a non-deterministic 2TM that given $G,s,\text {and }t$ accepts if there is no path from $s$ to $t$ in $G$ .

For starters, suppose the machine has computed the number $C$ of nodes reachable from $s$ . The key idea is that there is no path from $s$ to $t$ iff there are $C$ nodes different from $t$ reachable from $s$ . Thus, knowing $C$ we can solve the problem as follows

Algorithm for deciding if there is no path from $s$ to $t$ , given $C$ :

$\text {Initialize \text {Count}=0; Enumerate over all nodes \ensuremath {v\ne t}}$

$\text {Guess a path from \ensuremath {s} of length \ensuremath {|G|}. If path reaches \ensuremath {v}, increase \text {Count} by \ensuremath {1} }$

If $\text {Count}=C$ Accept, else Reject.

There remains to compute $C$ .

Let $A_{i}$ be the nodes at distance $\le i$ from $s$ , and let $C_{i}:=|A_{i}|$ . Note $A_{0}=\{s\},c_{0}=1$ . We seek to compute $C=C_{n}$ .

To compute $C_{i+1}$ from $C_{i}$ , enumerate nodes $v$ (candidate in $A_{i+1}$ ). For each $v$ , enumerate over all nodes $w$ in $A_{i}$ , and check if $w\to v$ is an edge. If so, increase $C_{i+1}$ by $1$ .

The enumeration over $A_{i}$ is done guessing $C_{i}$ nodes and paths from $s$ . If we don’t find $C_{i}$ nodes, we reject. QED

7.5 TiSp

So far in this chapter we have focused on bounding the space usage. For this, the TM model was sufficient, as remarked at the beginning. It is natural to consider algorithms that operate in little time and space. For this, of course, whether we use TMs or RAMs makes a difference.

Definition 7.12. Let $\text {TiSp}(t,s)$ be the functions computable on a RAM that on every input $x\in \{0,1\} ^{n}$ runs in time $t(n)$ and only uses memory cells $0$ to $s(n)-1$ .

Exercise 7.10. Prove $\text {L}=\bigcup _{d}\text {TiSp}(n^{d},d)$ .

An alternative definition of TiSp would allow the RAM to access $s(n)$ cells anywhere in memory. One can maintain a data structure to show that this alternative definition is equivalent to Definition 7.12.

To illustrate the relationship between TiSp, Time, and Space, consider undirected reachability. It is solvable in Time $(n\log ^{c}n)$ by bradth-first search, and in logarithmic space by Theorem 7.8. But it isn’t known if it is in $\text {TiSp}(n\log ^{a}n,a\log n)$ for some constant $a$ .

Exercise 7.11. Prove that Theorem 7.11 holds even for any function $f$ in $\text {TiSp}(n^{a},n^{1-\epsilon })$ for any $a$ and $\epsilon >0$ .

The following is a non-uniform version of TiSp.

Definition 7.13. A branching program of length $t$ and width $W$ is a branching program where the nodes are partitioned in $t$ layers $L_{1},L_{2},\ldots ,L_{t}$ where nodes in $L_{i}$ only lead to nodes in $L_{i+1}$ , and $|L_{i}|\le W$ for every $i$ .

Thus $t$ represents the time of the computation, and $\log W$ the space.

Recall that Theorem 7.10 gives bounds of the form $\ge cn^{2}/\log n$ on the size of branching program (without distinguishing between length and width). For branching programd og length $t$ and width $W$ this bound gives $t\ge cn^{2}/W\log n$ . But it gives nothing for power width like $W=n^{2}$ . The state-of-the-art for power width is $t\ge \Omega (n\sqrt {\log n/\log \log n})$ [2, 7] (in fact the bound holds even for subexponential) .

With these definitions in hand we can refine the connection between branching programs and small-depth circuits in Theorem 7.10 for circuits of depth 3.

Theorem 7.19. Let $f:\{0,1\} ^{n}\to \{0,1\}$ be computable by a branching program with width $W$ and time $t$ . Then $f$ is computable by an alternating depth-3 circuit with $\le 2^{c\sqrt {t\log W}}$ wires.

We will later show explicit functions that require depth-3 circuits of size $2^{c\sqrt {n}}$ . Theorem 7.19 shows that improving this would also improve results for small-width branching programs, a refinement of the message emphasized before after Theorem 7.10.

Exercise 7.12. Prove this.

7.6 Notes

For a discussion of the complexity of division, see [3]. For a compendium of P-complete problems see [11].

References

[1] Mikl≤s Ajtai. $\Sigma \sp {1}\sb {1}$ -formulae on finite structures. Annals of Pure and Applied Logic, 24(1):1–48, 1983.

[2] Mikl≤s Ajtai. A non-linear time lower bound for boolean branching programs. Theory of Computing, 1(1):149–176, 2005.

[3] Eric Allender. The division breakthroughs. Bulletin of the EATCS, 74:61–77, 2001.

[4] Noga Alon, Oded Goldreich, Johan Hσstad, and RenΘ Peralta. Simple constructions of almost $k$ -wise independent random variables. Random Structures & Algorithms, 3(3):289–304, 1992.

[5] Kenneth E. Batcher. Sorting networks and their applications. In AFIPS Spring Joint Computing Conference, volume 32, pages 307–314, 1968.

[6] Paul Beame, Stephen A. Cook, and H. James Hoover. Log depth circuits for division and related problems. SIAM J. Comput., 15(4):994–1003, 1986.

[7] Paul Beame, Michael Saks, Xiaodong Sun, and Erik Vee. Time-space trade-off lower bounds for randomized computation of decision problems. J. of the ACM, 50(2):154–195, 2003.

[8] Stephen A. Cook. A hierarchy for nondeterministic time complexity. J. of Computer and System Sciences, 7(4):343–353, 1973.

[9] L. Csanky. Fast parallel matrix inversion algorithms. SIAM J. Comput., 5(4):618–623, 1976.

[10] Aviezri S. Fraenkel and David Lichtenstein. Computing a perfect strategy for n x n chess requires time exponential in n. J. Comb. Theory, Ser. A, 31(2):199–214, 1981.

[11] Raymond Greenlaw, H. James Hoover, and Walter Ruzzo. Limits to Parallel Computation: P-Completeness Theory. 02 2001.

[12] John E. Hopcroft, Wolfgang J. Paul, and Leslie G. Valiant. On time versus space. J. ACM, 24(2):332–337, 1977.

[13] Richard M. Karp and Richard J. Lipton. Turing machines that take advice. L’Enseignement MathΘmatique. Revue Internationale. IIe SΘrie, 28(3-4):191–209, 1982.

[14] Leonid A. Levin. Universal sequential search problems. Problemy Peredachi Informatsii, 9(3):115–116, 1973.

[15] Joseph Naor and Moni Naor. Small-bias probability spaces: efficient constructions and applications. SIAM J. on Computing, 22(4):838–856, 1993.

[16] E. I. Nechiporuk. A boolean function. Soviet Mathematics-Doklady, 169(4):765–766, 1966.

[17] Valery A. Nepomnjaščiĭ. Rudimentary predicates and Turing calculations. Soviet Mathematics-Doklady, 11(6):1462–1465, 1970.

[18] NEU. From RAM to SAT. Available at http://www.ccs.neu.edu/home/viola/, 2012.

[19] Wolfgang J. Paul, Nicholas Pippenger, Endre SzemerΘdi, and William T. Trotter. On determinism versus non-determinism and related problems (preliminary version). In IEEE Symp. on Foundations of Computer Science (FOCS), pages 429–438, 1983.

[20] Omer Reingold. Undirected connectivity in log-space. J. of the ACM, 55(4), 2008.

[21] J. M. Robson. N by N checkers is exptime complete. SIAM J. Comput., 13(2):252–267, 1984.

[22] Rahul Santhanam. On separators, segregators and time versus space. In IEEE Conf. on Computational Complexity (CCC), pages 286–294, 2001.

[23] Walter J. Savitch. Relationships between nondeterministic and deterministic tape complexities. Journal of Computer and System Sciences, 4(2):177–192, 1970.

[24] Michael Sipser. A complexity theoretic approach to randomness. In ACM Symp. on the Theory of Computing (STOC), pages 330–335, 1983.

[25] Seinosuke Toda. PP is as hard as the polynomial-time hierarchy. SIAM J. on Computing, 20(5):865–877, 1991.

[26] Leslie G. Valiant and Vijay V. Vazirani. NP is as easy as detecting unique solutions. Theor. Comput. Sci., 47(3):85–93, 1986.

[27] N. V. Vinodchandran. A note on the circuit complexity of PP. Theor. Comput. Sci., 347(1-2):415–418, 2005.

[28] Emanuele Viola. On approximate majority and probabilistic time. Computational Complexity, 18(3):337–375, 2009.