Progress on the Frontier
Gödel’s Lost Letter and P=NP 2018-03-12
An almost exponential improvement in bounds against ACC
Source from previous paperCody Murray is a PhD student of Ryan Williams at MIT. He and Ryan have a new paper that greatly improves Ryan’s separation of nonuniform circuits from uniform nondeterministic time classes. The previous best separation was from , that is, nondeterministic time . The new one is from , which is nondeterministic time . The ‘Q’ here means “quasi-polynomial,” not “quantum.”
Today we discuss the new ideas that led to this breakthrough on a previous breakthrough.
Since I live in Buffalo and my hometown Buffalo Bills are named for the frontiersman and showman William Cody, I can’t help making Wild West associations. The complexity frontier has now existed for over 60 years, a fourth of United States history, longer than the time from the Civil War to Arizona becoming the last contiguous state in 1912. So much is still unknown about it that any new avenue of progress commands attention.
The main idea calls to mind the classic movie The Searchers. In the movie the two principals are searching for an abducted small child. Here we are searching for a witness string to an -type predicate that is small in the following sense: it is the truth table of a small circuit . In the first applications using , the length of was exponential in the length of the input and the size of , while polynomial in , was (poly-)logarithmic in . In the new paper, all of the auxiliary size functions are related by operations under which polynomials are closed.
The small nature of the witness depends on strong hypotheses that the proof is ultimately trying to contradict. Ryan’s previous results used the powerful hypothesis . The present result starts by supposing . What we wish to emphasize first are the clever alterations to previous techniques that enable sweeping over the vast expanses of strings in a new way to reveal a new unconditional lower bound.
The “Almost Almost Everywhere” Hardness Condition
Let be a complexity function. We will think of as polynomial or quasi-polynomial but the reasoning works clear up to . With in mind, say that a language is “easy at length '' if there exists an -input circuit of size such that for all , . Say is “hard at '' otherwise. Consider the following conditions:
- There are infinitely many such that is hard at .
- For some polynomial and all , there is an such that and is hard at .
- For all but finitely many , is hard at .
When encodes a natural problem like , we might expect its hardness not to fluctuate with , so that these conditions have equal force. When (like ) is downward self-reducible, easiness at lengths just below implies easiness at . The easiness does slip to in place of , where is the number of queries in the self-reduction, but this still supports the intuition that easiness and hardness should be evenly spread.
To get a language that meet the third criterion, we can loop over Boolean functions with inputs until we find that has no circuit of size . Then define:
Almost-everywhere hardness may fail technically if, say, the encoding of as uses no odd-length strings, so that is trivially easy at odd . We could fill in the gap by using as the encoding, but this is ad hoc and might not work for other kinds of gaps. We could instead craft notions of being hard on a polynomially dense set , strengthening condition 2 by making easy to decide and denser. Against this backdrop the new “a.a.e.” condition used in the paper is short and neat:
Definition 1 is almost-almost-everywhere hard if there is a polynomial such that for all , either is hard at or is hard at .
We may relax to be a more general function, such as a polynomial composed with other bounds. We assume all bounds in question are increasing functions that are time-constructible or space-constructible as required.
Besides the diagonal language , the paper uses a special -complete set credited to Rahul Santhanam drawing on work by Luca Trevisan and Salil Vadhan. is downward self-reducible, paddable in that for all , and autoreducible in the following sense: there is a probabilistic polynomial-time oracle TM that on inputs makes only queries of length , always accepts when and is the oracle, and when rejects with probability at least 2/3 given any oracle. (The last clause prevents simply having query .)
The First Lower Bound
The new lower bound really comes from a new kind of upper bound using “Merlin-Arthur with advice.” A predicate related to a language has the Merlin-Arthur property if (informally speaking):
- If , then there exists a such that for most , holds.
- If , then there are no or such that holds.
One reason this double-barreled quantification enters the scene is that we still don’t know how to separate from , but we do know that the exponential-time Merlin-Arthur class is not in . The new tweak involves adding a quantifier but will allow dropping the time. It comes from exempting a small piece of from the second (“soundness”) condition, where depends only on the length of :
Definition 2 belongs to if there is a predicate decidable in time such that for all there is a string of length such that for all :
The point is that the condition for is allowed to fail for other strings . The string will in fact either be from the a.a.e. definition or will give the maximum length at which the language mentioned above has circuits of some size depending only on (so exists). Now we can state the lower bound, importing what we’ve said about and previously and just now:
Theorem 3 There are constants such that for all as above, and any auxiliary functions such that , , and , we can construct a language that is a.a.e.-hard.
The proof blends the constructions of and mentioned in the previous section. In particular, it uses the manner in which reduces to under appropriate padding. The reduction maps any string of length to a padded string of length in time. Note that the oracle TM that comes with obeys a kind of condition. We have all the pieces we need to carry out the diagonalization needed for a.a.e.-hardness while obtaining a Merlin-Arthur protocol with advice that works as follows on any input , :
- Advice unless is smaller.
- Merlin guesses a circuit of size with inputs.
- Arthur runs for a selected string :
- In the case , let , , and take . By the way is defined above and , this slice has no -sized circuits.
- In the case we parse as and put as before. If then reject, else we can take .
First note that we saved up to time by not computing . The latter case uses the magnitude not the length of in the padding. The proof analyzes the cases.
In the former case, because of how is defined with padding and , there is a circuit of size that decides at length , so Merlin can guess it. In the latter case, Merlin guesses for the length- slice of directly. The property of ensures that for all and the appropriate , either there is a leading Arthur to accept always or all make Arthur reject with probability at least 2/3, so the set of giving the former case defines a language in with the stated time and advice bounds.
The a.a.e. hardness follows from how the protocol either yields the length- slice of or implicitly maps the length- slice of it under the reduction to . In fact, the argument is more delicate and starts by negating both sides of the “a.a.e.” hardness definition for sake of contradiction. For that we refer to the paper.
The Easy-Witness Theorem
Let denote the “easy-witness” class of languages such that for all witness predicates for that are decidable in time , and all , there is a circuit of size whose graph is a string such that holds. It doesn’t much matter whether we restrict witnesses to have length a power of 2 or let the graph be for some . Let denote the class of languages with (nonuniform) circuits of size .
Theorem 4 There are universal constants and such that for every and time function , where :
The two triple compositions of (which is called in the paper) foreshadow the proof being a three-day ride. The proof again works by contradiction, and it helps that the negation of the “easy-witness” condition is concrete and helpful: it gives a verifier and an such that there are of length at most giving but none with small circuit complexity. In fact, we get this for infinitely many . The proof finally applies a theorem of Chris Umans that constructs a polynomial-time pseudorandom generator such that whenever has circuit complexity not less than and , all circuits of size give:
where is the output length of in terms of and the length of . This is where the constant comes in, while can be taken as divided by the exponent in the running time of . The generator is used to de-randomize the protocol. This yields a nondeterministic TM whose running time violates an application of the nondeterministic time hierarchy theorem, producing the desired contradiction.
The Roundup
The horses driving the final results come from various families of circuits, which we may suppose obey some simple closure properties. The nub is the speed with which a nondeterministic TMs can distinguish -input circuits in the following sense:
- If computes the all-zero function then some paths of say is “white” while all others say “another color.”
- If gives output for at least arguments then some paths say “black” while all others say “another color.”
- never has some paths that say “white” and others that say “black.”
Note that distinguishing the all-zero and quarter-dense cases is easy to do randomly in basically time, which converts to deterministic time under certain de-randomizing hypotheses. We only need to achieve this nondeterministically with a little advantage over the brute-force time (which cycles through assignments). The main theorem works for any such that :
Theorem 5
- If -input -circuits of size can be nondeterministically distinguished in time, then there is a such that for all , does not have size- circuits in .
- If -input -circuits of size can be nondeterministically distinguished in time, then for all there is a such that does not have size- circuits in .
The proof applies the easy-witness theorem to a particular verifier constructed by Eli Ben-Sasson and Emanuele Viola, and its easy witnesses lead the charge. By distinguishing the adversaries’ horse colors they lift their cover of darkness and drive them to a final contradiction shootout in the gulch of the nondeterministic time hierarchy theorem. In terms of the first statement’s target time happens to be , which is polynomial in , while the second statement’s time in terms of is , which is quasi-polynomial in .
Note that the second statement pulls a fast one: the order of quantifying and is switched. The gun it draws, however, was already in plain sight from earlier work by Ryan. Let denote circuits plus one layer of threshold gates at the inputs:
Theorem 6 For all there is an such that given circuits of modulus , depth , and size , whether they compute the all-zero function can be decided deterministically in time .
The sheriff holding this gun rides all the circuits out of the highlands of . And if a gunslinger NTM can be found to enforce the first clause in Theorem 5, a trail may open up for showing .
Open Problems
What other consequences follow from this march into new lower-bound territory? Already these movies are serials.
[fixed a subscript, m-vs-n in condition 2, and last sentence before “Open Problems”; fixed LaTeX in algorithm case; fixed Theorem 6.]