Discrepancy Games and Sensitivity
Gödel’s Lost Letter and P=NP 2019-08-20
Can we connect the talks that closed this month’s Random Structures and Algorithms conference?
[Joel Spencer; photo cropped from his NYU homepage]

Joel Spencer gave the closing talk of last week's Random Structures and Algorithms conference at ETH Zurich.
Today we discuss his talk and the one that preceded it, which was by Hao Huang on his proof this month of the Boolean sensitivity conjecture.
Spencer's talk was titled "Four Discrepancies" and based on a joint paper with Nikhil Bansal. The main new result in the talk was a case where a bound of $\sqrt{n \log n}$ arising from reasoning about the normal distribution can, surprisingly, be improved to a sharp $\Theta(\sqrt{n})$.
We will talk about this first, but then progress to the talk that preceded Spencer's. It was by Hao Huang, who was extended an invitation right after he announced his proof of the Boolean Sensitivity Conjecture earlier this month. Our ulterior purpose is to ask whether any concrete connections can be found besides both talks addressing problems on the hypercube.
Discrepancy Games
First suppose you get a stream of single bits $b_1, b_2, b_3, \dots$, each $+1$ or $-1$. For each bit, you can say "Keep it" or "Flip it." In the latter case, your bit is $-b_i$. Your goal is to keep the sum within a bounded range. OK, this is easy: you can make the sum always be $1$ when $i$ is odd and $0$ when $i$ is even by flipping when needed.
Now suppose the bits come in pairs $(b, b') \in \{+1, -1\}^2$ and your goal is to keep the sum in each coordinate bounded. Again there is a simple strategy: There are four kinds of vectors. For each kind, every time you see it for the second time, flip it. That keeps the contribution of each kind within $1$ in each coordinate. Thus simple reasoning says each coordinate stays within $4$.
The trouble with extending this idea to vectors of length $n$ is that the number of kinds is $2^n$ and our simple-minded bounds, while constant in terms of the number of vectors given, are exponential in $n$. Well, we can certainly do better. Going back to the case $n = 2$, we can maintain a policy of never letting both coordinates become $2$ or both become $-2$. This keeps both sums within $2$ in absolute value regardless of the sequence of the vectors, as the sketch below shows. But for larger vector lengths $n$, how large can the discrepancy among the coordinate sums—relative to zero or to each other—become?
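To make the warm-up concrete, here is a small Python sketch of the $n = 2$ game; it is our illustration, not anything from the talk. A simple greedy rule (choose whichever of "keep" or "flip" leaves the largest coordinate sum smaller in absolute value) turns out to implement the stated policy, and the search below checks the invariant against every possible next vector, that is, against an adaptive adversary:

```python
from itertools import product

def greedy_sign(state, v):
    """Return +1 ("keep") or -1 ("flip"), whichever leaves the
    largest absolute coordinate sum smaller."""
    score = lambda e: max(abs(s + e * x) for s, x in zip(state, v))
    return 1 if score(1) <= score(-1) else -1

# Explore every state reachable under the greedy rule, trying every
# possible next vector in {-1,+1}^2 (an adaptive adversary).
seen, frontier = set(), {(0, 0)}
while frontier:
    state = frontier.pop()
    seen.add(state)
    for v in product((-1, 1), repeat=2):
        e = greedy_sign(state, v)
        nxt = tuple(s + e * x for s, x in zip(state, v))
        assert max(map(abs, nxt)) <= 2   # the claimed invariant
        if nxt not in seen and nxt not in frontier:
            frontier.add(nxt)
print(f"checked {len(seen)} reachable states; sums stay within [-2, 2]")
```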
A final help is if we know the number $T$ of vectors in any sequence we might be given is bounded. In fact, Spencer's paper with Bansal first considers the case $T = n$. There are two main questions about randomness:
- Does it matter whether the sequence of vectors given is random or worst-case?
- Can you do better than choosing “keep” or “flip” randomly?
Long back, Spencer proved that if you can see the whole sequence of $n$ vectors in advance, then you can always keep the sums within $6\sqrt{n}$, whatever the sequence. If not—if you must decide "keep" or "flip" for each vector before seeing the next—then Spencer had also proved:
Theorem 1 If an adversary can choose the next vector based on your current sums, then the best bound $K$ such that you can keep all your sums within absolute value $K$ grows as $\Theta(\sqrt{n \log n})$. Moreover, up to the constant in the "$\Theta$", with high probability you cannot do any better than choosing the sign for each vector randomly.
See also his famous book, The Probabilistic Method, with Noga Alon. The new theorem is:
Theorem 2 If the adversary presents vectors uniformly at random, then you can achieve $K = O(\sqrt{n})$ with high probability—and not by choosing the signs randomly.
The algorithm defines a kind of potential function and has you play "keep" or "flip" according to which choice gives the lesser growth in potential; a toy rendition appears below. The analysis is quite complicated. As usual we refer to the paper for details. Instead we ask: are there insights to mine here for further advances on the Boolean sensitivity problem?
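For flavor, here is a toy Python rendition; it is our illustration, not the algorithm from the paper, whose potential function and analysis are more delicate. We greedily minimize a hyperbolic-cosine potential $\Phi(s) = \sum_i \cosh(\lambda s_i)$, with the scale $\lambda = 1/\sqrt{n}$ chosen heuristically, and compare against purely random signs on uniformly random vectors:

```python
import math, random

def run(n, T, rule, seed=0):
    """Feed T uniformly random vectors from {-1,+1}^n to a signing rule;
    return the largest absolute coordinate sum ever seen."""
    rng = random.Random(seed)
    s, worst = [0] * n, 0
    for _ in range(T):
        v = [rng.choice((-1, 1)) for _ in range(n)]
        eps = rule(s, v, rng)
        s = [si + eps * vi for si, vi in zip(s, v)]
        worst = max(worst, max(abs(x) for x in s))
    return worst

def random_rule(s, v, rng):
    return rng.choice((-1, 1))

def potential_rule(lam):
    def rule(s, v, rng):
        # pick the sign giving the smaller potential sum_i cosh(lam * s_i)
        phi = lambda e: sum(math.cosh(lam * (si + e * vi))
                            for si, vi in zip(s, v))
        return 1 if phi(1) <= phi(-1) else -1
    return rule

n, T = 64, 6400
print("random signs  :", run(n, T, random_rule))
print("potential rule:", run(n, T, potential_rule(1 / math.sqrt(n))))
```

In our runs the potential rule keeps every coordinate sum an order of magnitude smaller than random signing does, in the spirit of Theorem 2, though this toy version comes with no guarantee.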
Boolean Sensitivity
We didn’t talk about Boolean sensitivity in our previous post on it. Our purpose now is to convey how many concepts are related, hence how they might connect to discrepancy on the hypercube.
To get the idea of sensitivity, first consider the parity function $f(x_1, \dots, x_n) = x_1 \oplus \cdots \oplus x_n$. If you change one bit of any argument $x$ you change the value of $f$. Define $x^i$ to mean $x$ with bit $i$ flipped. Then for any $x$ and $i$, $f(x^i) \neq f(x)$. This means parity is extremely sensitive.
The OR function is intuitively less sensitive, but it too has an argument $x$ such that $f(x^i) \neq f(x)$ for all $i$, namely $x = 0^n$. For any Boolean function $f$ define its sensitivity (at $x$) by

$$s(f, x) = |\{\, i : f(x^i) \neq f(x) \,\}|, \qquad s(f) = \max_x s(f, x).$$
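For small $n$ the definition can be checked by brute force. Here is a short Python sketch (our illustration), representing inputs as $n$-bit masks; it confirms that parity and OR both have sensitivity $n$:

```python
def sensitivity(f, n):
    """s(f): max over inputs x of how many single-bit flips change f(x).
    Inputs are n-bit masks; bit i of x is (x >> i) & 1."""
    vals = [f(x) for x in range(1 << n)]          # truth table of f
    return max(sum(vals[x ^ (1 << i)] != vals[x] for i in range(n))
               for x in range(1 << n))

parity = lambda x: bin(x).count("1") % 2
OR     = lambda x: int(x != 0)
print(sensitivity(parity, 4))   # 4: every bit is sensitive at every input
print(sensitivity(OR, 4))       # 4: every bit is sensitive at the all-0 input
```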
The OR-of-AND function is less sensitive. Say it is an OR of $m$ blocks, each an AND of $k$ variables, and the blocks use disjoint variables so $n = mk$. If $f(x) = 1$, then some block is all $1$s, so flipping any bit outside that block makes no difference, and the most sensitive we can get is $k$. If $f(x) = 0$ then the most sensitive case is when all blocks have exactly one $0$. So $s(f) = \max(k, m)$. When $m = k = \sqrt{n}$, $s(f) = \sqrt{n}$.
If we make each block of $k$ variables return true if exactly one bit is $1$, then we again get sensitivity $n$ on the all-$0$ assignment. Now, however, consider the related function $g$ (with $k$ even, $k = 2\ell$) that is still an OR over blocks, but each block is true when some consecutive pair of its bits are both $1$ with all other pairs being both $0$. Then we can't do the same trick with the all-$0$ assignment changing just one bit. So when $m = k = \sqrt{n}$ we again get sensitivity (no more than) $\sqrt{n}$.
We can, however, consider $g$ to be more sensitive if we can flip more than one bit at a time. Partition $[n]$ into blocks of two consecutive bit-places each, and given any $B \subseteq [n]$, define $x^B$ to be the result of flipping the bits in $B$. We get $n/2$ blocks, and the all-$0$ assignment becomes a true case of $g$ if any block is flipped. Generally define the block sensitivity $bs(f)$ by considering any partitions of $[n]$ into disjoint subsets $B_1, \dots, B_r$ and writing

$$bs(f, x) = \max_{B_1, \dots, B_r} |\{\, j : f(x^{B_j}) \neq f(x) \,\}|, \qquad bs(f) = \max_x bs(f, x).$$
Note that not every member of the partition has to flip the function—we can discard the ones that don't flip and count only the disjoint subsets that do flip the value. So back to our example, we have

$$bs(g) \geq \frac{n}{2} = \frac{s(g)^2}{2}.$$
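We can witness this bound by brute force. The sketch below (ours; it reuses `sensitivity` from above) encodes one natural reading of the function $g$ with $m = k = 4$, so $n = 16$, verifies $s(g) = 4 = \sqrt{n}$, and exhibits the $n/2$ disjoint pairs as sensitive blocks at the all-$0$ input:

```python
def make_g(m, k):
    """OR of m blocks of k bits (k even); a block accepts iff exactly one
    consecutive pair of its bits is 11 and all its other bits are 0."""
    def block_true(bits):                  # bits: k-bit mask of one block
        pairs = [(bits >> (2 * j)) & 3 for j in range(k // 2)]
        return bin(bits).count("1") == 2 and 3 in pairs
    return lambda x: int(any(block_true((x >> (i * k)) & ((1 << k) - 1))
                             for i in range(m)))

m = k = 4
n = m * k
g = make_g(m, k)
print("s(g) =", sensitivity(g, n))         # 4 = sqrt(n)
# At x = 0^n, flipping any of the n/2 disjoint consecutive pairs makes
# some block accept, so each pair is a sensitive block:
assert all(g(0b11 << (2 * i)) != g(0) for i in range(n // 2))
print("bs(g) >=", n // 2, "= s(g)^2 / 2")  # the quadratic separation
```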
Andris Ambainis and Xiaoming Sun improved the constant from $\frac{1}{2}$ asymptotically to $\frac{2}{3}$, but their relation is still quadratic.
Some Boolean Complexity Connections
This example of quadratic discrepancy is still the best known lower bound on $bs(f)$ in terms of $s(f)$. But no one had proved anything better than an exponential upper bound until Huang's result, from which it follows that:

Theorem 3 For all Boolean functions $f$, $bs(f) \leq 2\,s(f)^4$.
This bound is concrete, not just asymptotic. It still leaves a gap between quadratic and quartic. It is, however, the combination of two quadratic upper bounds. One was shown by Nisan and Szegedy in their paper:

$$bs(f) \leq 2\deg(f)^2,$$
where $\deg(f)$ means the degree of the unique multilinear real polynomial that agrees with $f$ on the cube with $1$ for true. The other is the conjecture

$$\deg(f) \leq s(f)^2$$

made in a 1992 paper by Craig Gotsman and Nati Linial.
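The degree can be computed mechanically for small $n$: by Möbius inversion, the coefficient of the monomial $\prod_{i \in S} x_i$ is $c_S = \sum_{T \subseteq S} (-1)^{|S| - |T|} f(T)$. A Python sketch in the same bitmask style as above (our illustration):

```python
def degree(f, n):
    """deg(f): degree of the unique multilinear real polynomial that
    agrees with f on {0,1}^n, found via Moebius inversion."""
    deg = 0
    for S in range(1 << n):                # S is a monomial, as a bitmask
        c, T = 0, S
        while True:                        # enumerate all subsets T of S
            c += (-1) ** bin(S ^ T).count("1") * f(T)
            if T == 0:
                break
            T = (T - 1) & S
        if c != 0:
            deg = max(deg, bin(S).count("1"))
    return deg

AND    = lambda x: int(x == 0b111)
parity = lambda x: bin(x).count("1") % 2
print(degree(AND, 3))     # 3: the polynomial is x1*x2*x3
print(degree(parity, 3))  # 3: parity has full degree over the reals
```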
Huang proved this by exploiting a connection to graphs that was also shown by Gotsman and Linial. Consider any red-or-white coloring of the nodes of the hypercube, let $G$ be the graph induced by the red nodes, and define $g(x) = 1$ if node $x$ is red, $g(x) = 0$ otherwise. Now if the maximum degree of a node in $G$ is small then every red node has many white neighbors, so the Boolean function $g$ is very sensitive. However, going to each neighbor in the hypercube flips the parity. Hence the function

$$h(x) = g(x) \oplus x_1 \oplus \cdots \oplus x_n$$
is not very sensitive. Moreover, it has the same sensitivity as the function $h'$ obtained in the same way from the function $g'$ that is true on the white nodes. This nice duality between $G$ and the graph induced by the white nodes enables us to fix "$G$" to mean whichever of the two has more nodes in the following theorem statement:
Theorem 4 Provided $m > 2^{n-1}$, every graph $G$ induced by $m$ nodes of the $n$-cube has $\Delta(G) \geq \sqrt{n}$, if and only if every Boolean function $f$ has $s(f) \geq \sqrt{\deg(f)}$.
The proof uses some Fourier analysis with $h$ as above. It, too, takes only one page of a really short paper.
The main open question now is whether the 4th-power upper bound in Huang's Theorem 3 can be improved to quadratic. It is possible that a deeper application of Fourier analysis may show that cases of quadratic separation from block sensitivity to degree and from degree to sensitivity cannot "amplify" any more than quadratically. This is where there might be some commonality with discrepancy.
There is a much wider suite of Boolean complexity measures besides $s(f)$, $bs(f)$, and $\deg(f)$ discussed here. For example, consider how many bits of $x$ you need to fix in order to preserve the value $f(x)$. That is, define $C(f)$ to be the maximum over $x$ of the minimum size of a set $S$ such that whenever $y$ agrees with $x$ on $S$, $f(y) = f(x)$. Clearly $S$ needs to include at least one bit from each sensitive block $B_j$. This proves $bs(f) \leq C(f)$. There are many other relations that might be improved. We end by considering ways of broadening Huang's techniques.
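First, though, here is a brute-force rendering of $C(f)$ in the same style as the sketches above (ours); it confirms that OR on $3$ bits has $C(f) = 3$, since at the all-$0$ input every bit must be fixed:

```python
from itertools import combinations

def certificate_complexity(f, n):
    """C(f): max over x of the smallest set S of coordinates such that
    fixing x's bits on S forces the value f(x)."""
    def forces(x, S):
        free = [i for i in range(n) if i not in S]
        for mask in range(1 << len(free)):   # all settings of the free bits
            y = x
            for j, i in enumerate(free):
                if (mask >> j) & 1:
                    y ^= 1 << i
            if f(y) != f(x):
                return False
        return True
    return max(next(size for size in range(n + 1)
                    if any(forces(x, set(S))
                           for S in combinations(range(n), size)))
               for x in range(1 << n))

OR = lambda x: int(x != 0)
print(certificate_complexity(OR, 3))   # 3: the all-0 input needs all bits fixed
```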
Weak Adjacency Matrices
We—that is, Ken and I—have been thinking about how to build on Huang's proof. We take a somewhat more general approach compared to the paper and Ken's post.
Definition 5 Let $G$ be a graph on $m$ vertices. The $m$ by $m$ matrix $A$ is a weak adjacency matrix of $G$ provided
- Every entry is bounded by $1$ in absolute value;
- For all $u, v$: if $(u, v)$ is not an edge in $G$, then $A[u, v] = 0$.
As usual the maximum degree of $G$ is denoted by $\Delta(G)$. The following is almost the same proof as the classic result on adjacency matrices.
Lemma 6 (H-Lemma) Suppose that $A$ is a weak adjacency matrix of $G$. Then if $\lambda$ is an eigenvalue of $A$,

$$|\lambda| \leq \Delta(G).$$
Proof: By the definition of eigenvalue, there is some non-zero vector $x$ so that $Ax = \lambda x$. Let $i$ be an index so that $|x_i|$ is maximum. Then

$$|\lambda| \cdot |x_i| = |(Ax)_i| = \Big| \sum_j A[i,j]\, x_j \Big| \leq \sum_j |A[i,j]|\, |x_j|.$$

But then the last sum is upper bounded by

$$d \cdot |x_i|,$$

where $d$ is the number of $j$ so that $A[i,j]$ is not zero. This uses property (1) of the definition of a weak adjacency matrix. Property (2) implies that $d$ is at most the degree of vertex $i$. Thus

$$|\lambda| \cdot |x_i| \leq \Delta(G) \cdot |x_i|.$$

Dividing by $|x_i|$ yields the lemma.
Let $G - v$ be the graph that results after we delete the vertex $v$ from $G$. Let $A - v$ be the matrix that results after we delete the column and row $v$ from $A$.
Lemma 7 Suppose that $A$ is a weak adjacency matrix of $G$. Then for any vertex $v$, $A - v$ is a weak adjacency matrix of $G - v$.
Proof: The bound on the entries of $A - v$ is immediate. Whenever $G - v$ has no edge from $u$ to $w$, then neither does $G$, so $A[u, w]$ must be zero.
Lemma 8 Suppose that an $m$-vertex graph $G$ has a real symmetric weak adjacency matrix $A$ with $k$ of the top eigenvalues greater than or equal to $\lambda$. Then every induced subgraph of $G$ with at least $m - k + 1$ vertices has maximum degree at least $\lambda$.

The proof combines Cauchy's eigenvalue interlacing theorem with Lemmas 6 and 7.
Definition 9 The hypercube $Q_n$ is the graph on the $2^n$ bit vectors of length $n$ so that two vertices are adjacent if they differ in one bit position.
Theorem 10 The hypercube $Q_n$ has a real symmetric weak adjacency matrix $A_n$ with all its eigenvalues $\pm\sqrt{n}$.
Proof: Let $A_1$ be

$$A_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$$

and $A_{n+1}$ be

$$A_{n+1} = \begin{pmatrix} A_n & I \\ I & -A_n \end{pmatrix}.$$

Induction shows that $A_n^2 = nI$. It follows that

$$A_n x = \lambda x$$

implies that

$$n x = A_n^2 x = \lambda^2 x.$$

Thus

$$\lambda = \pm\sqrt{n}.$$

So the eigenvalues are $\pm\sqrt{n}$, and since the trace is $0$, there are the same number of each.
The upper-left and lower-right blocks of $A_{n+1}$ correspond to the two $n$-dimensional subcubes of $Q_{n+1}$, and the two identity blocks correspond to the perfect matching connecting these two subcubes. So $A_n$ is a weak adjacency matrix for $Q_n$.
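All of this is easy to sanity-check numerically. The following NumPy sketch (our illustration) builds $A_n$ by the recursion above and verifies $A_n^2 = nI$, the $\pm\sqrt{n}$ eigenvalues in equal numbers, and both conditions for being a weak adjacency matrix of $Q_n$:

```python
import numpy as np

def A(n):
    """Huang's signed matrix: A_1 = [[0,1],[1,0]],
    A_{n+1} = [[A_n, I], [I, -A_n]]."""
    M = np.array([[0.0, 1.0], [1.0, 0.0]])
    for _ in range(n - 1):
        I = np.eye(len(M))
        M = np.block([[M, I], [I, -M]])
    return M

n = 6
M = A(n)
assert np.allclose(M @ M, n * np.eye(2 ** n))   # A_n^2 = n I
eig = np.linalg.eigvalsh(M)
assert np.allclose(np.abs(eig), np.sqrt(n))     # eigenvalues are +-sqrt(n)
assert abs(eig.sum()) < 1e-8                    # trace 0: equal multiplicities
assert np.max(np.abs(M)) <= 1                   # entries bounded by 1
for u in range(2 ** n):                         # zero off the edges of Q_n
    for v in range(2 ** n):
        if bin(u ^ v).count("1") != 1:
            assert M[u, v] == 0
print("A_n is a weak adjacency matrix of Q_n with eigenvalues +-sqrt(n)")
```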
Corollary 11 Every induced subgraph of $Q_n$ with at least $2^{n-1} + 1$ vertices has maximum degree at least $\sqrt{n}$.
Open Problems
Can the last two talks at the conference be further connected?