Locality Bounds for Sampling Hamming Slices

Thoughts 2024-02-25

The complexity of distributions (tag) has seen a lot of activity and exciting progress, including [Vio12bLV12Vio14DW11BIL12BCS14Vio12c,  Vio20CGZ21Vio23YZ24FLRS23RS24DMK24] and connections to extractors [CZ16Li16CS16Coh16BDT16], data structures, and quantum-classical separations [WP23].

One of the problems considered in [Vio12b] is that of sampling the uniform distribution D_{k} over strings of weight k=\Theta (n) with small locality or depth. Two bounds are proved in [Vio12b]. One proves a moderate statistical-distance bound, the other proves a tight statistical distance bound approaching one, but has a restriction on the input length of the sampler.

Digression.

This restriction appears to have stuck, possibly also because the original post in Goldreich’s choices only mentioned that result and emphasized the restriction (after correction, his post now starts with the other result which doesn’t have the restriction). For example, during a talk I gave at IAS about the subsequent AC0 result [Vio14], the editor which had desk rejected the paper (i.e., rejected the paper quickly providing no feedback or explanation) was still asking about the input length restriction. I said there is no such restriction. I should have added “I hope that’s not why it got desk-rejected ;-)” but I wasn’t quick enough.

Back to sampling.

The general problem of sampling D_{k} was mentioned again as open in [FLRS23]. In fact, using an approach that was developed earlier [Vio12a], one can prove strong bounds in the special case that k is non-dyadic, i.e., not of the form k=n/2^{t}. (For non-dyadic you can expect bounds which are exponentially close to 1, whereas for dyadic you can’t, since for example the uniform distribution is already 1/n^{c}-close to D_{n/2}.) I informed the authors of [FLRS23] about the argument, and I added it to a version of [Vio23] which is under submission since August. The proof is 1/2 page. The write-up uses a simple lemma from [Vio23] that was online since 2021, whose proof is also 1/2 page. All the ideas are from [Vio12a]. See Section 9 here.

But the problem of proving bounds for the dyadic case without the input length restriction remained open. I am excited that a recent paper [DMK24] solves precisely that problem. The authors of [DMK24] were not aware of previous work, so they made exaggerated claims while in fact most of the other results in [DMK24] are weaker than results that are either explicit in or follow easily from previous work. Their paper is 51 pages long.

Yet my proof was indirectly communicated to them by the referees, and I am thankful to [DMK24] for doing the right thing, that is, sending me their paper and discussing it before posting it online. At some point however they must have thought that enough was enough. They stopped the discussion, and posted online on the arxiv and ECCC. The posted version still contains false claims…

Mind reading is a difficult art, but in this case I will try my hand. I don’t think this version would have been accepted to STOC, and in fact I don’t think the authors would have even bothered submitting it.

But the puzzling question is: why nobody on the program committee bothered to check with me?

The future.

A central open problem in this area is that of proving negative results in the cell-probe model. Here the picture changes dramatically, the dyadic vs. non-dyadic distinction disappears, and strong sampling bounds can only hold with an input-length restriction (which is enough for data-structure applications). Can you rule out k=n/3 and 10 cell probes?

References

[BCS14]    Itai Benjamini, Gil Cohen, and Igor Shinkar. Bi-lipschitz bijection between the boolean cube and the hamming ball. In IEEE Symp. on Foundations of Computer Science (FOCS), 2014.

[BDT16]    Avraham Ben-Aroya, Dean Doron, and Amnon Ta-Shma. Explicit two-source extractors for near-logarithmic min-entropy. Electronic Colloquium on Computational Complexity (ECCC), 23:88, 2016.

[BIL12]    Chris Beck, Russell Impagliazzo, and Shachar Lovett. Large deviation bounds for decision trees and sampling lower bounds for AC0-circuits. Electronic Colloquium on Computational Complexity (ECCC), 19:42, 2012.

[CGZ21]    Eshan Chattopadhyay, Jesse Goodman, and David Zuckerman. The space complexity of sampling. Electron. Colloquium Comput. Complex., page 106, 2021.

[Coh16]    Gil Cohen. Making the most of advice: New correlation breakers and their applications. In IEEE Symp. on Foundations of Computer Science (FOCS), pages 188–196, 2016.

[CS16]    Gil Cohen and Leonard J. Schulman. Extractors for near logarithmic min-entropy. Electronic Colloquium on Computational Complexity (ECCC), 23:14, 2016.

[CZ16]    Eshan Chattopadhyay and David Zuckerman. Explicit two-source extractors and resilient functions. In ACM Symp. on the Theory of Computing (STOC), pages 670–683, 2016.

[DMK24]   Kewen Wu Daniel M. Kane, Anthony Ostuni. Locality bounds for sampling hamming slices. In ACM Symp. on the Theory of Computing (STOC), 2024.

[DW11]    Anindya De and Thomas Watson. Extractors and lower bounds for locally samplable sources. In Workshop on Randomization and Computation (RANDOM), 2011.

[FLRS23]   Yuval Filmus, Itai Leigh, Artur Riazanov, and Dmitry Sokolov. Sampling and certifying symmetric functions. In APPROX-RANDOM, 2023.

[Li16]    Xin Li. Improved two-source extractors, and affine extractors for polylogarithmic entropy. In IEEE Symp. on Foundations of Computer Science (FOCS), 2016.

[LV12]    Shachar Lovett and Emanuele Viola. Bounded-depth circuits cannot sample good codes. Computational Complexity, 21(2):245–266, 2012.

[RS24]    Jad Silbak Ronen Shaltiel. Explicit codes for poly-size circuits and functions that are hard to sample on low entropy distributions. In ACM Symp. on the Theory of Computing (STOC), 2024.

[Vio12a]    Emanuele Viola. Bit-probe lower bounds for succinct data structures. SIAM J. on Computing, 41(6):1593–1604, 2012.

[Vio12b]    Emanuele Viola. The complexity of distributions. SIAM J. on Computing, 41(1):191–218, 2012.

[Vio12c]    Emanuele Viola. Extractors for turing-machine sources. In Workshop on Randomization and Computation (RANDOM), 2012.

[Vio14]    Emanuele Viola. Extractors for circuit sources. SIAM J. on Computing, 43(2):355–972, 2014.

[Vio20]    Emanuele Viola. Sampling lower bounds: boolean average-case and permutations. SIAM J. on Computing, 49(1), 2020. Available at http://www.ccs.neu.edu/home/viola/.

[Vio23]    Emanuele Viola. New sampling lower bounds via the separator. In Conf. on Computational Complexity (CCC), 2023. Available at http://www.ccs.neu.edu/home/viola/.

[WP23]    Adam Bene Watts and Natalie Parham. Unconditional quantum advantage for sampling with shallow circuits. CoRR, abs/2301.00995, 2023.

[YZ24]    Huacheng Yu and Wei Zhan. Sampling, flowers and communication. In ACM Innovations in Theoretical Computer Science conf. (ITCS), 2024.