Lower bounds, data structures, rigidity, circuits, and all that

Thoughts 2025-04-09

A static data-structure problem is simply a function $f:[2]^{n}\to [q]^{m}$ . A data structure for $f$ with word-space $s$ , word size $w$ and time $t$ is a decomposition

$\begin{aligned} f(x)=h(g(x)) \end{aligned}$

where $g:[2]^{n}\to [2^{w}]^{s}$ , $h:[2^{w}]^{s}\to [q]^{m}$ , and each output bit coordinate of $h$ depends on $\le t$ input words. The basic case $w=q=2$ suffices for this post. For background and some pictures see e.g.˜my book [Vio23].

The lower bound “barrier” is

$\begin{aligned} t\ge \log (m/n)/\log (s/n), \end{aligned}$

that is, we have nothing better, not even for non-adaptive, etc. In fact, available lower bounds are off by a constant. If you haven’t seen this earlier, you might want to pause to realize how weak this is. For example, when $s$ is linear in $m$ we have nothing.

In [Vio19] I gave a problem for which $t\ge 3$ even for $m=n^{10}$ and $s=m/2$ . The problem is computing a small-bias generator [NN90].

A recent paper [KPI25] proves a new bound for the same function (as well as plain bounded independence). (The parameterization in [KPI25] is different from [Vio19]. Their inequality (1) corresponds to $2^{s}\ge m^{1/t}n^{1-1/t}$ with the letters I am using; taking logs and moving stuff around this is equivalent to the inequality above .) When I spotted “quadratic improvement” in their write-up I almost fell off my chair, but as they say “quadratic” refers to $s$ ; the improvement in terms of $t$ is only constant constant (almost 2). Their techniques look exciting and I am interested in learning more about them.

Connections between lower bounds in data structures and in circuits.

This is another point of contact between the works. For context, a few papers have investigated such connections. [MNSW98] proved that data-structure lower bounds imply certain branching program lower bounds, but the latter were then proved in [Ajt05]. Later, two papers were simultaneously submitted to the same conference [DGW19, Vio19]. (They both contain a connection to linear-size log-depth circuits, but that’s for a setting of parameters different from the rest, so I am not discussing it.)

[DGW19] shows that lower bounds for linear data structures imply rigid matrices. But rigid matrices such as those implied by their main connection were obtained in [AC19, BHPT20]. The latter works construct rigid matrices computable in exponential time with an NP oracle – which may not be the reader’s default for “efficient”. These constructions have nothing to do with data structures and are the natural next step in a series of works that can be traced back [Wil11]. A main step here is optimizing PCP constructions. This is related to some papers of mine. Quoting from [BHPT20]:

We view Theorem 1.3 [the new PCP] as continuing a line of work that explores the connection between the randomness of a PCP and the structure of its queries. A prominent advance in this direction is the work of Ben-Sasson and Viola [BV14]. They constructed short and efficient PCPs in which queries are a function of the input and a simple projection of the randomness (namely, a 1-local function: for a fixed input, each bit in each query location is a fixed bit of the randomness or its negation). Although the query structure in [BV14] (and follow-up [Vio]) is very simple, it is unclear whether their PCP is almost-rectangular or smooth—both playing a crucial role in our construction and its application.

In the paper [Vio] (in turn submitted to the same conference as [BHPT20], as well as the previous cycle) I used these PCP techniques to get new approximate degree lower bounds (for “efficient” functions as above). I got nearly maximum degree $n$ , improving on the $\sqrt {n}$ bound from the 80’s (for simple functions). It doesn’t seem to be well-known, and is explained in my paper, that this is actually a special case of, and so a prerequisite for, rigid matrices. In the paper I also asked for stronger rectangular-like PCPs that could give rigid matrices.

Returning to [Vio19], in that paper I gave an implication where the implied lower bounds are still open. I showed that in certain well-studied parameter regimes, data structure lower bounds imply new lower bounds on the number of wires in circuits with arbitrary gates, where progress has been stuck for decades, see e.g. [Juk12]. This implication has been used in [KW19] and by me to give new data-structures for error-correcting codes based on [GHK $^{+}$ 13], nearly matching the long-standing lower bound [GM07]. Specifically, in the paper I considered the setting $s\approx m$ . As mentioned above I proved $t\ge 3$ for $s=m/2$ . I also showed that for any $s$ linear in $m$ , proving an inverse Ackermann bound implies new circuit lower bounds. The connection is very simple and directly simulates circuits by data structures, making the lower bounds hold for the same hard function.

The recent work [KPI25] shows that for even larger $s\ge m-m^{o(1)}$ , even a $t\ge 4$ implies a new lower bound: an “efficient” function (as above) that’s not in NC1.

The future.

A question which I find very interesting and that remains open is to connect data structure lower bounds for other natural settings of parameters. For example, would a strong bound for $m=n^{10}$ , $s=n^{5}$ imply anything?

References

[AC19] Josh Alman and Lijie Chen. Efficient construction of rigid matrices using an NP oracle. In IEEE Symp. on Foundations of Computer Science (FOCS), 2019.

[Ajt05] Mikl�s Ajtai. A non-linear time lower bound for boolean branching programs. Theory of Computing, 1(1):149–176, 2005.

[BHPT20] Amey Bhangale, Prahladh Harsha, Orr Paradise, and Avishay Tal. Rigid matrices from rectangular PCPs. In IEEE Symp. on Foundations of Computer Science (FOCS), 2020.

[BV14] Eli Ben-Sasson and Emanuele Viola. Short PCPs with projection queries. In Coll. on Automata, Languages and Programming (ICALP), 2014.

[DGW19] Zeev Dvir, Alexander Golovnev, and Omri Weinstein. Static data structure lower bounds imply rigidity. In Moses Charikar and Edith Cohen, editors, ACM Symp. on the Theory of Computing (STOC), pages 967–978. ACM, 2019.

[GHK $^{+}$ 13] Anna G�l, Kristoffer Arnsfelt Hansen, Michal Koucký, Pavel Pudl�k, and Emanuele Viola. Tight bounds on computing error-correcting codes by bounded-depth circuits with arbitrary gates. IEEE Transactions on Information Theory, 59(10):6611–6627, 2013.

[GM07] Anna G�l and Peter Bro Miltersen. The cell probe complexity of succinct data structures. Theoretical Computer Science, 379(3):405–417, 2007.

[Juk12] Stasys Jukna. Boolean Function Complexity: Advances and Frontiers. Springer, 2012.

[KPI25] Oliver Korten, Toniann Pitassi, and Russell Impagliazzo. Stronger cell probe lower bounds via local prgs. Electron. Colloquium Comput. Complex., TR25-030, 2025.

[KW19] Young Kun Ko and Omri Weinstein. An adaptive step toward the multiphase conjecture, 2019.

[MNSW98] Peter Bro Miltersen, Noam Nisan, Shmuel Safra, and Avi Wigderson. On data structures and asymmetric communication complexity. J. of Computer and System Sciences, 57(1):37 – 49, 1998.

[NN90] J. Naor and M. Naor. Small-bias probability spaces: efficient constructions and applications. In 22nd ACM Symp. on the Theory of Computing (STOC), pages 213–223. ACM, 1990.

[Vio] Emanuele Viola. New lower bounds for probabilistic degree and AC0 with parity gates. Theory of Computing. Available at http://www.ccs.neu.edu/home/viola/.

[Vio19] Emanuele Viola. Lower bounds for data structures with space close to maximum imply circuit lower bounds. Theory of Computing, 15:1–9, 2019. Available at http://www.ccs.neu.edu/home/viola/.

[Vio23] Emanuele Viola. Mathematics of the impossible: The uncharted complexity of computation. 2023.

[Wil11] Ryan Williams. Non-uniform ACC circuit lower bounds. In IEEE Conf. on Computational Complexity (CCC), pages 115–125, 2011.