Princeton Is Invariant

Gödel’s Lost Letter and P=NP 2018-08-02

Workshop happening this week—anyone can view it live

IAS Weyl bio source

Hermann Weyl was one of the first members of the Institute for Advanced Study in Princeton. He made important contributions to many fields and even more contributions to groups. His work in group representation theory and polynomial invariant theory is being employed in a workshop being held this week at the Institute on “Optimization, Complexity, and Invariant Theory.” Weyl wrote a lovely book called Levels Of Infinity. One of the interesting chapters is on “Why is the world four-dimensional?” Indeed. Today we discuss some topics from the workshop. The talks are free and readers are invited to follow the remaining proceedings via the live link provided by IAS.

We have mentioned Weyl a few times, including telling the story of an interview in which the great physicist Paul Dirac was asked,

Do you ever run across a fellow that even you can’t understand?

Dirac named Weyl.

Most of us in the audience are not physicists or even pure mathematicians but rather computer scientists by trade. We are trying to understand not only Weyl but also bits of Michael Atiyah and Frances Kirwan and David Mumford and Issai Schur. And David Hilbert, who added great nonconstructive power to Arthur Cayley’s invariant theory, then sought to recover it constructively. Plus work by many other mathematicians and physicists, especially quantum physicists. Invariants in physics correspond to real-world quantities like energy, momentum, and charge(-parity(-time reversal)). Thus while invariant theory is new to us it has been part of the IAS since its inception—an invariant of Princeton.

Invariants, Unity, and Diversity

Avi Wigderson is the workshop’s main organizer and he laid out the plan for the week in two sweeping introductory talks on Monday morning. His backbone example is how six seemingly disparate problems in various areas of math and physics and combinatorial optimization are structurally the same problem. He has been part of several joint papers which collectively improve the upper bounds on these and related problems. The notions of resources and the tradeoff of speed and closeness in approximations unite and focus the approaches.

A key concept is how a polynomial (or rational) function {f(x) = f(x_1,\dots,x_n)} changes under certain applications of invertible matrices {A} to the input vector {x}. Consider simply {f_A(x) = f(Ax)}. The set of such functions {f_A} over {A} belonging to some group {G} of matrices is the orbit of {f} under the group and is just a set of polynomials {O_G(f)}. The polynomial {f} is invariant under the action by {G} if {f_A} is always the same polynomial as {f}, i.e., {O_G(f) = f}. For instance, if {G} is given by matrices f determinant 1 then the determinant polynomial is an invariant. It is likewise invariant under two-sided actions that also multiply by some appropriate {A'} on the right.

Note that if {f} is a polynomial of degree {d} with zero constant term then so is {f_A}. We can identify such polynomials with their vectors of coefficients. It is possible for {O_G(f)} to have members that are arbitrarily close to the zero vector. Then {f} is said to belong to the null cone {N_G}. How hard is it to decide this? The results in the above papers, together with this paper by Peter Bürgisser, Matthias Christandl, Ketan Mulmuley, and Michael Walter, place the problem into {\mathsf{NP \cap co}\text{-}\mathsf{NP}}. All of them are among the speakers at this workshop.

The {\mathsf{NP}} side is actuated by the Hilbert-Mumford criterion which says that when {f \in N_G} there is always a subgroup of matrices {A_t} with the single parameter {t} such that {\lim_{t \rightarrow \infty} A_t f \rightarrow 0}. The {\mathsf{\mathsf{co}\text{-}NP}} side uses a newly-amplified duality creating a moment function {\mu_G} such that whenever {f} is not in the null cone there is an {f' \in O_G(f)} such that {\mu_G(f') = 0}. The same complexity classification applies to membership in associated moment polytopes. Thus we have a strain of new and interesting intermediate problems to consider.

The Talks

Mumford amplified Hilbert’s work on constructivizing his own results into what is now called geometric invariant theory. That in turn underpins Ketan Mulmuley and Milind Sohoni’s geometric complexity theory, and we hasten to note that this is on the docket of today’s talks at 3:45pm. Here is the whole list of talks from this afternoon on:

  • Today 2:00–3:15pm: “Algorithmic invariant theory,” Visu Makam (University of Michigan).

  • 3:45–5:00pm: “Geometric complexity theory (GCT): Algorithmic challenges in invariant theory,” Ketan D. Mulmuley (University of Chicago).

  • Thursday 9:30–10:45am: “The dynamics of regularized flows on convex bodies,” James Lee (University of Washington).

  • 11:15 a.m—12:30pm:”An Introduction to Geodesic Convexity,” Nisheeth Vishnoi (EPFL).

  • 2:00—3:15pm: “Operator Scaling via Geodesically Convex Optimization, Invariant Theory and Polynomial Identity Testing,” Yuanzhi Li (Princeton University).

  • 3:45—5:00 p.m: “Solution to the Paulsen problem (via operator scaling),” Lap Chi Lau (University of Waterloo).

  • Friday 9:00—10:15am (note earlier time): “Combinatorial methods for PIT (and ranks of matrix spaces),” Roy Meshulam (Technion).

  • 10:30–11:45am: “Capacities, Hyperbolicity, Submodularity and all the Jazz…” Leonid Gurvits (The City College of New York).

After lunch from 1:00pm on there will be a discussion of new directions and open problems, including some off-the-cuff short talks.

Abstracts of the talks are on this page, and again they are all being recorded and live-streamed here with free access. The recorded talks and other materials will be available later from the workshop page.

Open Problems

What will result from the cross-pollination of several fields? Already there has been much exciting work.

(This post may be extended with some more examples. Usually our posts are invariant but the timeframe makes this an exception. OK, here is one…)

We can give a small example of the flavor of what Avi and friends are up to. Suppose that we look at a {2 \times 2} matrix—for example, the following:

\displaystyle  \begin{pmatrix} a & b \\ c & d \end{pmatrix}

The question is: Can one change this matrix into a doubly-stochastic matrix with only re-scaling operations? These operations allow us to change any row (column) to a rescaled row (column). Thus the above matrix can be changed to

\displaystyle  \begin{pmatrix} \lambda a & \lambda b \\ c & d \end{pmatrix}

for any real {\lambda}. Also one can change any column, so one can next get to

\displaystyle  \begin{pmatrix} \lambda a & \lambda\lambda' b \\ c & \lambda' d \end{pmatrix}

for example. Here is a puzzle: Can we use these re-scaling operations to transform

\displaystyle  \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}

into a doubly-stochastic matrix? Ken posed this to me and my first thought was no—its is obviously impossible. Do you see why I thought this was true?

But I was wrong, as I often am. The matrix can be transformed to one that gets as close as you like to a doubly-stochastic matrix. Transform it as follows:

\displaystyle  \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}

becomes

\displaystyle  \begin{pmatrix} 1/10^{6} & 1/10^{6} \\ 0 & 1 \end{pmatrix}

which becomes

\displaystyle  \begin{pmatrix} 10^{6}\cdot 1/10^{6} & 1/10^{6} \\ 10^{6} \cdot 0 & 1 \end{pmatrix}.

This is

\displaystyle  \begin{pmatrix} 1 & 1/10^{6} \\ 0 & 1 \end{pmatrix}.

Pretty neat.

However, if the original matrix {A} is

\displaystyle  \begin{bmatrix} 1 & 1 & 1\\ 0 & 0 & 1\\ 0 & 0 & 1 \end{bmatrix}

then it really is impossible. Even if we zero-out the corner, the fact of two 1s below it (and to its left) will keep us bouncing back between 1/2 and 1.

There is a more fundamental way to express this impossibility. We can regard {A} as the matrix of a bipartite graph where the positive entries correspond to entries between the partitions and the 0s are non-edges. The relavant fact noted by Marshall Hall is that the scaling is possible if and only if the graph has a perfect matching. For this {A} it does not, so the scaling is impossible. Extensions of this idea of a “Hall blocker” have figured into several talks—and the idea of blockers in general plays into the concept of obstructions as used in geometric complexity theory.

[added note about recordings becoming available afterward]