The Lemma Cited From Burnside

Gödel’s Lost Letter and P=NP 2018-03-12

How might it be applied in complexity theory?

St. Andrews history source

William Burnside was a well-known researcher into the early theory of finite groups.

Today Ken and I thought we would talk about one of his most-cited results—a result that is really due to others.

This happens all the time in mathematics. In this case Burnside wrote an important book on finite group theory and included a lemma that is called various things. It is sometimes named for Augustin-Louis Cauchy or Ferdinand Georg Frobenius, or called, “the Lemma that is not Burnside’s.” The lemma was incorrectly attributed to Burnside because he proved it in his 1897 book Theory of Groups of Finite Order. His title for the last two sections of his eighth chapter was the statement of the lemma:

Number of symbols left unchanged by all the substitutions of a group is the product of the order of the group and the number of the sets in which the symbols are interchanged transitively.

After his proof in section 118, he began section 119 by writing:

119. The formula just obtained is the first of a series of similar formulae, due to Herr Frobenius,* which are capable of many useful applications.

The * citation was to a one-page paper by Frobenius in Crelle’s Journal in 1887. The formula itself appeared in an 1845 paper by Cauchy. In his 1911 second edition, Burnside stated it in section 145 (page 191) as “Theorem VII” where he adjoined a twin statement about summing the squares of the numbers of fixed elements.

The book has no instance of the word “lemma,” which the first edition used only in section 77. There is no nearby mention of Cauchy or Frobenius and neither the 1887 nor the 1845 paper is cited anywhere. Instead, Burnside’s next mention of Frobenius comes on page 269 astride five new chapters on representation theory that were his great thrust in the expanded edition. He used a big * footnote stretching across two pages to cite multiple works by Frobenius en banc:

* The theory of the representation of a group of finite order as a group of linear substitutions was largely, and the allied theory of group-characteristics was entirely, originated by Prof. Frobenius. …

That Burnside gave a tandem statement supports the position that he felt it was all well-known research, but perhaps what the online Encyclopedia of Mathematics calls the “mysterious dropping” of Frobenius owed as much to his sweep toward this grand encomium.

Burnside is more properly known for actual beautiful results such as the important {p^{a}q^{b}} theorem, which shows that every finite group whose order has only two prime divisors is solvable. The original proof relied heavily on representation theory, and it took many years to get a proof that avoided needing this machinery—see this for the theorem and a short historical note.

Then there is Burnside’s conjecture, which lasted for six-plus decades before being refuted in 1964, but which still has viable forms and impacts as we covered before.

But what of the Lemma? The problem was less Burnside’s lack of citation and more the habit of others citing it from Burnside. So we propose calling it the Lemma Cited From Burnside (LCFB). The initials can also stand for “Lemma of Cauchy-Frobenius per Burnside.”

The LCFB

An action of a group {G} on a set {X} is a mapping {\phi} from {G} to permutations of {X} that multiplicative: {\phi(gh) = \phi(g)\phi(h)}. That is, the action is a subgroup of the symmetric group of {X} that is a homomorphic image of {G}, but the action retains information about which homomorphism was used.

An action induces—and is equivalently definable as—a function

\displaystyle  \Phi: G \times X \rightarrow X

with the property that for all {g,h \in G} and {x \in X}, {\Phi(gh,x) = \Phi(g,\Phi(h,x))}. The orbit of {x} under the action is the set of values {\Phi(g,x)} over all {g \in G}.

This is all made more visual if we write {g\cdot x} for {\Phi(g,x)}. Especially when the action is faithful, meaning {\phi} is 1-to-1, we can picture the elements of {G} as mapping {x}-es directly. Then the orbit is {G\cdot x = \{g\cdot x: g \in G\}}. Some questions to ask are:

  • How many elements {x} are fixed by a given {g}, namely {g\cdot x = x}?

  • How many elements are fixed by the action on average?

  • How many orbits does the action have?

The latter two questions may seem unrelated. But the LCFB says that their answers are the same:

Lemma 1 Let {G} act on the set {X}. Then

\displaystyle  \#\mathsf{orbits}_G(X) = \frac{1}{|G|} \sum_{g \in G} |\mathsf{Fixed}(g)|.

Balance in Orbits

The key idea of the proof is that orbits are balanced. For any {x}, define its stabilizer by {\mathsf{Stab}_G(x) = \{g \in G: g\cdot x = x\}} and note that this forms a subgroup of {G}. The orbit-stabilizer theorem states:

Theorem 2 Let {G} act on the set {X} and let {x \in X}. Then

\displaystyle  |G\cdot x|\cdot |\mathsf{Stab}_G(x)| = |G|.

It follows right away that for all {x,y} in the orbit {{\cal O}}, {|\mathsf{Stab}_G(x)| = |\mathsf{Stab}_G(y)|}; call this {s_{\cal O}}. The proof of Lemma 1 comes quickly too:

\displaystyle  \begin{array}{rcl}  \sum_{g \in G,x \in X} [g\cdot x = x] &=& \sum_{\text{orbits }{\cal O}}\sum_{x \in {\cal O}}\sum_{g \in G}[g\cdot x = x]\\ &=& \sum_{\cal O}\sum_{x \in {\cal O}}|\mathsf{Stab}_G(x)|\\ &=& \sum_{\cal O} |{\cal O}|\cdot s_{\cal O}\\ &=& \sum_{\cal O} |G| = |G|\cdot\#\mathsf{orbits}_G(X), \end{array}

which is what we needed.

We have used a “Theorem” to prove a “Lemma.” Is the “Theorem” obvious? We note substantial discussion about that in a StackExchange post and links from it to a lengthy post by Tim Gowers. We will try our own explanation because it speaks right to aspects of the cycle structure of permutations that we are trying to quantify in new ways.

List out the orbit of {x} as {x = x_0,x_1,x_2,\dots,x_{m-1}}. For each {j}, pick an element {h_j} of {G} such that {h_j \cdot x = x_j}. And list out {\mathsf{Stab}_G(x)} as {g_0,g_1,\dots,g_{n-1}}. We claim that

\displaystyle  h_j g_i = h_\ell g_k \implies i = k \wedge j = \ell.

If they are equal as elements of {G} then {h_j \cdot x = h_j g_i \cdot x = h_\ell g_k \cdot x = h_\ell \cdot x}, but by the choice of {h_j} and {h_\ell} this forces {\ell = j}. And {h_j g_i = h_j g_k \implies h_j^{-1} h_j g_i = h_j^{-1} h_j g_k \implies g_i = g_k}. Thus the elements {h_j g_i} are all different.

Now consider any {h \in G}. There is a {j} such that {h \cdot x = x_j = h_j \cdot x}. Now put {g = h_j^{-1} h}. Then

\displaystyle  g \cdot x = h_j^{-1} h \cdot x = h_j^{-1} \cdot x_j = x.

Therefore {g} belongs to {\mathsf{Stab}_G(x)},so it equals {g_i} for some {i}. Thus {g_i = h_j^{-1} h}, so we have written {h = h_j g_i}. Thus the elements {h_j g_i} run through {G} exactly once. It follows further that for each {j} there are exactly {n} elements {h} such that {h \cdot x = x_j}, namely those {h_j g_i} for different {i}. It follows that {m\times n = |G|}, which proves the theorem.

Note that we seem to have avoided appealing to the concepts of order (of an element or group) or quotient that go into statements of Lagrange’s theorem. Of course they are present, but we used {X} as a screen to hide them. We didn’t even have to exhibit a 1-1 correspondence between {\mathsf{Stab}_G(x)} and {\mathsf{Stab}_G(x_j)}. It all simply flows from the axiom of inverses and multiplicativeness of the action which in particular gave us {h_j^{-1} \cdot x_j = x}. There is no intermediate notion of how a group can act on a set—it must have perfect balance in orbits.

Applications

Here is an example that also shows the difference between the physical and algebraic nature of an action. Consider regular {n}-gons whose edges are colored one of {k} colors. Equivalently, we can consider them to have {n} triangular facets colored the same front and back. The physical actions are rotating the polygon right by {360/n} degrees and flipping it over. These generate the dihedral group {D_n}. The number of orbits of the action of {D_n} on the set {X} of tile colorings tells us how many tile types there are.

The LCFB tells us to count the colorings fixed by each group element. Consider {n = 4} and {k = 2}, that is, black and red colorings of a square. This gives {|D_4 = 8|} permutations and {2^4 = 16} colorings. We count as follows:

  • The identity fixes all 16 colorings.

  • The 90-degrees-right ({R}) and 90-degrees-left ({L}) rotations fix only the two monochrome tiles.

  • The 180-degree rotation fixes those two plus the two with diagonals of opposite colors.

  • The flip shown at top below fixes all tiles that have facets 2 and 4 the same color, 8 in all, including the coloring shown.

  • The flip plus 180-degree rotation fixes when 1 and 3 have the same color, giving another 8.

  • The flip plus {R} rotation fixes the 4 colorings with 1 and 2 the same color and 3 and 4 the same color. The flip plus {L} does likewise for 1 and 4 vis-à-vis 2 and 3.

We get {16 + 2\times 2 + 4 + 8 + 8 + 2\times 4 = 48} fixed items, hence the LCFB gives {48/8 = 6} orbits.

           Two different representations of a flip element in {D_4}.

The bottom part of our figure shows how we could have oriented the square like a diamond before doing the mirror-image flip. Now, however, the coloring shown is not fixed, and only 4 not 8 colorings are preserved. Is this an inconsistency? Although the flip is physically the same, it is algebraically different. Represented as a permutation in cycle form, the original flip was {(1)(2\;\,4)(3)}, whereas the latter flip is {(1\;\,4)(2\;\,3)}. The latter flip is equal to following the former flip with the {L} rotation, which indeed fixed 4 colorings under the former representation of the action of {D_n}.

This suggests that the cycle structure of the permutations used in the action is what matters. George Pólya observed this to give a generating-function form, which Nicolaas de Bruijn refined and extended, and which allows for weighting the elements. In our example, for any {n}, it gives a polynomial that can be used to compute the number of {n}-gon tile types for any number {k} of colors, without repeating the inspection of fixed elements. The polynomial involves variables {a_i} raised to the power of the number of cycles of length {i} in a given permutation. Each permutation gives a monomial and those are summed up and divided by the size of the group. In the case of {D_4} the polynomial is

\displaystyle  Z(D_4) = \frac{1}{8}(a_1^4 + 2a_4 + 3a_2^2 + 2a_1^2 a_2).

If we simply substitute each {a_i} by {k} then we get the number of orbits; for instance, square tiles with 3 colors give 21 orbits. But with colors {b,r,g} we can substitute {a_i} by {(b^i + r^i + g^i)} and then the resulting {Z(b,r,g)} gives more information. Assigning other weights besides {(1,1,1)} to {(b,r,g)} achieves other counting tasks. This article by Nick Baxter serves to introduce a detailed survey by Mollee Huisinga titled “Pólya’s Counting Theory,” which is great for further applications including classifying molecular structures.

We are interested in cases involving succinctly-described permutations {\pi} of exponential-sized sets {X}. That the general formula for {Z(D_n)} involves the Euler totient function not only of {n} but also divisors of {n} hints at one source of such cases. In our setting, whether {\pi} fixes an element {x \in X} is decidable in polynomial time, but whether {x} and {y} belong to different orbits—and other predicates about orbits—may not be. Thus the LCFB is instrumental to getting the orbit-counting problem and other functions into {\mathsf{\#P}}.

Open Problems

What applications of the Lemma Cited From Burnside have been noted in complexity theory? We can ask this more generally about applications of the combinatorial double-counting trick, where one way of doing the counting looks hopeless but the other way at least gives a {\mathsf{\#P}} function or difference of {\mathsf{\#P}} functions.