Bigness (Part 3)
Azimuth 2020-04-14
Last time I talked about a 200-terabyte proof, completed in 2016, that the journal Nature called the “largest ever”. But Michael Nielsen pointed me to a vastly longer proof from 2012. This one required about 60,000,000 terabytes—that is, 60 exabytes! So, they were unable to store the whole proof.
Let’s pause to contemplate what 60 exabytes amounts to. It’s been estimated that all words ever spoken by human beings, if transcribed into text, would take just 5 exabytes to store. As of 2020, the genome of each human on Earth would take about 10 exabytes to store (if one foolishly didn’t do any data compression). On the other hand, in 2011 the total capacity of all hard drives sold by Seagate Technology was 330 exabytes.
So what proof was 60 exabytes long?
A semigroup is a set with a binary associative operation, which we write as multiplication. We say two semigroups and are isomorphic if there is a bijection such that
for all , and anti-isomorphic if there’s a bijection with
for all . We say two semigroups are equivalent if they are isomorphic or anti-isomorphic.
In 2012, Andreas Distler, Chris Jefferson, Tom Kelsey, and Lars Kottho showed that up to equivalence, there are
semigroups with 10 elements. That’s the fact that took 63 exabytes to establish! They also checked their method against previous calculations of the number of semigroups with 7, 8, or 9 elements.
Upon hearing of the much shorter proof called the “largest ever” by Nature, Chris Jefferson wrote:
Darn, I had no idea one could get into the media with this kind of stuff. I had a much larger “proof”, where we didn’t bother storing all the details, in which we enumerated 718,981,858,383,872 semigroups, towards counting the semigroups of size 10.
Uncompressed, it would have been about 63,000 terabytes just for the semigroups, and about a thousand times that to store the “proof”, which is just the tree of the search.
Of course, it would have compressed extremely well, but also I’m not sure it would have had any value, you could rebuild the search tree much faster than you could read it from a disc, and if anyone re-verified our calculation I would prefer they did it by a slightly different search, which would give us much better guarantees of correctness.
As I mentioned, his team found a total of 12,418,001,077,381,302,684 semigroups of size 10. But they took advantage of a formula, also published in 2012, that counts most of the semigroups of a given size, namely those that are ‘nilpotent of degree 3’. Thus, they only had to find 718,981,858,383,872 by a brute force search, a mere 0.006% of the total:
• Andreas Distler, Chris Jefferson, Tom Kelsey, and Lars Kottho, The semigroups of order 10, in Principles and Practice of Constraint Programming, Springer Lecture Notes in Computer Science 7514, Springer, Berlin, 2012, pp. 883–899.
What does it mean for a semigroup to be nilpotent of degree 3? It means that if you multiply any three elements of the semigroup you always get the same answer! Here’s the paper that showed how to count those:
• Andreas Distler and James D. Mitchell, The number of nilpotent semigroups of degree 3, Electron. J. Combin. 19 (2012) 51.
It’s a piece of semigroup folklore that as the size approaches infinity, the fraction of semigroups of that size that are nilpotent of degree 3 approaches 1.
Puzzle. Can you prove this?
Puzzle. Assuming the folklore is true, what’s the rough growth rate of the number of semigroups with elements, counted up to equivalence? The formulas given by Distler and Mitchell are very complicated, but they may have simple asymptotics.
Here they count the number of semigroups that are nilpotent of degree 3 with various numbers of elements, up to equivalence: