Shelves and the Infinite

Infinity is a very strange concept. Like alien spores floating down from the sky, large infinities can come down and contaminate the study of questions about ordinary finite numbers! Here’s an example.

A shelf is a set with a binary operation $\rhd$ that distributes over itself:

$a \rhd (b \rhd c) = (a \rhd b) \rhd (a \rhd c)$

There are lots of examples, the simplest being any group, where we define

$g \rhd h = g h g^{-1}$
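To see the group example concretely, here is a minimal Python sketch that checks the self-distributive law by brute force for conjugation, $g \rhd h = g h g^{-1}$, in the group of permutations of three points. The helper names (`compose`, `inverse`, `conjugate`, `is_shelf`) are made up for this illustration, and the choice of group is just a convenient small case.

```python
from itertools import permutations, product

def compose(p, q):
    """Return the permutation p∘q, which sends i to p[q[i]]."""
    return tuple(p[i] for i in q)

def inverse(p):
    """Return the inverse of a permutation given as a tuple of images."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def conjugate(g, h):
    """The candidate shelf operation: g ▷ h = g h g^{-1}."""
    return compose(compose(g, h), inverse(g))

def is_shelf(elements, op):
    """Check a ▷ (b ▷ c) == (a ▷ b) ▷ (a ▷ c) for every triple."""
    elements = list(elements)
    return all(
        op(a, op(b, c)) == op(op(a, b), op(a, c))
        for a, b, c in product(elements, repeat=3)
    )

S3 = list(permutations(range(3)))
print(is_shelf(S3, conjugate))                       # True
print(is_shelf(range(5), lambda a, b: (a - b) % 5))  # False
```

The second call shows that the law is a genuine restriction: subtraction mod 5 fails it.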

They have a nice connection to knot theory, which you can see here if you think hard:

My former student Alissa Crans, who invented the term ‘shelf’, has written a lot about them, starting here:

I could tell you a long and entertaining story about this, including the tale of how shelves got their name. But instead I want to talk about something far more peculiar, which I understand much less well. There’s a strange connection between shelves, extremely large infinities, and extremely large finite numbers! It was first noticed by a logician named Richard Laver in the late 1980s, and it’s been developed further by Randall Dougherty.

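It goes like this. For each $n$ there’s a unique shelf structure on the numbers $\{1, 2, \dots, 2^n\}$ such that

$a \rhd 1 = a + 1 \bmod 2^n$

So, the elements of our shelf are

$1$
$1 \rhd 1 = 2$
$2 \rhd 1 = 3$

and so on, until we get to

$2^n \rhd 1 = 1$

However, we can now calculate

$1 \rhd 1$
$1 \rhd 2$
$1 \rhd 3$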

and so on. You should try it yourself for a simple example! You’ll need to use the self-distributive law. It’s quite an experience.

You’ll get a list of $2^n$ numbers, but this list will not contain all the numbers $\{1, 2, \dots, 2^n\}$. Instead, it will repeat with some period $P(n)$.
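If you’d rather let a computer do the work, here is a rough Python sketch. It assumes the standard recurrence that follows from the self-distributive law with $c = 1$, namely $a \rhd (b + 1) = (a \rhd b) \rhd (a + 1)$, and fills the rows from the bottom up; it also relies on the known fact that each row’s period is a power of 2. The function names are hypothetical, chosen for this example.

```python
def laver_table(n):
    """Return table with table[a][b] = a ▷ b on {1, ..., 2^n} (index 0 unused)."""
    size = 2 ** n
    table = [[0] * (size + 1) for _ in range(size + 1)]
    for b in range(1, size + 1):
        table[size][b] = b                 # the last row is the identity
    for a in range(size - 1, 0, -1):       # fill rows from the bottom up
        table[a][1] = a + 1                # a ▷ 1 = a + 1
        for b in range(2, size + 1):
            # a ▷ b = (a ▷ (b-1)) ▷ (a ▷ 1); that row is already filled
            # because a ▷ (b-1) is larger than a.
            table[a][b] = table[table[a][b - 1]][a + 1]
    return table

def first_row_period(n):
    """Smallest power of 2 that is a period of the row 1 ▷ 1, 1 ▷ 2, ..."""
    size = 2 ** n
    row = laver_table(n)[1][1:]
    p = 1
    while row[:p] * (size // p) != row:
        p *= 2
    return p

print(laver_table(3)[1][1:])   # [2, 4, 6, 8, 2, 4, 6, 8]
```

This builds the whole $2^n \times 2^n$ table, so it is only practical for small $n$.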

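These periods turn out to approach infinity as $n$ grows, provided we assume an extra axiom that goes beyond the usual axioms of set theory: the existence of an absurdly large cardinal, called an I3 rank-into-rank cardinal.

I’ll say more about this kind of cardinal later. But, this is not the only case where a ‘large cardinal axiom’ has consequences for down-to-earth math, like the behavior of some sequence that you can define using simple rules.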

On the other hand, Randall Dougherty has proved a lower bound on how far you have to go out in this sequence to reach the number 32.

And, it’s an incomprehensibly large number!

The third Ackermann function $A(3, n)$ is roughly 2 to the $n$th power. The fourth Ackermann function $A(4, n)$ is roughly 2 raised to itself $n$ times:

$2^{2^{2^{\cdot^{\cdot^{\cdot}}}}}$

And so on: each Ackermann function is defined by iterating the previous one.

Dougherty showed that for the sequence $P(n)$ to reach 32, you have to go at least

$n = A(9, A(8, A(8, 255)))$

This is an insanely large number!

I should emphasize that if we use just the ordinary axioms of set theory, the ZFC axioms, nobody has proved that the sequence ever reaches 32. Neither is it known that this is unprovable if we only use ZFC.

So, what we’ve got here is a very slowly growing sequence… which is easy to define but grows so slowly that (so far) mathematicians need new axioms of set theory to prove it goes to infinity, or even reaches 32.

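I should admit that my definition of the Ackermann function is rough. In reality it’s defined like this:

$A(m, n) = \begin{cases} n + 1 & \text{if } m = 0 \\ A(m - 1, 1) & \text{if } m > 0 \text{ and } n = 0 \\ A(m - 1, A(m, n - 1)) & \text{if } m > 0 \text{ and } n > 0 \end{cases}$

And if you work this out, you’ll find it’s a bit annoying. Somehow the number 3 sneaks in:

$A(1, n) = 2 + (n + 3) - 3$
$A(2, n) = 2 \cdot (n + 3) - 3$
$A(3, n) = 2^{n + 3} - 3$
$A(4, n) = 2 \uparrow\uparrow (n + 3) - 3$

where $a \uparrow\uparrow b$ means $a$ raised to itself $b$ times,

$A(5, n) = 2 \uparrow\uparrow\uparrow (n + 3) - 3$

where $a \uparrow\uparrow\uparrow b$ means $a \uparrow\uparrow (a \uparrow\uparrow (a \uparrow\uparrow \cdots))$ with the number $a$ repeated $b$ times, and so on.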
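To watch the 3’s sneak in, here is a small Python sketch of the standard two-argument Ackermann function $A(m, n)$, with $A(0, n) = n + 1$, $A(m, 0) = A(m - 1, 1)$ and $A(m, n) = A(m - 1, A(m, n - 1))$. It uses an explicit stack rather than recursion, simply to stay clear of Python’s recursion limit; that design choice and the function name are mine, not part of the definition.

```python
def ackermann(m, n):
    """Two-argument Ackermann function, evaluated with an explicit stack."""
    stack = [m]                  # pending first arguments, innermost call on top
    while stack:
        m = stack.pop()
        if m == 0:
            n += 1               # A(0, n) = n + 1
        elif n == 0:
            stack.append(m - 1)  # A(m, 0) = A(m - 1, 1)
            n = 1
        else:
            stack.append(m - 1)  # A(m, n) = A(m - 1, A(m, n - 1))
            stack.append(m)
            n -= 1
    return n

for n in range(4):
    assert ackermann(1, n) == 2 + (n + 3) - 3
    assert ackermann(2, n) == 2 * (n + 3) - 3
    assert ackermann(3, n) == 2 ** (n + 3) - 3

print(ackermann(3, 3))  # 61
```

Don’t try this with a first argument of 4 or more, apart from the very smallest cases: the values become towers of exponentials almost immediately.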

However, these irritating 3’s scarcely matter, since Dougherty’s number is so large… and I believe he could have gotten an even larger lower bound if he wanted.

Perhaps I’ll wrap up by saying very roughly what an I3 rank-into-rank cardinal is.

In set theory the universe of all sets is built up in stages. These stages are called the von Neumann hierarchy. The lowest stage has nothing in it:

$V_0 = \emptyset$

Each successive stage is defined like this:

$V_{\lambda + 1} = P(V_\lambda)$

where $P(S)$ is the power set of $S$, that is, the set of all subsets of $S$. For ‘limit ordinals’ $\lambda$, that is, ordinals that aren’t of the form $\alpha + 1$, we define

$V_\lambda = \bigcup_{\alpha < \lambda} V_\alpha$
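The first few finite stages are small enough to build by hand, or with a few lines of Python. This is only a sketch of the successor step, starting from the empty set and taking power sets; limit stages are infinite and can’t be enumerated this way. The names `power_set` and `stage` are invented for the example.

```python
from itertools import combinations

def power_set(s):
    """Return the set of all subsets of s, as a frozenset of frozensets."""
    items = list(s)
    return frozenset(
        frozenset(subset)
        for size in range(len(items) + 1)
        for subset in combinations(items, size)
    )

def stage(n):
    """Return the n-th finite stage: start from the empty set, apply power_set n times."""
    v = frozenset()
    for _ in range(n):
        v = power_set(v)
    return v

print([len(stage(n)) for n in range(5)])  # [0, 1, 2, 4, 16]
```

The next stage already has 65,536 elements, and the one after that has $2^{65536}$.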

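An I3 rank-into-rank cardinal is an ordinal $\lambda$ such that $V_\lambda$ admits a nontrivial elementary embedding into itself.

Very roughly, this means the infinity $\lambda$ is so huge that the collection $V_\lambda$ of sets that can be built by this stage can be mapped into itself, in a one-to-one but not onto way, into a smaller collection that’s indistinguishable from the original one when it comes to the validity of anything you can say about sets!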

More precisely, a nontrivial elementary embedding of $V_\lambda$ into itself is a one-to-one but not onto function

$f \colon V_\lambda \to V_\lambda$

that preserves and reflects the validity of all statements in the language of set theory. That is: for any sentence $\phi(a_1, \dots, a_n)$ in the language of set theory, this statement holds for sets $a_1, \dots, a_n \in V_\lambda$ if and only if $\phi(f(a_1), \dots, f(a_n))$ holds.

I don’t know why, but an I3 rank-into-rank cardinal, if it’s even consistent to assume one exists, is known to be extraordinarily big. What I mean by this is that it automatically has a lot of other properties known to characterize large cardinals. It’s inaccessible (which is big) and ineffable (which is bigger), and measurable (which is bigger), and huge (which is even bigger), and so on.

How in the world is this related to shelves?

The point is that if

$f, g \colon V_\lambda \to V_\lambda$

are elementary embeddings, we can apply $f$ to any set in $V_\lambda$. But in set theory, functions are sets too: sets of ordered pairs. So, $g$ is a set. It’s not an element of $V_\lambda$, but all its subsets $g \cap V_\alpha$ are, where $\alpha < \lambda$. So, we can define

$f \rhd g = \bigcup_{\alpha < \lambda} f(g \cap V_\alpha)$

Laver showed that this operation distributes over itself:

$f \rhd (g \rhd h) = (f \rhd g) \rhd (f \rhd h)$

And, he showed that if we take one elementary embedding and let it generate a shelf by this operation, we get the free shelf on one generator!

The shelf I started out describing, the numbers $\{1, 2, \dots, 2^n\}$ with

$a \rhd 1 = a + 1 \bmod 2^n$

also has one generator, namely the number 1. So, it’s a quotient of the free shelf on one generator by one relation, namely the above equation.

That’s about all I understand. I don’t understand how the existence of a nontrivial elementary embedding of $V_\lambda$ into itself implies that the function $P$ goes to infinity, and I don’t understand Randall Dougherty’s lower bound on how far you need to go to reach $P(n) = 32$. For more, read these:

• Richard Laver, The left distributive law and the freeness of an algebra of elementary embeddings, Adv. Math. 91 (1992), 209–231.

This entry was posted on Friday, May 6th, 2016 at 6:40 am and is filed under mathematics. You can follow any responses to this entry through the RSS 2.0 feed.

42 Responses to Shelves and the Infinite

I’m pretty sure “one-to-one” follows from “elementary embedding”; if we have x ∈ y \ z , we must have f(x) ∈ f(y) \ f(z)… but that’s just for the parsimoniously-inclined; some things it really is better to mention up-front.

And now, because echoes are fun to note, … you’ve made some remarks about what does and does not matter to the Ackermann function, and … they sound an AWFUL LOT like what does and does not matter to an I3 rank-into-rank cardinal: an elementary embedding misses some things, as does jumping to 2^(n+3) rather than 2^(n+3)-3; (or to 2^(n+3)-3 from 2^n … ), but the rank-into-rank cardinal at the top of it doesn’t care … but even more emphatically, even more precisely.

… Is it a good thing that the lower bound was A(n,A(m,A(m,k))) rather than A(A(A(m,k),m),n)?

… Is it a good thing that the lower bound was A(n,A(m,A(m,k))) rather than A(A(A(m,k),m),n)?

“Good thing”? I guess it depends on whether you like big numbers or small ones. Since I like big ones, I was sort of disappointed that the bound was of the form A(n,A(m,A(m,k))); this makes it much smaller.

People who like big numbers may want to compare this post to an earlier one:

That post mentions another very large number arising from a simple math problem; Harvey Friedman has proved an upper bound on that given by

This uses his improved version of the Ackermann function, which obeys

and

where we use s.

So, Harvey Friedman’s bound is really big, since we’re iterating a process of making a function much bigger times. Unfortunately it’s an upper bound, just like Graham’s number, so it might become much smaller when we learn more, just as with the upper bound that led to Graham’s number.

gets you the fast-growing hierarchy, which doesn’t resolve into Knuth arrows quite as nicely.

Concerning Friedman’s function, I read over Friedman’s paper not too long ago, and discovered that the same argument that he used to prove could be used to give a lower bound for of roughly . So for example is already much more than the given lower bound of about .

As time goes on, what were once considered really enormous cardinals were later trumped by even bigger ones. This is the reason for the title of Kanamori’s book The Higher Infinite, which at first sounds pompous. (It’s a wonderful book to carry around if you want to impress other mathematicians: none of those puny lower infinities for me, please!)

you’ll see infinities grouped into kinds. The I3 rank-into-rank cardinals are in the upper attic:

Welcome to the upper attic, the transfinite realm of large cardinals, the higher infinite, carrying us upward from the merely inaccessible and indescribable to the subtle and endlessly extendible concepts beyond, towards the calamity of inconsistency.

If you go up there you’ll see ‘remarkable’ cardinals, ‘ethereal’ cardinals, ‘indescribable’ cardinals and more.

The only cardinals I know bigger than the I3 rank-into-rank cardinals are the I2, I1 and I0 rank-into-rank cardinals. These are all watered-down versions of the Reinhardt cardinals, which were proved to be inconsistent with the axiom of choice thanks to Kunen’s inconsistency theorem.

I know you won’t follow all those links, since our lives are not infinite… so I’ll just say here: it’s inconsistent with the axiom of choice to assume there’s an elementary embedding of the whole universe into itself! So, these watered-down cardinals assume there’s an elementary embedding of some ‘stage’ of building the universe into itself.

In short: if the axiom of choice holds, the universe of all sets is too big to fit inside a copy of itself… or, if you prefer, it’s too small to contain a copy of itself!

There’s a subtle point at the end. In general, one large cardinal axiom A can be higher in consistency strength than another B without A actually implying B, and I think that’s the case here. Even though the existence of an inaccessible cardinal is much lower in consistency strength than the existence of an I3 cardinal, I believe that if there is an elementary embedding $j \colon V_\lambda \to V_\lambda$, then $\lambda$ is not inaccessible. After all, if $\lambda$ were inaccessible, then $V_\lambda$ would model ZFC, and $j$ would be a non-identity elementary embedding from the universe to itself in this model of ZFC. But Kunen showed this is inconsistent. The I3 axiom is supposed to be a weakening of Reinhardt’s axiom saying that there’s a nontrivial elementary embedding from the universe to itself.

Oh, wow! Thanks—as you can see, I’m just learning this stuff. So an I3 rank-into-rank cardinal can’t be measurable either, since measurable cardinals are inaccessible. In what sense is an I3 cardinal ‘large’ then, apart from consistency strength? If we could prove it’s larger than some measurable cardinal, that would be fine.

I’m purely going by what I’m reading on the internet at this point, but when talking about rank-into-rank cardinals there are really two cardinals that might be considered to be “the large cardinal”. If there is a non-identity elementary embedding $j \colon V_\lambda \to V_\lambda$, then $\lambda$ might be considered a large cardinal even though it is not inaccessible. Or the critical point of $j$ may be considered a large cardinal. This is the smallest cardinal $\kappa$ such that $j(\kappa) \neq \kappa$. It turns out that this critical point of $j$ is inaccessible, measurable, huge, etc.

The rank-into-rank cardinals and their relation with finite self-distributive algebras are truly majestic. While exciting results about the classical Laver tables have been proven in the 1990’s, unfortunately, from the late 1990’s until the 2010’s no one has proven any interesting results about the classical Laver tables, but in the last couple years a couple papers about the classical Laver tables have been published. These papers include Victoria Lebed and Patrick Dehornoy’s paper which calculates the cocycle groups of the classical Laver tables and give some invariants on positive braids. Lebed and Dehornoy’s work is a good step towards possibly applying Laver tables and hence large cardinals to knot theory and braid theory. I am looking forward to when set theory has more applications in these subject areas and other subject areas. Furthermore, I am looking forward to more results about finite structures or results in knot theory or other areas whose only known proof relies on very large cardinal axioms as hypotheses.

John Baez. Yes. The first rank-into-rank cardinal is much larger than the first measurable cardinal. However, when people talk about rank-into-rank embeddings $j \colon V_\lambda \to V_\lambda$ there are two cardinals in question, namely $\lambda$ and the critical point $\kappa$, which is the first ordinal with $j(\kappa) > \kappa$. The cardinal $\lambda$ is always a singular strong limit cardinal of countable cofinality, so $\lambda$ is not even inaccessible. On the other hand, the cardinal $\kappa$ is measurable (and huge and much more). The term “rank-into-rank cardinal” could also refer to the critical point $\kappa$ as opposed to $\lambda$. Most large cardinal axioms (like measurable, supercompact, and so forth) refer to the critical points of certain elementary embeddings rather than other cardinals involved, so when people hear the term rank-into-rank cardinal they may think about the critical point $\kappa$ rather than $\lambda$ (unless defined otherwise of course).

On another note, the best way to understand the seeming discrepancy between the consistency strength and size of various large cardinals is that cardinals higher in the large cardinal hierarchy imply the existence of very nice models which satisfy large cardinal axioms lower in the large cardinal hierarchy, and these very nice models are in many cases initial segments $V_\kappa$ of the universe where $\kappa$ is a large cardinal. For example, the existence of a rank-into-rank embedding does not imply the existence of a strongly compact cardinal, since the first strongly compact cardinal could be larger (in size) than the first rank-into-rank cardinal. However, if $j \colon V_\lambda \to V_\lambda$ is a rank-into-rank embedding with critical point $\kappa$, then $V_\kappa$ models “there are many strongly compact cardinals”.

Tim Campion. A cardinal $\lambda$ with $V_\lambda \models \mathrm{ZFC}$ does not preclude the existence of a rank-into-rank embedding from $V_\lambda$ to $V_\lambda$. In fact, there are elementary embeddings from models of ZFC into themselves at much lower levels of the large cardinal hierarchy. For example, the axiom “$0^\sharp$ exists” (a large cardinal axiom between indescribable and measurable) is equivalent to saying there is an elementary embedding $j \colon L \to L$ (recall that $L$ is the constructible universe, which is the smallest inner model of ZFC). The reason why we can obtain such an elementary embedding from $L$ to $L$ is that $0^\sharp$ implies that $V \neq L$ and also that $L$ is much smaller than $V$. Therefore, even though we know that there is no elementary embedding from $V$ to $V$, there could still be an elementary embedding from $L$ to $L$ since initial segments of the elementary embedding live outside of $L$.

Furthermore, if $j \colon V_\lambda \to V_\lambda$ with critical point $\kappa$, then $V_\kappa$ is actually an elementary substructure of $V_\lambda$. Since $\kappa$ is inaccessible, $V_\kappa$ models ZFC, so $V_\lambda$ does as well.

I don’t feel like trying to generate one of those P(n)’s, so I can’t be sure it can always be done in an algorithmic way… but if you promise me it can be, doesn’t that make it inevitable that for any n, there is a proof in PA that P(n) has a particular value? (Which consists of evaluating all those 2^n expressions and pointing out their period.)

(OTOH if you say it can’t be done by an algorithm (which is provably correct, by noncontroversial axioms), then I’m unlikely to consider P(n) well-defined.)

Thanks for the algorithm, Layra! I’m too sleepy to think about it (it’s 4:17 am here and I just woke up from a nightmare in which my dead father turned out to be living at the bottom of a concrete pit), so I just fixed your LaTeX. When you post comments here using LaTeX, take a look at the instructions that appear above the box you type into. They say:

You can use Markdown or HTML in your comments. You can also use LaTeX, like this: $latex E = m c^2 $. The word ‘latex’ comes right after the first dollar sign, with a space after it.

I don’t feel like trying to generate one of those P(n)’s, so I can’t be sure it can always be done in an algorithmic way…

I tried computing one of the first ones, in my head, and it was quite far-out. I recommend giving it a try. However, I have the feeling that it can be done algorithmically. There’s a MathOverflow question about this:

A Laver table is the multiplication table for $\rhd$ in one of the shelves that I was talking about.

Answers to this question indicate that currently the largest Laver table that’s been computed is for If this is described as a matrix of numbers between and then it would take quite a lot of memory to store all these numbers even if the computation itself is easy. I don’t know if anybody is able to compute P(n), namely the period of the first row of the table, more efficiently than working out the whole table! But I get the impression that the tables are perfectly computable, just tiring to compute.

but if you promise me it can be, doesn’t that make it inevitable that for any n, there is a proof in PA that P(n) has a particular value?

Yes, that would be inevitable.

However, it would not be inevitable that PA can prove “P(n) goes to infinity as n → ∞”, which is one of the facts that’s been proved using ZFC and a large cardinal axiom.

If ZFC plus this large cardinal axiom is consistent, then we know there must exist N such that P(N) = 32, and thus there must be a proof in PA that P(N) = 32 for this N.

However, it will then also be true that N ≥ A(9,A(8,A(8,255))). So, it’s conceivable that the shortest proof in PA that P(N) = 32 requires calculating a Laver table that is of size A(9,A(8,A(8,255))) × A(9,A(8,A(8,255))). In this case, we’ll never actually find that proof!

Of course, we should hope that there’s a shorter proof, and try to find it. Indeed we should try to find a proof in PA that P(n) → ∞ as n → ∞

If the large cardinal axiom is not consistent with ZFC, it could turn out that P(29) = 32. That would be amusing. (The long list of 16’s in my post was based on assuming the large cardinal axiom is consistent.)

Randall Dougherty showed that the smallest n with P(n) ≥ k grows very fast: faster than any primitive recursive function. (The Ackermann function is not primitive recursive.) There is a somewhat popular weakened form of Peano arithmetic called primitive recursive arithmetic, and Dougherty and Jech have shown that some basic properties of Laver tables can’t be proved in this system.

This might be a step toward proving that some properties of Laver tables can’t be proved in PA… but there are no results of that sort, yet. On the contrary, Dougherty has been trying to prove that P(n) → ∞ as n → ∞ in a system of arithmetic weaker than PA.

The sequence https://oeis.org/A098820 goes up to P(56) = 16. Do you have a link to Dougherty’s proof? (None of the three linked papers appeared to contain it.) Unless it depends on the large cardinal axiom, P(n) = 16 for all n for which we could possibly ever compute the full table, regardless of whether the large cardinal axiom is consistent with ZFC.

OTOH if you say it can’t be done by an algorithm (which is provably correct, by noncontroversial axioms), then I’m unlikely to consider P(n) well-defined.)

By the way, as you probably know, the Busy Beaver function is uncomputable, yet specific values of this function have been computed by utterly noncontroversial methods. So, we should be a bit careful. We can make up functions F(n) that are uncomputable, yet we can compute their value for all n < A(A(A(10,10),10),10).

But this is just a nitpick. I’m now convinced, by what Joseph van Name wrote, that P(n) is computable in the usual sense: there’s a program that computes its value for any n.

It’s actually very simple to see that the entire table is computable, granted the result that there is a unique table for any power-of-2 size. You just start with a table with empty values. You fill in the known values (defined by the n-triangle-1 = (n+1) mod 2^n relation).

Then you start making passes over all triples (a,b,c), inferring values of the triangle relation from the shelf axiom whenever one side of the equation has known values and the other side is unknown.

Make as many passes as you need, until you stop inferring new values. At that point the entire table must be filled in (otherwise there would not exist a unique shelf satisfying the constraints).

It’s not efficient, of course, but it is polynomial in 2^n (somewhere between (2^n)^3, which is the time each pass takes, and (2^n)^5, because you’ll never need more than (2^n)^2 passes, which is the number of inferences that need to be made).
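Here is a rough Python sketch of the pass-based procedure just described, for anyone who wants to experiment. The function name and the dictionary representation are my own choices for illustration; the code seeds the table with a ▷ 1 = a + 1 mod 2^n, then repeatedly scans all triples (a, b, c) and copies a value across the shelf axiom whenever exactly one side is known. It raises an error if it ever gets stuck with the table incomplete.

```python
def laver_table_by_propagation(n):
    """Fill the 2^n x 2^n table by repeated passes over the shelf axiom."""
    size = 2 ** n
    known = {}                              # (a, b) -> value of a ▷ b
    for a in range(1, size + 1):
        known[(a, 1)] = a % size + 1        # a ▷ 1 = a + 1 mod 2^n, in 1..2^n
    changed = True
    while changed:
        changed = False
        for a in range(1, size + 1):
            for b in range(1, size + 1):
                for c in range(1, size + 1):
                    bc = known.get((b, c))
                    ab = known.get((a, b))
                    ac = known.get((a, c))
                    if bc is None or ab is None or ac is None:
                        continue
                    lhs = known.get((a, bc))    # a ▷ (b ▷ c)
                    rhs = known.get((ab, ac))   # (a ▷ b) ▷ (a ▷ c)
                    if lhs is None and rhs is not None:
                        known[(a, bc)] = rhs
                        changed = True
                    elif rhs is None and lhs is not None:
                        known[(ab, ac)] = lhs
                        changed = True
    if len(known) != size * size:
        raise RuntimeError("propagation stalled before the table was complete")
    return [[known[(a, b)] for b in range(1, size + 1)]
            for a in range(1, size + 1)]

for row in laver_table_by_propagation(2):
    print(row)
```

For small n this does fill the whole table: the instances with c = 1 alone give a ▷ (b + 1) = (a ▷ b) ▷ (a + 1), which is enough to complete the rows one at a time starting from the last.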

If you just start working it out for arbitrary $N = 2^n$, you can see that it’s quite straightforwardly computable. Compute $N \rhd i$ for $i$ from 1 to $N$. Then compute $(N - 1) \rhd i$ for all $i$, then $(N - 2) \rhd i$ for all $i$, etc.

$N \rhd i = i$ for all $i$
$(N - 1) \rhd i = N$ for all $i$
$(N - 2) \rhd i = N - 1$ for odd $i$, $N$ for even $i$
and so on.
One nicety: $i \rhd N = N$ for all $i$.

The gap I’m worried about is not the uniqueness. It’s whether the uniqueness implies that you can always make an inference directly from a single instance of the distributive law. That is, can you get a Sudoku-style situation where there remain unknown values, and you can’t infer any new ones directly from any single instance of the distributive law based on known values, but nevertheless only one way of filling in the values makes the whole table satisfy the law?

I have a vague memory that that may not be possible in equational logic, but I can’t quite remember why. Maybe you can use completeness of first-order logic plus elimination of quantifiers, or maybe you can get it by reasoning about the free object satisfying the equational relations, but I just can’t immediately fill in the details.

But anyway, it’s not important except to see whether my polynomial-time algo works. If all you want is computability, you can easily do that (in super-exponential time) by exhaustively enumerating all possible tables, and seeing which one satisfies the constraints.

The largest classical Laver table computed is actually . The 48-th table was computed by Dougherty and the algorithm was originally described in Dougherty’s paper here. With today’s technology I could imagine that one could compute if one has access to a sufficiently powerful computer. One can compute the classical Laver tables up to the 48-th table on your computer here at my website.

Most of the notions concerning Laver tables can be studied in a purely algebraic manner including seemingly set theoretic notions such as the notion of a critical point. In particular, the fact that “if P(n)=32 then n must be extremely large” is a fact that can be proven in a purely algebraic manner without any reference to large cardinals. Therefore, even if large cardinals are inconsistent, an instance n of where P(n)=32 must still be at least Ack(9,Ack(8,Ack(8,255))). Thus, the long list of 16’s is a true statement even in Peano Arithmetic.

Yes. Dougherty’s proof that if P(n)=32 then n is big mentioned in this paper does not necessarily need to use any set theory. Dougherty’s proof extensively uses properties of critical points of elementary embeddings, but these properties of critical points can be proven algebraically without any reference to set theory. If , then the critical point can be defined algebraically as . The fact that the critical points of algebras of elementary embeddings can be studied algebraically was originally known by Laver in the early 1990’s. The translation between set theoretic notions and algebraic notions has been discussed in Dehornoy’s book Braids and Self-Distributivity in Chapter 13.

Hmm, the papers you link to do not seem to say anything about when the period of a Laver table reaches 32. What facts about F(n) (the number of critical points of members of lying below ) give us lower bounds for when the periods of Laver tables reach 2^n?

I was thinking about why large cardinal axioms have effect in number theory and also why complex analysis is so powerful a number theoretic tool. I note that lots of free things and initial things are countable because of the countability of the languages we use to describe them. When we do the sort of linguistic gymnastics required to utter a large cardinal axiom we fold up the language in increasingly complicated ways. This gives rise to ordinals in proof theory. So perhaps it is the very large countable ordinals that we should be discussing. https://en.wikipedia.org/wiki/Large_countable_ordinal

There are fascinating relations between large cardinals, large countable ordinals and large finite numbers. This is one of the things I’d like to understand better. Since everything we actually do involves writing finite symbol strings, it’s perhaps not so surprising, but still, getting a really clear grip on it seems hard.
