Pure sets

Motivation

In material set theory, there is an intuitive conception of what a set is, which may be stated informally as follows: a set is a collection of sets. Actually, it is possible to have urelements in a material set theory (such as ZFA), although the most common axiom systems do not allow this; in any case we can say that a pure set is a collection of pure sets.

The primary motivation for structural set theory is that this conception of a set is not needed in ordinary mathematics; it is sufficient to characterise the category of sets (although a structural set theory can also be described in ways other than category-theoretic). However, material set theory is itself part of mathematics, and we may want to describe the material notion of a pure set in structural terms.

There is also a very practical point to this exercise: the translation between material and structural set theories. Any model of a material set theory is already a model of a corresponding structural set theory, but we go through the yoga below to construct a model of a material set theory out of a model of a structural set theory. In particular, we must do this to prove that two set theories (one material and one structural) are equiconsistent.

Idea

Taking a structural notion of set as granted, we wish to define a pure set to be a set of pure sets. On the face of it, this is a circular definition, but like many such definitions, it can be made precise in (at least) two ways: recursively and corecursively.

The ‘recursive’ meaning is that all pure sets must be constructed, starting from nothing, as sets of other previously constructed pure sets. This results in the well-founded pure sets. Thus, at first, the only well-founded set possible is the empty set∅={}\empty = \{ \}, the set of no pure sets. Once you have that, you can form ⋆={∅}\star = \{\empty\}, then {⋆}\{\star\} and {∅,⋆}\{\empty,\star\}, and so on. In this way we can obtain at least all hereditarily finite sets; if we use an axiom of infinity as well, we can jump to the countably infinite set of all hereditarily finite sets and continue from there.

The ‘corecursive’ meaning is that any pure set can be deconstructed into a set of other pure sets, and every possible such deconstruction defines a unique pure set. In this way we obtain not just well-founded pure sets but also ill-founded pure sets. (The inclusion of well-founded sets among the ill-founded sets is an example of the red herring principle.) Possible examples of non-well-founded sets include a set •\bullet such that •={•}\bullet = \{\bullet\} (a suggestive model for the point), or sets AA and BB such that A={B}A = \{B\} and B={∅,A}B = \{\empty, A\}.

If the foundation is ‘material’ or ‘membership-based,’ such as ZFC, then the elements of a set can, in fact, be other sets. Therefore, in this case we are defining (either recursively or corecursively) an adjective “pure” that can be applied to the noun “set:” a set is pure if at no point in its construction or deconstruction into elements do we encounter anything that is not a pure set.

In the most common material set theories, such as ZFC, all sets are pure, since the only ‘things’ the theory deals with (hence the only things that can be elements of sets) are sets. However, there are easy modifications of these theories that allow ‘atoms’ or urelements that are not sets, and in this case the pure sets will be those that ‘hereditarily’ contain no atoms. Many common material set theories (starting with von Neumann 1925, Zermelo 1930) also include an axiom of foundation asserting that all (pure) sets are well-founded; the dual axiom of anti-foundation (due to Aczel) allows and ‘tames’ the ill-founded sets.

If the foundation is ‘structural’ or ‘categorial’, such as ETCS or SEAR, then the elements of a set cannot be other sets. Thus, in this case we are defining a single noun “pure set,” with no a priori relation to the structural notion of “set” that occurs in the foundational theory. A pure set, according to this definition, is a set equipped with structure assigning to each of its elements another pure set (interpreted either recursively or corecursively).

From this point of view, the definition of pure set provides a construction of a model of material set theory within a model of structural set theory: a global relation of ∈\in can be defined between pure sets which will satisfy the axioms of some material set theory. (Of course, the sets in a model of material set theory always model a structural set theory, so this is the ‘difficult’ direction of the equivalence between the two types of set theory.) Whether we use the recursive or corecursive definition determines whether the resulting material set theory will satisfy the axiom of foundation or the axiom of anti-foundation. Note that using these two choices, the composite operation “material →\to structural →\to material” reduces to the standard proofs of the relative consistency of the axioms of foundation and anti-foundation, since the passage from material to structural requires neither one.

Formalisation in material set theory

In material set theory without urelements, every set is a pure set. If there are urelements, so that not all sets are pure, then it is easy to define the class of pure sets as follows:

Definition

A set xx is pure if given any sequence xn∈xn−1∈…∈x1∈x0=xx_n \in x_{n-1} \in \dots \in x_1 \in x_0 = x, all of the xix_i are sets.

Whether this produces the well-founded pure sets or the ill-founded ones depends on whether the axiom of foundation is satisfied in the ambient set theory.

Formalization in structural set theory

In a structural set theory like ETCS or SEAR, we can model a pure set by a graph describing its hereditary membership relation.

One way to state the basic theoretical idea is that the class of well-founded sets is the initial algebra of the covariant power set functor, while the class of ill-founded sets is the terminal coalgebra of the same functor. Of course, neither of these algebras exists as a set, since this would violate Cantor's theorem, but we can still describe what their elements would be like. We can also define these algebras as discretelarge categories, or as proper classes in a structural set-class theory such as algebraic set theory.

The discussion which follows is phrased informally, like most mathematics. However, it is purely structural and can be interpreted in any structural set theory. For instance, the definition below of a “graph” as a set with a binary relation should be formalized as a set GG together with an injection R↪N×NR \hookrightarrow N\times N.

Membership graphs

To describe a pure set, we must give all the elements of that set, each of which is a pure set and thus must have its own elements, which are also pure sets, and so on. A convenient way to “picture” a pure set is with a graph.

Definition

For the purposes of this page, a graph will mean a set GG of nodes equipped with a binary relation→\to on the nodes. A node ii is called a child of a node jj if i→ji \to j.

The idea of a graph-picture of pure sets is that the nodes represent pure sets, and we have i→ji\to j precisely when i∈ji\in j. Clearly the membership relations between any collection of pure sets should be representable by a graph. To what extent the converse holds, i.e. which graphs can be interpreted as membership-relations between pure sets, depends on whether the pure sets in question are well-founded or ill-founded. For purposes of drawing well-founded pure sets, only well-founded graphs should be used, whereas arbitrary graphs can be used to draw ill-founded sets. (In fact, one form of the axiom of anti-foundation is that any graph is the picture of some pure sets.)

Definition

A set SS of nodes in a graph is →\to-inductive if whenever all children of some node ii are in SS, then ii itself is in SS. A graph is well-founded if the only →\to-inductive set of nodes is the set of all nodes.

Assuming the principle of excluded middle, well-foundedness is equivalent to saying that the tree has no infinite paths.

If we want to use a graph to describe a specific pure set, then we should single out one particular node as representing that set. Moreover, it makes sense to demand that the graph contain no “superfluous” data other than what is necessary to describe the hereditary membership structure of the particular set in question.

Definition

A graph is pointed if it is equipped with a specified node ⊤\top called the root. A pointed graph is accessible if for every node xx, there exists a path x=x0→x1→…→xn=⊤x = x_0 \to x_1 \to \dots \to x_n =\top to the root; this is equivalent to saying that ⊤\top is a top element for the reflexive-transitive closure of →\to. An accessible pointed graph is abbreviated APG.

An APG gives all the data necessary to characterize a pure set XX, and the pure set in question is well-founded iff the APG is well-founded. We can think of the root as representing XX itself, the root's children as the elements of XX, the next level as those elements' elements, and so on.

If we want to think of these elements as pure sets themselves, then we can “externalize” them into separate APGs. Specifically, for any node zz in a graph GG, the subgraph consisting of all those nodes which admit some path to zz is an APG, which describes the pure set represented by zz; we write it as G/zG/z and call it the full subgraph of GG rooted at zz. If GG is an APG and zz is a child of the root, we say that G/zG/z is an immediate subgraph of GG. The immediate subgraphs of GG represent the pure sets that are the “elements” of the pure set represented by GG. Thus, the APG GG pictures the hierarchy of elements of XX, where each immediate subgraph G/zG/z corresponds to its root zz, and →\to is interpreted as membership.

Examples of APGs

Note that the nodes in most of the pictures below have labels, but these are for illustration only; they are not part of the structure defining the tree.

First, ∅={}\empty = \{ \} consists of nothing but a root:

∅ \empty

The set ⋆={∅}\star = \{\empty\} has the empty set attached to the root (which we draw at the top):

⋆↑∅ \array { \star \\ \uparrow \\ \empty }

Notice that ∅\empty is an immediate subgraph, so we can write ∅∈⋆\empty \in \star.

Then the two most obvious ways of representing the ordinal 22 as a pure set, namely 2Z={⋆}2_Z = \{\star\} and 2N={∅,⋆}2_N = \{\empty, \star\}, can be drawn as follows:

In both cases, the unique immediate subgraphs verify that •∈•\bullet \in \bullet.

Ambiguities

APGs are sufficient to describe all pure sets, but multiple APGs, even non-isomorphic ones, can represent the “same” pure set. Firstly, we have so far not eliminated the possibility of duplicate branches. For instance, the pure set 2Z2_Z can also be represented as {⋆,⋆}\{\star, \star\}:

It is easy to eliminate APGs with such “obviously redundant” branches.

Definition

An APG GG is rigid if whenever xx and yy are two children of the same node zz such that G/x≅G/yG/x \cong G/y, then in fact x=yx=y.

Assuming excluded middle, rigidity is equivalent to saying that any graph automorphism is the identity function. Note also that if two rigid APGs are isomorphic, they must be uniquely isomorphic. Thus, for rigid APGs, being isomorphic is a property, rather than structure.

Even among rigid APGs, however, there are are ambiguities. For example, the pure set 2N2_N can also be drawn by the APG

The issue here is that the “same” pure set can occur in more than one place in the membership hierarchy of a pure set, and we have to choose whether or not to identify them in the graph. There are three solutions to this problem:

Demand that duplicate occurrences of a pure set always be represented separately.

Demand that duplicate occurrences always be identified, wherever they occur.

Define a notion of “equivalence” between APGs which is looser than isomorphism.

These are of course related; the first two can be seen as specifying a particular isomorphism class within each of the equivalence classes defined by the third. In particular, they all result in equivalent notions of pure set, when they all apply. (However, in predicative set theories, it seems that only the first method is applicable.) We will consider them all in turn.

Membership trees

A natural way to ensure that the same node is never “shared” between two occurrences is to require that our APGs be trees.

Definition

An pointed graph is a tree if every node admits a unique path to the root. Equivalently, GG is a tree if its root ⊤\top is a terminal object of the quiver generated by the relation →\to.

A tree may or may not be well-founded. Note, though, that in a tree, the opposite relation←\leftarrow is automatically well-founded: each node has a unique natural numberheight, namely the length of its unique path to the root, so well-foundedness of ℕ\mathbb{N} implies well-foundedness of ←\leftarrow.

It now makes sense to define a pure set to be a rigid accessible pointed tree. The idea is that two such trees represent the same pure set iff they are isomorphic. We define “membership” G∈HG\in H between such trees to mean that GG is isomorphic to some immediate subtree of HH (that is, a tree H/zH/z where zz is a child of the root of HH).

For example, the tree representation of 2N2_N is the one we gave first:

A tree models a hereditarily finite set if every node has finitely many children; a tree models a well-founded hereditarily finite set if and only if it is finite and →\to is decidable. The relationship between these two facts and the no-infinite-path formulation of well-foundedness is a form of Kőnig's Lemma?.

Given these definitions, one can prove the various axioms of material set theory. Specifically, if ETCS holds for the structural sets (of nodes, etc) used in this definition, then BZC−\mathbf{BZC}^- (which is ZFC with only bounded separation and no axioms of replacement or foundation) holds for rigid trees, while BZC\mathbf{BZC} (BZC−\mathbf{BZC}^- together with foundation) holds for well-founded rigid trees. Similar results hold for constructive weakenings of ETCS and BZC−\mathbf{BZC}^-, since these definitions involve only function-sets and not power-sets. Likewise, we can strengthen both sides with axioms of replacement, collection, or separation, or by adding Grothendieck universes.

The argument for the axiom of extensionality is perhaps the most interesting; it goes as follows. If each immediate subtree G/xG/x of GG is isomorphic to an immediate subtree H/yH/y of HH and conversely, then yy and xx are mutually uniquely determined by rigidity, so we have a bijection between the children of the roots of GG and of HH. Since the isomorphisms G/x≅H/yG/x \cong H/y are also unique, again by rigidity, and each G/xG/x is disjoint from G/x′G/x' for x≠x′x\neq x', by the tree property, we can piece together these isomorphisms to define an isomorphism G≅HG\cong H. (In particular, the uniqueness of all these isomorphisms means that the axiom of choice is not needed.)

(Note, though, that this is only the “weak” version of extensionality. This is sufficient for well-founded sets, but we may need to strengthen rigidity somehow to obtain stronger notions of extensionality, which are generally more appropriate for ill-founded sets. More precisely, the well-founded rigid trees model the usual axioms of extensionality and foundation, while arbitrary rigid trees model not Aczel’s axiom of anti-foundation AFA, but its “Scott” variant, SAFA.)

Extensional graphs

The “tree” representation can be thought of as an unwrapped picture; we now consider a wrapped picture in which there is no duplication at all. The natural way to ensure this is to require our graphs to be extensional. For well-founded graphs, this is easy: a well-founded graph is extensional if for any nodes xx and yy such that for all zz, we have z→xz\to x iff z→yz\to y, then necessarily x=yx=y. For non-well-founded graphs, we need the notion of “strong extensionality” defined as follows.

Definition

If GG and HH are graphs, a bisimulation from GG to HH is a binary relation ∼\sim from GG to HH such that for any x∈Gx\in G and y∈Hy\in H with x∼yx\sim y, then:

For any a→xa\to x, there is some b∈Hb\in H with b→yb\to y and a∼ba\sim b, and

For any b→yb\to y, there is some a∈Ga\in G with a→xa\to x and a∼ba\sim b.

A graph GG is extensional if the largest bisimulation from GG to GG is the identity.

The idea is that a bisimulation identifies pairs of nodes which “represent the same pure set.” One can show that a well-founded graph is extensional in this sense iff it is extensional in the weaker sense given above. Note also that any extensional APG is necessarily rigid, since any isomorphism G/x≅G/yG/x \cong G/y induces a bisimulation.

Thus, it also makes sense to define a pure set to be an extensional APG. Two such APGs represent the same set iff they are isomorphic, and we define membership in the same way: G∈HG\in H if G≅H/xG\cong H/x for some child xx of the root of HH. Of course, one can again prove the axioms of material set theory from this definition. If we restrict to well-founded extensional APGs, we get the axiom of foundation, while if we allow all extensional APGs, we get Aczel’s axiom af anti-foundation (along with its strong version of the axiom of extensionality). Note that this definition is not predicative, since the notion of extensionality involves quantification over bisimulations, which are relations, i.e. subsets. Thus, it requires at least limited separation.

Now an extensional accessible graph models a hereditarily finite set iff the graph itself is finite and →\to is decidable.

Equivalence

Finally, we consider the third solution to the ambiguity problem, and its relation to the first two: defining a looser notion of equivalence between APGs. Recall the notion of bisimulation. We say that two APGs GG and HH are equivalent if there is a bisimulation ∼\sim from GG to HH such that ⊤∼⊤\top\sim\top. Note that if ∼\sim is a bisimulation from GG to HH such that x∼yx\sim y, then the APGs G/xG/x and H/yH/y are equivalent; this justifies the interpretation of a bisimulation.

We can now define a pure set to be an APG, with equality represented by equivalence in the above sense. Likewise, we define “membership” G∈HG\in H to mean that GG is equivalent to some immediate subgraph of HH. One can again prove the various axioms of material set theory with these definitions (either using well-founded APGs or arbitrary ones). As with extensional APGs, in this case we obtain the strong notion of extensionality (which is equivalent to the weak one for well-founded sets). This definition is also not predicative.

The relationship of this approach to the previous ones is as follows.

Every APG is equivalent to an extensional one. Namely, let ≈\approx be the maximal bisimulation on GG (the union of all bisimulations); it is an equivalence relation, and its quotient G/≈G/\approx inherits the structure of an extensional APG which is equivalent to GG. This is the extensional quotient; see extensional relation.

Two extensional APGs are equivalent iff they are isomorphic. For if ∼\sim is a bisimulation from GG to HH, then defining x≈yx\approx y in GG if there exists a node zz in HH with x∼zx\sim z and y∼zy\sim z gives a bisimulation on GG, which must be the identity, and similarly for HH; hence ∼\sim must in fact be a bijection. Thus, the notion of “pure set” obtained from “extensional APGs and isomorphism” is the same as that obtained from “arbitrary APGs and equivalence.”

Every APG is equivalent to one that is a tree, its “unwrapping”. The nodes of the tree can be taken to be the finite paths xn→…→x0=⊤x_n \to \dots \to x_0 = \top ending at the root, with any “one-level extension” xn+1→xn→…→⊤x_{n+1} \to x_n \to \dots \to \top being a child of xn→…→⊤x_n \to \dots\to \top. The bisimulation relates each path xn→…→⊤x_n \to \dots\to \top to its initial node xnx_n.

Moreover, every APG is equivalent to a rigid accessible pointed tree. The key here is to first take the extensional quotient, then perform the above “unwrapping” construction: the unwrapping of any extensional graph will always be rigid.

Two well-founded rigid accessible pointed trees are equivalent iff they are isomorphic. For if ∼\sim is a bisimulation from GG to HH, define a new relation ≃\simeq from GG to HH such that x≃yx\simeq y iff there are paths from xx and yy to the corresponding roots which are pointwise related by ∼\sim, i.e. we have x=xn→xn−1→⋯→x1→x0=⊤x = x_n \to x_{n-1} \to \cdots \to x_1 \to x_0 = \top and y=yn→yn−1→⋯→y1→y0=⊤y = y_n \to y_{n-1} \to \cdots \to y_1 \to y_0 = \top, with xi∼yix_i \sim y_i. We can then prove by well-founded induction that whenever x≃yx\simeq y, then ≃\simeq is a bijection between the children of xx and of yy. For assume that all the children of xx and yy have this property. Then if we have x′→xx' \to x and x″→xx''\to x with x′∼yx' \sim y and x″∼yx'' \sim y, then by hypothesis ∼\sim is a bijection between G/x′G/x' and H/yH/y, and between G/x″G/x'' and H/yH/y, hence we have G/x′≅G/x″G/x' \cong G/x''; thus by rigidity x′=x″x'=x''. Therefore, ≃\simeq is a bijection between the children of xx and the children of yy. The construction of ≃\simeq then ensures that for any child x′→xx'\to x, ≃\simeq relates each node in X/x′X/x' to a node in at most one Y/y′Y/y'; hence ≃\simeq is in fact a bijection from X/xX/x to Y/yY/y. Thus, by induction, ≃\simeq is a bijection.

In order to extend this last result to non-well-founded trees, we would need to use a stronger notion of rigidity. (This makes sense since SAFA is incompatible with AFA. We could expect to be able to use “Scott-extensionality” in place of extensionality to obtain a theory equivalent to that of rigid trees.) For example, the following two ill-founded trees are equivalent but not isomorphic, yet both are rigid.

In conclusion, there are two extreme ways to represent a pure set by an isomorphism class of graphs, but any accessible pointed graph will represent a pure set one way or another. For well-founded sets, the two extreme ways are equivalent, but for ill-founded sets the choice between them can make a difference (and in fact, there are a number of other possible variants, corresponding to the spectrum of anti-foundation axioms discussed by Aczel).

Note that the version on the left is the unwrapping of the loop; we would like a stronger version of rigidity that would rule out the version on the right. As it is, the best that we have come up with is simply: isomorphic to the unwrapping of some extensional APG. Technically this works, but it is not very satifsying.