The elements of an inductive type

In the HoTT book issue 460 a question by gluttonousGrandma (where do people get these nicknames?) once more exposed a common misunderstanding that we tried to explain in section 5.8 of the book (many thanks to Bas Spitters for putting the book into Google Books so now we can link to particular pages). Apparently the following belief is widespread, and I admit to holding it a couple of years ago:

An inductive type contains exactly those elements that we obtain by repeatedly using the constructors.

If you believe the above statement you should keep reading. I am going to convince you that the statement is unfounded, or that at the very least it is preventing you from understanding type theory.

Let us consider the most famous inductive type, the natural numbers $\mathtt{nat}$. It has two constructors:

the constant $0$, called zero

the unary constructor $S$, called successor
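In Lean-style syntax (just a sketch; the name MyNat is mine, chosen to avoid clashing with the built-in type) the definition reads:

```lean
inductive MyNat : Type
  | zero : MyNat               -- the constant 0
  | succ : MyNat → MyNat       -- the unary successor S
```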

So one expects that the elements of $\mathtt{nat}$ are $0$, $S(0)$, $S(S(0))$, …, and no others. If our type theory is good enough, we can prove a meta-theorem:

Metatheorem: Every closed term of type $\mathtt{nat}$ is of the form $S(S(\cdots S(0) \cdots ))$.

(Incidentally, before another reddit visitor finds it productive to interpret “is of the form” as syntactic equality, let me point out that obviously I mean definitional equality.) This can be a very, very misleading metatheorem if you read it the wrong way. Suppose I told you:

Metatheorem: There are countably many closed terms of type $\mathtt{nat} \to \mathtt{nat}$.

Would you conclude that the type of functions $\mathtt{nat} \to \mathtt{nat}$ is countable? How would that reconcile with the usual diagonalization proof that $\mathtt{nat} \to \mathtt{nat}$ is uncountable?
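The classical diagonalization construction can be sketched in a few lines of Python (the helper name is mine; this is just the standard argument, not anything specific to the post):

```python
def diagonal(enum):
    """Given an enumeration n -> (nat -> nat), return a function that
    differs from the n-th enumerated function at input n."""
    return lambda n: enum(n)(n) + 1
```

Whatever enumeration you feed it, diagonal(enum) disagrees with enum(n) at input n, so no enumeration lists all functions nat -> nat.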

Notice that I am carefully labeling above as “metatheorems”. That is because they are telling us something about the formal language that we are using to speak about mathematical objects, not about mathematical objects themselves. We must look at the objects themselves, that is, we must interpret the type in a model.

Let us call a model exotic if not every element of (the interpretation of) $\mathtt{nat}$ is obtained by finitely many uses of the successor constructor, starting from zero. Here are some non-exotic models:

In set theory the natural numbers are just the usual ones $\mathbb{N}$.

In the category of topological spaces and continuous maps the natural numbers are the usual ones, equipped with the discrete topology. (Exercise: why not the indiscrete topology?)

In the syntactic category whose objects are types and whose morphisms are expressions, quotiented by provable equality, the above metatheorem tells us that there are no exotic numbers.

And here are some exotic models:

Consider the category of pointed sets. Recall that a pointed set $(A, a)$ is a set $A$ with a chosen element $a \in A$, called the point. A morphism $f : (A, a) \to (B, b)$ is a function $f : A \to B$ which preserves the point, $f(a) = b$. Computer scientists can think of the points as special “error” values. You might try to take $(\mathbb{N}, 0)$ as the interpretation of $\mathtt{nat}$. But this cannot be right because that would force $0$ to be always mapped to the point. In fact, the correct thing is to adjoin a new point $\bot$ to $\mathbb{N}$, so we get $(\mathbb{N} \cup \lbrace \bot \rbrace, \bot)$. The exotic element $\bot$ does not create any trouble because all morphisms are forced to map it to the point.
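As a toy illustration (a Python encoding of my own, with None standing in for the adjoined point $\bot$), the interpretation of $\mathtt{nat}$ in pointed sets looks like $\mathbb{N}$ with one extra, unreachable element:

```python
BOT = None  # the adjoined point, playing the role of an "error" value

def zero():
    return 0

def succ(n):
    # morphisms of pointed sets preserve the point,
    # so succ must send BOT to BOT
    return BOT if n is BOT else n + 1
```

Everything reachable from zero() by succ is an ordinary natural number; BOT is never reached, yet it causes no trouble, precisely because every map is forced to preserve it.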

Consider the category of sheaves on a topological space $X$. The type $\mathtt{nat}$ corresponds to the sheaf of continuous maps into $\mathbb{N}$, where $\mathbb{N}$ is equipped with the discrete topology. The elements corresponding to $0$, $S(0)$, $S(S(0))$, etc. are the constant maps. If $X$ is connected then these are all of them, but otherwise there are going to be many more continuous maps $X \to \mathbb{N}$. For example, if $X = 2 = \lbrace 0, 1 \rbrace$ is the discrete space on two points, then the object of natural numbers in sheaves on $X$ will consist of pairs of natural numbers because $\mathsf{Sh}(2) = \mathsf{Set} \times \mathsf{Set}$.
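Concretely (a Python sketch under my own encoding, using pairs for $\mathsf{Sh}(2) \simeq \mathsf{Set} \times \mathsf{Set}$), the constructors act componentwise, so the elements reachable from zero are exactly the diagonal pairs:

```python
def zero2():
    # the interpretation of 0 in Sh(2): the constant map, i.e. the pair (0, 0)
    return (0, 0)

def succ2(p):
    # successor acts at each point of the space separately
    a, b = p
    return (a + 1, b + 1)
```

Starting from zero2() you only ever reach pairs of the form (n, n); an element like (0, 42) is a perfectly good section of the sheaf, but exotic.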

It would be a crime not to mention the non-standard models, historically the first exotic models (I hope that’s correct). In these the natural numbers obtain many new exotic elements through the ultrapower construction. (To fit these into our scheme you need to consider the entire set-theoretic model formed by an ultrapower, not just an ultrapower of $\mathbb{N}$.)

In all of these $\mathtt{nat}$ gains new elements through a process of completion which makes sure that the structure obtained has certain properties. In the case of pointed sets the completion process makes sure that there is a distinguished point. In the case of sheaves the completion process is sheafification, which makes sure that the natural numbers are described by local information. I do not know in what sense the ultrapower construction is a completion. (The ultrapower is really a two-step process: first we take a power and then a quotient by an ultrafilter. The power is just a special case of sheaves, so that is a completion process. But in what sense is quotienting a completion process?)

The situation is a bit similar to that of freely generated algebraic structures. The group freely generated by a single generator (constructor) is $\mathbb{Z}$, because in addition to the generator we have to throw in its inverse, square, cube, etc. These are not equal to the generator because there is a group with an element which is not equal to its own inverse, square, cube, etc. If the freely generated group is to be free it must make equal only those things which all groups make equal.
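For instance (a Python sketch under my own encoding, where a word in the generator $g$ and its inverse is a list of $+1$s and $-1$s), reduction in the free group on one generator is just integer addition, which is exactly why that group is $\mathbb{Z}$:

```python
def reduce_word(word):
    """A word in the single generator g (encoded +1) and its inverse g^-1
    (encoded -1) reduces to g^k, where k is the sum of the exponents."""
    return sum(word)
```

The generator alone only reaches $g, g^2, g^3, \ldots$; completing to a group throws in the empty word (the identity, $k = 0$) and all the negative powers.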

It is therefore better to think of an inductive type as freely generated. The generation process throws in everything that can be obtained by successive applications of the constructors, and possibly more, depending on what our model looks like.

Once you convince yourself that the natural numbers really can contain exotic elements, and that this is not just some sort of pathology cooked up by logicians, it should be easier to believe that the identity types may be non-trivial.

Let me further address two issues. The first one is this: in type theory we can prove that every natural number is either zero or a successor of something. Does this not exclude the exotic elements? For instance, in sheaves on the discrete space on two points, how is the element $(0, 42)$ either zero or a successor? It is not zero, because that would be $(0, 0)$, but it is not a successor either because its first component is $0$. Ah, but we must remember the local character of sheaves: in order to establish a disjunction it is sufficient to establish it locally on the elements of a cover. In the case of $\mathsf{Sh}(2)$ we just have to do it separately on each component: the first component of $(0, 42)$ is zero, and the second one is a successor, so all is fine. There is a difference between the internal statement “every number is zero or a successor” and its external meaning.
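In code (a Python sketch of my own devising): on each stalk the ordinary case analysis applies, and on $\mathsf{Sh}(2)$ we simply run it componentwise:

```python
def case_analysis(n):
    # on a single stalk, every natural number is zero or a successor
    return ("zero", None) if n == 0 else ("succ", n - 1)

def case_analysis_pair(p):
    # on Sh(2) = Set x Set, establish the disjunction separately at each
    # point of the cover: (0, 42) is "zero" at the first point and a
    # successor (of 41) at the second
    return tuple(case_analysis(n) for n in p)
```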

The second issue is a bit more philosophical. You might dislike certain models, and because of that you deny or ignore arguments that employ them. The reasons for disliking models vary: some people just ignore things they do not know about (“Oh sheaves, I do not know about those, but here is what happens in sets.”), some insist on doing things in certain ways (“We shall only truly understand the Univalence axiom when we have a computational re-interpretation of it in type theory”), and some just believe what they were taught (“Classical set theory suffices for all of mathematics”).

There are any number of theorems in logic which tell us that the “unintended” models are unavoidable (recall for instance the Löwenheim-Skolem theorem). Instead of turning a blind eye to them we should accept them because they allow us to complete an important process in mathematics:

Based on previous mathematical experience, formulate a theory. Examples:

29 thoughts on “The elements of an inductive type”

Thanks for commenting on an important topic. I think the point is that type theory is intended to be a language for describing an existing mathematical reality, and the theorems proved in the language are intended to be about that reality, but focusing on the details of the language tends to obscure that. I was stuck with that impression for half of our special year!

There’s something a bit odd about your definition of “exotic” model, in that it supposes that every model interprets types as things with elements, i.e., as sets. But your examples don’t do that: for example, a sheaf is not a set. A pointed set isn’t even a set.

@Dan: if you pass to models whose objects are not sets, or do not have an underlying set, then you also have to pass to a more general notion of elements. I was going to skip over this issue, but thanks for bringing it up!

The ultrapower construction is a completion in the following sense: adjoin all elements whose first-order properties are contradiction-free.

For instance, the family of inequalities $\lbrace x > n \mid n = 1, 2, 3, \ldots \rbrace$ is contradiction-free, because any finite subset of them certainly has a solution $x$. (In particular, take $x = m+1$ where $m$ is the size of the subset of inequalities considered.) Hence, you can adjoin an element $X$ which fulfills all of these inequalities. It’s very similar to Gödel’s compactness theorem.

The word “successive” has this issue buried within it, as that notion cannot be formalized. It means “a finite number of times”, which requires a notion of “the number of times”, which is done by setting up a mapping from the integers below that count to the applications. But that count might be a nonstandard integer, and there is no way to formalize that possibility away.

I do have two quibbles. First, it’s true that in general one wants to think about generalized elements when interpreting type theory in a category. However, when you say things like “the elements of nat in sheaves on $X$ are the continuous maps $X \to \mathbb{N}$” I think you have to be referring to the global elements, i.e. maps from the terminal object to the sheaf interpreting nat.

Second, probably you just didn’t want to get into this, but in case anyone reading is confused, the statement that “in order to establish a disjunction it is sufficient to establish it locally on the elements of a cover” is really only true for mere disjunctions (i.e. (-1)-truncated), while the disjunction “every natural number is either zero or a successor of something” is provable even purely (i.e. without truncation). To prove a pure disjunction locally, the elements of the cover have to be disjoint — which is, of course, true for the two points of the 2-element space.

@Barak: Non-standard integers are irrelevant when I speak about syntax and use phrases such as “finitely many times”. Firstly, you cannot bring in non-standard integers because I do not need any math to understand what “finitely many” means, and so I can simply ignore any kind of mathematical and logical sophistry. But even if I accept your game and start to worry that perhaps I live in a model with non-standard integers, that is completely irrelevant. If I live in such a model, there is no way for me to know that (because I am in the model and have no way to observe it from a meta-level), nor does the existence of non-standard integers invalidate any of my reasoning. So I think your worries are empty.

So, here is a silly question I wanted to ask a while ago, just because, but now it seems even more appropriate to do so.

There are all these oodles of marvelous things one can prove about identity types before supposing identity types of any particular type to be interesting; I’m pretty sure one can even in principle write down all the elements of the homotopy of spheres, as polymorphic maps between higher identity types… and the marvelous magical thing is that supposing any number of spaces have interesting homotopy doesn’t break any of that.

There are things one can prove about the freely-adjoint-a-point type (call it Plus for now); one can build adjunctions around it and so on, in as much detail and coherence as you have patience for. So, does anything go wrong with supposing a plus-er inductive type

pInductive ghostly := spectre : Plus ghostly.

Then Plus ghostly will seem to have two points: spectre, and the freely-adjoined point; but in the absence of a retraction of the Plus monad unit morphism, I’m sure there are no terms of type ghostly; and I think I can even half-way think of a model interpreting ghostly in a sensible way.

I think what I’m driving at is: beyond, say, syntactic positivity, is there something special about identity types that makes higher inductive types work? What makes them really nifty of course is living on a diagonal, but if we ever want a proof assistant with native higher inductive types, is it useful to restrict their declarations to freely adjoining higher identity terms, or might it be fruitful to consider other layers of structure, too?

Thanks for writing about a topic that has been puzzling me for a while. I think I understand the issue a bit better now, but I still feel uncomfortable about the way you put it.

I understand that “freely generated” does not mean the same thing in HoTT and in usual type theory, hence the different elimination principles for types freely generated from constructors.

However, I find it abusive and confusing that you are defining “inductive” as just “freely generated”. How then do you distinguish “inductive” from “coinductive” after that?

Finally, I am not convinced by your use of non-standard models to refute the usual intuition of inductive types. If the model of Nat contains elements that are not generated from constructors, they should be indistinguishable from the non-exotic natural numbers for the induction principle to hold. So the intuition is not violated “inside” the logic, and it still means something (necessarily more complex) when considering models of the logic. Another way to put it: it seems to me that the argument you applied to refute the definition of “inductive” could be applied in the same way to refute the definition of “freely generated”, by considering an exotic model containing two distinct objects behaving the same wrt. all interpretations.

I’ve always thought nonstandard elements are an indication that you should carefully consider what you’ve taken to be the propositions.

After all, the natural numbers just are those things which makes the induction axiom true, and because you have some freedom in how you can set up your logical connectives, it seems natural to me that what the natural numbers are may vary as well. My favorite example is light set theory, which is set theory with a linear — rather than classical — ambient logic, and so unary and binary representations of the natural numbers are no longer isomorphic! (It also consistently supports naive comprehension, among its other charms.)

@David: first of all, freely generated means only one thing in all contexts, including HoTT (but in HoTT you have to state it carefully because there is proof relevance to worry about): a structure is freely generated when it is obtained by application of a left adjoint to a forgetful functor. This is the most encompassing notion of “free” that I know of. Inductive definitions are just a very special case of left adjoints.

Coinductive definitions are not freely generated. A coinductive type is not obtained by (possibly non-well founded) applications of constructors, even though Coq would have you believe that. They are coalgebras, and so we have to suitably dualize everything. We get destructors, and co-induction, etc., and this is all very clearly different from free generation.
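To make the duality concrete, here is a Python sketch (my own encoding, not from the comment) of a coinductive stream presented by its destructors, head and tail, rather than by constructors:

```python
# a stream is a pair (head, thunk-for-tail); we never "build it up" from
# base cases -- we only observe it through its destructors
def head(s):
    return s[0]

def tail(s):
    return s[1]()

def from_n(n):
    # the stream n, n+1, n+2, ... defined by specifying its observations
    return (n, lambda: from_n(n + 1))
```

The stream from_n(0) is not obtained by any (even infinite) stack of constructor applications; it is determined by how it responds to head and tail.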

About your last remark: yes, of course inside type theory the standard and exotic elements have precisely the same status, namely they are elements. It takes a meta-theoretic observation to distinguish them. But it is a mistake to draw the conclusion “it is safe to imagine inside the logic that all elements are standard”. It is also a mistake to draw the conclusion “it is safe to imagine inside the logic that there are exotic elements”. The logic says nothing about the distinction between the standard and exotic elements, and so no conclusion can be drawn. You need at least a bit of meta-reasoning. For instance, you could try to validate your belief by proving that for a free variable $x : \mathtt{nat}$ the infinite set of formulas $$\lbrace \lnot(x = 0), \lnot(x = S(0)), \lnot (x = S(S(0))), \ldots \rbrace$$ entails a contradiction. But you will fail in doing so, because it is consistent to adjoin a new constant $c$ and a set of axioms $\lnot(c = 0)$, $\lnot(c = S(0))$, $\lnot(c = S(S(0)))$, etc. And that is just a meta-theoretic way of saying that it is safe to imagine exotic elements. The ghosts of exotic models are there to haunt you, whether you like it or not.

You say at some point that I “refuted the definition of inductive” by showing non-standard models. I take it from this that you think the definition of inductive is “the set of those elements which can be obtained by successive (or well-founded) applications of the constructors”. But this is no definition, it is just an intuition disguised as one. It is a faulty definition because it leads to circular arguments about what “well-founded” or “finite” means. Instead, you must define the inductive type either as an initial algebra for some functor, or as a set satisfying a certain induction principle, or something like that. And that is where the exotic elements creep in! But that is a good thing. Not only have you fixed a faulty definition, at the same time you discovered new mathematical structures, learnt something about the limitations of your formalism, and most importantly, broadened your imagination so you are ready to discover new things.

@Neel: I think your argument applies to the usual first-order model theoretic way of doing things, where the structure of natural numbers is studied in isolation. When the natural numbers are characterized by their universal property inside a mathematical universe (a category), they are determined up to unique isomorphism. This reassures us that we pinpointed the correct notion.

For instance, the non-standard model of arithmetic looks “wrong” only if you study it in isolation. As soon as you look at the entire ultrapower model of ZFC, the non-standard natural numbers fit in perfectly.

It is futile to invent reasons against exotic or unwanted models. Logic is telling us that they are unavoidable. That has got to mean something.

@Andrej, thanks for your answer. I now realize one big mistake: I was thinking “injectivity of constructors” for “free generation”. The latter is obviously stronger, and can indeed be taken as a definition of inductive.

Shame on me for this confusion: I was too eager to map what you’re doing to familiar concepts, such as pattern-matching, induction and coinduction, maybe because the equality example in Section 5.8 looks (to me) more like a pattern-matching than an induction principle. (In e.g. Coq, pattern matching and fix are combined to get induction principles, but it seems useful to separate the two concepts when possible.)

Regarding more the philosophical points of exotic models and intuitions versus formal definitions, I don’t see anything useful to add. But thanks for the example of adding an explicit exotic c constant to nat: I find it a bit shocking but I see that in several respects you can still call the extended nat an inductive type.

@David: the $\mathtt{nat}$ with an exotic constant is inductive in the sense that it still satisfies Peano’s induction axiom. But we have to be a bit careful about it, because the axioms we add are not so innocent. There are infinitely many of them, they are negated equations rather than equations, etc. So it gets harder to interpret the theory.

@Jesse: Peter Lumsdaine brought up a similar question last year. More generally, for what functors $G$ can we have an “inductive type” $W$ with constructors that output $G(W)$? We know what conditions to impose on a functor $F$ so that the constructors of $W$ can have input $F(W)$, namely $F$ should be a polynomial functor (syntactically, it should be strictly positive). However, this isn’t sufficient as a condition on $G$, as I think your example shows; at least, there’s no obvious way to make sense out of the definition of “ghostly”.

The appropriate condition on top of polynomiality for $G$ seems to be that $G(1)=1$. This is true for $G(X) = Id_X$, but not for $G(X) = 1+X$. Semantically, the reason this makes sense is that if a polynomial functor satisfies $G(1)=1$, then (at least in well-behaved models) it has a left adjoint $L$, and so a constructor of type $F(W) \to G(W)$ can instead be regarded as a constructor of type $L(F(W)) \to W$, which is easier to construct in models and derive an induction principle for. And indeed, this is how we construct models of higher inductive types: we use the fact that the simplicial path object construction $X^{\Delta^1}$ has a left adjoint, the simplicial cylinder $Y\times \Delta^1$.

@Mike: that’s rather nifty! Was this caught in one of the videos? Was it the subject of one of the videos? I do like adjunctions…

And yet, I don’t yet see whatever difficulty you’re seeing with the thing I’m wondering about. What an inductive $\_ + 1$ type really looks like to me is a quantum or thermodynamical system with two degenerate energy levels: some of the dynamics isn’t visible until the system is excited by something (or because of interference, in case that analogy makes any more sense). But the model I had in mind was just filtered (ordinary) types and filtered maps between them, and where (don’t ask why) $\_ + 1$ is interpreted as shifting the filtration by one.

@Jesse: as @Mike said, we discussed this a bit last year, and it definitely seems like a fun question where there’s a lot to be worked out!

In all the versions of it we considered, though, the condition that the functor preserves the terminal object was essential — or at least, it must be *consistent* that the functor preserves the terminal object, since the new inductive type will allow one to prove that it does.

For instance, in your pInductive Ghostly := spectre : 1 + Ghostly, what elimination principle were you thinking of? If it’s analogous to the familiar cases, and implies that Ghostly is an initial “type X with a point of (1 + X)”, then this is inconsistent. Taking any type Y, equipped with the adjoined point inl tt of 1+Y, we get a map f : Ghostly -> Y, such that (1+f) spectre = inl tt; this implies that spectre = inl tt. Conversely, taking any inhabited type y0 : Y, we get f such that (1+f) spectre = inr y0; but this implies that spectre ≠ inl tt.

In general, this sort of problem arises whenever the free element gives us “new information” — in this case, the information of whether spectre lies in the left or right half of the coproduct. More precisely, if we have a type X with a free element x0 of G(X) in the above sense, then an argument like the above implies that G(1) = 1: given any element of G(1), there’s a map f : X -> 1 such that G(f) sends x0 to that element; but there’s only one possible such map, so all elements of G(1) must be equal.

By the way, @Andrej — may I disagree with you on the lead statement of this post? It seems to me analogous to saying that countable models of set theory debunk the fact “$\mathbb{R}$ is uncountable”.

Exotic models, and the surprising interpretations of inductive types in them, are certainly important for understanding type theory; but what they show is surely just the non-absoluteness of the notions of inductive type, element, and reachability. The external “all elements” is not the internal one; the $\mathbb{N}$ of the model is not the $\mathbb{N}$ of the meta-theory. The only way to read “all elements of $\mathbb{N}$ are reachable by constructor application” as false is to interpret it half-in and half-out of the model — “all elements [in some external sense] of $\mathbb{N}$ [in the model’s sense] are reachable [in the external sense]” — which seems rather cruel and usual treatment for a poor proposition!

(I don’t disagree though with the slightly different points made in the linked section of the Book, about the importance of recognising that e.g. identity types are not freely generated types, but freely generated families.)

@Mike, @Jesse: I was talking with @Peter at the Conference on Type Theory, Homotopy Theory and Univalent Foundations, and he pointed me at the comments on this post. I have a proposal for “consider[ing] other layers of structure” (which I’d been mentally calling “generalized higher inductive types”), though I don’t have anything to say about polynomial functors or adjoints or free structures. The basic idea that I have is that you are allowed to add constructors of any type, but, unless you end in the type you are writing a constructor for, you must specify an inhabitant of the type family you wish to construct, with any explicitly mentioned constructors generalized over. I will give some examples, because I don’t think that was a very good description.

(The _ will get filled in with zero.) The rule that I have in mind for the constructors which get exhibited is that you generalize any constructors which appear as indices (but not those which appear as parameters) in the type (so, in this case, one), and then, in some sense, existentially quantify them (that is, you give an inhabitant of some type which results from specializing that generalized type family, so, in this case, I need an inhabitant of @eq Interval zero x for some x). I have also considered universally quantifying the parameters, in addition to existentially quantifying the indices, so that you’d have to give an inhabitant of the type forall y, exists x, @eq Interval y x. But then I realized that I could probably be more general.

Note that this does not allow you to build an inhabitant of, e.g., False, because you’d have to exhibit an inhabitant of False to get started.

I haven’t (yet) tried to write down a general explicit recursor, though the idea for the computation rule of the recursor is that when you try to follow a match on one of the generalized/higher constructors, you pick the branch corresponding to the inhabitant you exhibited in the definition, and fill in the free variables with the ones you chose for that inhabitant. I suspect that by imposing the appropriate positivity/non-circularity constraints on the inhabitants you’re permitted to specify, you can avoid problems. (But, again, I haven’t tried to write this all out explicitly yet, so I’m not sure that you don’t run into problems somewhere.)

@Peter: You definitely have a point there, the only way to make “all elements of $\mathbb{N}$ are reachable by constructor application” false is to mix external and internal notions. But I think what happens is this: people only know about models in which this difference does not matter, and are therefore not aware of it. Consequently, they think all models have this property.

For instance, a typical hard-core type theorist, as far as I can tell, lives in some sort of a syntactic model. Can he tell the difference between internal and external? Sometimes, e.g., when thinking about functions $\mathbb{N} \to \mathbb{N}$, he is perfectly aware that the ones he can define in type theory are not exactly all of them (not even all the computable ones). But when he thinks about $\mathbb{N}$ he knows that every (global!) point of $\mathbb{N}$ is denoted by some term of type theory. And this intuition carries over to inductive types, and then falsely to inductive types in all models. The net result is that the identity type looks weird.

Which brings me to the following dilemma. Suppose I want to explain the difference between internal and external to a person who cannot yet tell the difference, and he has never seen exotic models. Shouldn’t I then show the exotic model and “mix the internal and external” viewpoints to first show him the sins he has committed? How would you explain it?

Screw set theory. I live in the physical universe where when you run a Turing machine, and keep watching forever in the physical universe, you never experience a time where that Turing machine outputs a proof of the inconsistency of Peano Arithmetic.

I tried to explain my thoughts on why this was such an example here, but I’m not sure if I succeeded, and I’m not sure if I’m right.

The Löwenheim–Skolem theorems, and the existence of non-standard models of arithmetic, are both rather tied to first-order logic. If you present PA in a higher-order logic, you do get a much stronger picture of the natural numbers (one that doesn’t allow for elements other than zero and its successors). (There’s some discussion in Chapter 18 of the 3rd edition of “Computability and Logic” by Boolos and Jeffrey, for example.)