Computing the Uncomputable

Let me try to explain it without assuming you’re an expert on mathematical logic. That may be hard, but I’ll give it a try. We need to start with some background.

First, you need to know that there are many different ‘models’ of arithmetic. If you write down the usual axioms for the natural numbers, the Peano axioms (or ‘PA’ for short), you can then look around for different structures that obey these axioms. These are called ‘models’ of PA.

One of them is what you think the natural numbers are. For you, the natural numbers are just 0, 1, 2, 3, …, with the usual way of adding and multiplying them. This is usually called the ‘standard model’ of PA. The numbers 0, 1, 2, 3, … are called the ‘standard’ natural numbers.

But there are also nonstandard models of arithmetic. These models contain extra numbers beside the standard ones! These are called ‘nonstandard’ natural numbers.

This takes a while to get used to. There are several layers of understanding to pass through.

For starters, you should think of these extra ‘nonstandard’ natural numbers as bigger than all the standard ones. So, imagine a whole bunch of extra numbers tacked on after the standard natural numbers, with the operations of addition and multiplication cleverly defined in such a way that all the usual axioms still hold.

You can’t just tack on finitely many extra numbers and get this to work. But there can be countably many, or uncountably many. There are infinitely many different ways to do this. They are all rather hard to describe.

To get a handle on them, it helps to realize this. Suppose you have a statement S in arithmetic that is neither provable nor disprovable from PA. Then S will hold in some models of arithmetic, while its negation not(S) will hold in some other models.

For example, the Gödel sentence G says “this sentence is not provable in PA”. If Peano arithmetic is consistent, neither G nor not(G) is provable in PA. So G holds in some models, while not(G) holds in others.

Thus, you can intuitively think of different models as “possible worlds”. If you have an undecidable statement, meaning one that you can’t prove or disprove in PA, then it holds in some worlds, while its negation holds in other worlds.

In the case of the Gödel sentence G, most mathematicians think G is “true”. Why the quotes? Truth is a slippery concept in logic—there’s no precise definition of what it means for a sentence in arithmetic to be “true”. All we can precisely define is:

1) whether or not a sentence is provable from some axioms

and

2) whether or not a sentence holds in some model.

Nonetheless, mathematicians are human, so they have beliefs about what’s true. Many mathematicians believe that G is true: indeed, in popular accounts one often hears that G is “true but unprovable in Peano arithmetic”. So, these mathematicians are inclined to say that any model where G doesn’t hold is nonstandard.

The result

Anyway, what is Joel David Hamkins’ result? It’s this:

There is a Turing machine T with the following property. For any function from the natural numbers to the natural numbers, there is a model of PA such that in this model, if we give T any standard natural as input, it halts and outputs

So, take to be your favorite uncomputable function. Then there’s a model of arithmetic such that in this model, the Turing machine computes at least when you feed the machine standard numbers as inputs.

So, very very roughly, there’s a possible world in which your uncomputable function becomes computable!

But you have to be very careful about how you interpret this result.

The trick

What’s the trick? The proof is beautiful, but it would take real work to improve on Hamkins’ blog article, so please read that. I’ll just say that he makes extensive use of Rosser sentences, which say:

“For any proof of this sentence in theory T, there is a smaller proof of the negation of this sentence.”

Rosser sentences are already mind-blowing, but Hamkins uses an infinite sequence of such sentences and their negations, chosen in a way that depends on the function to cleverly craft a model of arithmetic in which the Turing machine T computes this function on standard inputs.

But what’s really going on? How can using a nonstandard model make an uncomputable function become computable for standard natural numbers? Shouldn’t nonstandard models agree with the standard one on this issue? After all, the only difference is that they have extra nonstandard numbers tacked on after all the standard ones! How can that make a Turing machine succeed in computing on standard natural numbers?

I’m not 100% sure, but I think I know the answer. I hope some logicians will correct me if I’m wrong.

You have to read the result rather carefully:

There is a Turing machine T with the following property. For any function from the natural numbers to the natural numbers, there is a model of PA such that in this model, if we give T any standard natural as input, it halts and computes

When we say the Turing machine halts, we mean it halts after steps for some natural number But this may not be a standard natural number! It’s a natural number in the model we’re talking about.

So, the Turing machine halts… but perhaps only after a nonstandard number of steps.

In short: you can compute the uncomputable, but only if you’re willing to wait long enough. You may need to wait a nonstandard amount of time.

It’s like that old Navy saying:

But the trick becomes more evident if you notice that one single Turing machine T computes different functions from the natural numbers to the natural numbers… in different models. That’s even weirder than computing an uncomputable function.

The only way to build a machine that computes in one model and in another to build a machine that doesn’t halt in a standard amount of time in either model. It only halts after a nonstandard amount of time. In one model, it halts and outputs In another, it halts and outputs

A scary possibility

To dig a bit deeper—and this is where it gets a bit scary—we have to admit that the standard model is a somewhat elusive thing. I certainly didn’t define it when I said this:

For you, the natural numbers are just 0, 1, 2, 3, …, with the usual way of adding and multiplying them. This is usually called the standard model of PA. The numbers 0, 1, 2, 3, … are called the ‘standard’ natural numbers.

The point is that “0, 1, 2, 3, …” here is vague. It makes sense if you already know what the standard natural numbers are. But if you don’t already know, those three dots aren’t going to tell you!

You might say the standard natural numbers are those of the form 1 + ··· + 1, where we add 1 to itself some finite number of times. But what does ‘finite number’ mean here? It means a standard natural number! So this is circular.

So, conceivably, the concept of ‘standard’ natural number, and the concept of ‘standard’ model of PA, are more subjective than most mathematicians think. Perhaps some of my ‘standard’ natural numbers are nonstandard for you!

I think most mathematicians would reject this possibility… but not all. Edward Nelson tackled it head-on in his marvelous book Internal Set Theory. He writes:

Perhaps it is fair to say that “finite” does not mean what we have always thought it to mean. What have we always thought it to mean? I used to think that I knew what I had always thought it to mean, but I no longer think so.

If we go down this road, Hamkins’ result takes on a different significance. It says that any subjectivity in the notion of ‘natural number’ may also infect what it means for a Turing machine to halt, and what function a Turing machine computes when it does halt.

Related

This entry was posted on Saturday, April 2nd, 2016 at 9:54 pm and is filed under mathematics. You can follow any responses to this entry through the RSS 2.0 feed.
You can leave a response, or trackback from your own site.

Post navigation

91 Responses to Computing the Uncomputable

Great post! I think you’ve explained everything very well. The way the algorithm works is that on input n, it searches for a certain kind of object, but it turns out that the existence of that object is not provable in PA. Some models of PA have it and others don’t, and for the ones that have it, it has a nonstandard size. Inside such a world, the algorithm will halt, but only after a very long time, so that the length of the computation has nonstandard size, even though n was standard.

I’m sure you’re right, since you’ve got more knowledge of logic in your left pinkie than I have in my brain. And of course the link I just provided was not for you, but for people who don’t know this theorem.

Nonetheless I think it’s worth warning beginners that “truth” is a touchy business for statements in theories with many elementarily inequivalent models. I suspect that whatever definition of truth you’re talking about, it requires supernatural powers to actually use it to determine if statements in arithmetic are true.

I was just curious what kind of “truth” ( I mean definition) You are thinking about, and summarizing, I can list the following I’ve heard of:

Tarski’s definition of truth based on metalanguage. In fact it may mean formally defined truth relative to meta-theory ( in case PA it is usually Set Theory). So something is true ( in PA) if it is provable in Set Theory.
true may mean – provable in all of the models of PA ( so a lot of sentences is “non-true” which mean completely different thing than “not-true” or false. Among them are sentences which are provable in Set theory, but not in PA. It is far from intuition…

Or true in informal language we use in everyday life if one stand on philosophical position natural language is least kind of language in language hierarchy. In that case we have no tools to check or decide if something really is true in any other meaning than by finding examples or counter-examples for various statements. Of course in this case we are far from deductive reasoning.

There are others theories of truth, not compatible, but at least different than Tarski Truth Theory. In fact there is a lot of them. They are built because Tarski’s approach, useful for ordinary mathematical practice, is far from satisfying in various situations (*** see below). I cannot say what is it relation to PA truth, but probably there is some.

*** First of all Tarski approach eliminates a lot of circular ( self-referencing) sentences. It is a price of eliminating some known paradoxes, but it costs a lot: we through away a lot of pretty good statements. Take for example sentence: “This sentence is true”. It does not harm, but in Tarski approach it is not valid logical sentence!

Another example of issues is related to language hierarchy conception. For example the Tarski language-metalanguage hierarchy has some problems. Take for example the following statement. Let N be a level of (meta)-language. That is: formal language of the theory (say PA) is on level N=0, metalanguage of this theory (say set Theory) on level N=1, (meta-meta)-language on level N=2 etc. Consider the following sentence: “for every N of (meta)-language, this sentence is not provable”. The question is: what is the level of (meta)-language we use here? it looks like or we are creating another hierarchy, or this sentence has no logical meaning. But if so: what is the hierarchy level of Tarski Theory?

In the second one (amplified in the third), I describe a specific interesting finite set, and claim that it cannot be counted correctly in ZFC (defined carefully). The number 12 is the best I was able to do in getting that numerical parameter down to earth.

I just posted on Hamkins; blog a rather quick way to get the result using a very well known and classical previous result. The proof Hamkins gives is the same idea used in proofs of the previous classical result.

Hamkins’ blog does suggest the idea of giving an interesting finite set like I do above, and showing that not only can’t it be computed, it’s value in models of ZFC can be arbitrary (perhaps within a big bound). The deep issue is to do things like this preserving crystal clear naturalness.

Such a challenge could easily be stupid, but they phrased it in a way that made it rather interesting:

The goal is to come up with a computable, finite, well-defined number that’s bigger than the last poster’s number. You can only use already established numbers and notations (the xkcd number counts, as does Graham’s number. But you can’t just one-up that by saying g_65 without defining the recursion here), and you can’t refer directly to the previous number. (If I post a description D of some number x, your number can’t be defined as D “+ 1”. If it turns out to be D+1, fine. Just don’t do anything like quoting the previous entry and putting +1 at the bottom.)

I’ll start us off with nine thousand. Because we all know that anything over nine thousand is fucking unimaginably huge.

Edit: Also, if someone questions the size of your number, you have to be able to prove that it’s really bigger than the previous one.

So far there have been 1476 comments in the ensuing discussion. Needless to say, lots of commenters posted silly answers or broke the rules, and fights broke out. But gradually the numbers increased, and the contest became more interesting.

I haven’t read this whole thread, but I know about the fast-growing hierarchy so I’m not going to present anything naive. My actually-computable finite number – albeit nobody knows its actual structure – is as follows:

Let T be the first-order theory of Zermelo-Fraenkel set theory plus the Axiom of Choice plus the axiom that there exists an I0 rank-into-rank cardinal. This is the most powerful known large cardinal axiom, AFAIK. I think that one such cardinal implies the existence of an infinite number of them, but if not, consider that condition added.

Starting with P = 10:

Search through all Turing machines with at most P states.

Search through all proofs in T of length at most 2^^P that the program halts.

Run all programs such that a halting proof exists, until they halt.

Add up all the running times of all those programs.

Set P to this sum.

Repeat 10 times.

The end result should be roughly equal to the fast-growing hierarchy at the proof-theoretic ordinal of T, plus 1, applied to 10. Since nobody can constructively characterize the proof-theoretic ordinal for second-order arithmetic, let alone ZFC, let alone ZFC plus large cardinal axioms, this computer program should compute a larger number than the fast-growing hierarchy for any ordinal for which we could directly write a computable ordering. Despite that, so long as you can formalize ZFC in first-order logic (already done) and formalize the rank-into-rank cardinal axiom (I see no reason why that shouldn’t be doable) you could actually write the above computer program and run it, given unbounded finite computing power (with no need for any halting oracles).

Can anyone who’s been keeping track of this whole thread tell me whether I won?

Can you beat Eliezer Yudkowsky? I think you should be able to win this kind of contest. I already wrote about your work on large numbers here:

BTW, apparently someone upthread of my quoted contribution had already suggested searching for proofs of haltingness in ZFC+IO, though they didn’t iterate the function, but still, they should probably be considered to have the moral victory w/r/t that thread. (You can always claim the technical victory by adding 1 to a number.)

It’s probably also useful to note that the existence of such a number depends on the ω-consistency (in the sense of Gödel) rather than just simply consistency, which may allow proving that a Turing machine halts while allowing you to run such a machine indefinitely.

Most people would agree that believing Con(ZFC+IO) is tantamount to believing ω-Con(ZFC+IO), but then again, if you believe that, then you might as well believe ω-Con(ZFC+IO+ω-Con(ZFC+IO)) and so on iterated along some provably well-founded ordinal in ZFC+IO….

It’s damned scary when I post an article trying to popularize a result in logic and the first three commenters are Joel Hamkins, Richard Solovay and Harvey Friedman. The discussion can only go downhill from here.

I can’t help but take up the challenge. My set theory is worse than rusty, but if I remember correctly there is a notion of a finite set in ZFC. And I have a vague (false?) memory that PA is much weaker than ZFC. So how do nonstandard models of PA show up in ZFC?

I think I said what I wanted to say about that, primarily in the last section of this blog article, but also here:

One of [these models of the natural numbers] is what you think the natural numbers are. For you, the natural numbers are just 0, 1, 2, 3, …, with the usual way of adding and multiplying them. This is usually called the ‘standard model’ of PA.

This is essentially the same as the explanation on Wikipedia. If you don’t like it, welcome to the club!

I’m no logician, but I think that the Gödel sentence is true in any model of PA, provided PA is a consistent set of axioms. It can be stated as G = “G is not provable with PA”. Its negation is “G is provable with PA”. If the negation was true, both G and its negation would be provable with PA. So if PA is consistent, G is true, independently of any choice of model.

We know by the completeness theorem that there is a model M (presumably non-standard) in which G is false, i.e. “G is provable within PA” holds in M. However, what this really means is “M contains a ‘proof’ of G”, i.e. there is a non-standard natural number which codes up a proof of G when seen internally in M from the point of view of the provability predicate. This does not mean that there is a (standard) proof of G.

Thus there is no contradiction. G is not provable (standardly) but “There exists P such that P is a proof of G” is true in M because there are a lot of strange candidates for P that are not meaningful proofs.

Thanks, I was still very confused after John’s answer, but this does help clarify. So if I understand you well, in the model M, the sentence G loses its intuitive meaning, because the provability predicate it contains does not characterizes what is a proof in M.

Tom E and I said the same thing, but in different ways, so comparing them may be helpful.

So if I understand you well, in the model M, the sentence G loses its intuitive meaning, because the provability predicate it contains does not characterizes what is a proof in M.

I’m not sure what you mean by “a proof in M” – people don’t usually talk about “a proof in a model”. This is the only thing I’m worried about in your statement.

On the other hand, I completely agree that in the model M, the sentence G loses its intuitive meaning. That’s the key to this business.

G is a statement that’s all about natural numbers. It’s supposed to encode the English language statement “I am unprovable”. But what it actually says is a bit more like this:

“There is no natural number n that is the number of a proof of this statement”.

(The idea, remember, is that Gödel figured out a way to assign numbers to proofs.)

However, the model M contains nonstandard natural numbers as well as the usual ones. In the model M, one of these nonstandard numbers is the number of a proof of G. So, G does not hold in the model M.

So, G doesn’t hold in M, but that’s because M has nonstandard numbers. We can loosely say that there’s a nonstandard number which is the number of an “infinitely long proof” of G. Perhaps that’s what you meant by a “proof in M”. If so, that’s okay. However, what M really contains is not weird proofs, but weird numbers.

Thanks for the further clarification! I phrased it poorly indeed. What I meant is that the notion of proof provided by the predicate in M is not anymore what Tom would call “meaningful” or “standard”. I assume that “meaningful” and “standard” proofs are the ones associated to standard natural numbers and that only those can make a proposition true, even in M.

When you say “In the model M, one of these nonstandard numbers is the number of a proof of G.”, obviously this “proof” cannot make G true, else the model would be inconsistent.

I’m afraid you keep using words in ways that make me nervous. For example:

I assume that “meaningful” and “standard” proofs are the ones associated to standard natural numbers and that only those can make a proposition true […]

What a proof does is make a proposition provable, not true.

only those can make a proposition true, even in M

The usual thing is to say a proposition “holds” or (in more formal writing) “is satisfied” in M. And it’s not proofs that make statements satisfied in a model. Very very roughly, to see if a statement is satisfied in a model, you go into that model, look around, and see if the statement is true in there. But that’s obviously not the real definition.

else the model would be inconsistent

It’s not models that are inconsistent: it’s theories. A theory T is inconsistent if you can prove both P and not(P) in that theory.

Sorry to be chopping your words apart like this, but logic is a careful subject.

A good start would be to avoid using the word “true”, because “truth” is such a slippery concept — see Tarski’s theorem on the undefinability of truth.

What matters more, usually, are two more precise concepts. One is “provable in theory T” and one is “satisfied in a model M”. Gödel figured out how these concepts are connected:

A statement is provable in a theory T if and only if it is satisfied in _every_ model of T.

The sentence G is not provable in Peano arithmetic, nor is its negation. Thus, G is satisfied in some models of Peano arithmetic, but not in others. We were just examining a model in which it’s not satisfied.

A lot of mathematicians like to talk about the “standard model” of Peano arithmetic. They say a statement holds in the “standard model” if and only if it’s “true”. That’s okay only as long as you understand how little this actually means, thanks to the undefinability of truth. If you’re religious you might say truth is what God knows… but that doesn’t help us much. So we can’t tell which statements are true in the “standard model”.

If there’s a “standard” model then where does it live? Is that question somehow less problematic than the question of where the “real” natural numbers might live? Mathematically, any model of a theory must be constructed within some other theory, and certainly within any given model M of (say) ZFC there is a model of the natural numbers that’s “standard” relative to M, but that just pushes the question back.

More importantly, even if there were a “standard” model, how could we know anything about it? That’s what I see as the relevance of Gödel’s theorem: no description of “the natural numbers” can be complete, so even if there were some unique standard notion of “the natural numbers”, we wouldn’t have any way of distinguishing it from lots of nonstandard ones — and hence there wouldn’t really be any meaning to calling it “standard”.

Thanks for taking the time to chop my words apart, that’s helpful for clarifying my ideas. I now realize that part of my confusion came from the fact that the existence of the “proof” associated to a non-standard natural number in M does not make the proposition G provable, as provability is independent of any model.

In my last post I should have said provable instead of true in both instances.

What is kosher to say is that a proposition P in some language L (or theory T) is true or not in a structure of L (or model of T). Or, one says (synonymously) that a structure M of L satisfies P (or not). If you have a first-order theory T, then Goedel’s completeness theorem roughly says that a proposition P is provable in T if it is true in every model of T.

Anyway, the moral here is that “truth” or satisfaction refers to models; each model gives a particular way of assigning truth values to propositions. Models provide the semantics for a theory. Whereas “proofs” refer to deductions in a theory T: certain strings of symbols formed by applying rules of inference, starting from axioms of T. Such proofs are syntactic in nature.

What is kosher to say is that a proposition P in some language L (or theory T) is true or not in a structure of L (or model of T). Or, one says (synonymously) that a structure M of L satisfies P (or not). If you have a first-order theory T, then Goedel’s completeness theorem roughly says that a proposition P is provable in T if it is true in every model of T.

Of course this is fine (Todd is another guy who knows more logic than me), but personally I prefer to avoid saying “true in a model”, especially when talking to beginners. I prefer to say “satisfied in a model”, to emphasize that this is a technical concept with a technical definition.

The reason is that “true” has rather grand metaphysical connotations. When I was starting in logic I was really caught up in the quest for certainty, eager to know how or whether we can determine what’s “really true”. I think a lot of people go into logic for this reason. So, they need a dash of cold water now and then. Strictly separating the syntactic notion of “proof” from the semantic notion of “satisfaction”, and not granting either of them the title of “truth”, is one way to do this.

Unfortunately a clear definition of “P is satisfied in M”, or “M satisfies P”, seems hard to find on Wikipedia! I think it used to be better. Right now the section satisfiability in model theory says says merely:

In model theory, an atomic formula is satisfiable if there is a collection of elements of a structure that render the formula true.[4]

where [4] is a book on model theory. This would be fine for a hand-wavy introduction, but it’s not really a definition. I think somewhere else on Wikipedia you can still find the inductive definition of what satisfiability actually means.

Indeed, I’m finding all the Wikipedia articles on this stuff to be quite a jumbled mish-mash now. I don’t think it was always this way.

I just ran across this quote from Rota: “The fake philosophical terminology of mathematical logic has misled philosophers into believing that mathematical logic deals with the truth in the philosophical sense. But this is a mistake. Mathematical logic deals not with the truth, but with the game of truth.”

I do know what natural numbers are, and all of you know it as well. Obviously it is good and useful thing to have natural numbers axiomatized. But if some set of axioms you refer to as PA “admits nonstandard models”, then it’s not a definition of natural numbers, but merely a failed attempt to define them.

PA can be very interesting thing in itself, well worth investigating, but it is not a definition of natural numbers. I understand that there is some kind of intrinsic difficulty in giving proper definition. For example you can’t just say “for each natural number n there exists only finite number of natural numbers less then n”, since meaning of word “finite” depends on definition of natural numbers. But really, there is no way to do it properly?

You could define finite sets in ZFC without ever referring to natural numbers and define the set of natural numbers to be the (really a) skeleton of the category of finite sets. Presumably there are models of PA that can be constructed within ZFC whose elements would have bigger cardinality than objects of the category of finite sets and you could declare all of them to be “wrong.” I will let the experts correct me.

You could define finite sets in ZFC without ever referring to natural numbers and define the set of natural numbers to be the (really a) skeleton of the category of finite sets.

You could try, but this would be quite inconvenient because the class of all finite sets is not a set, and ZFC only talks about sets. In other words, ZFC has certain procedures for constructing new sets from old, and “the set of all finite sets” is not something you can construct using these procedures.

In ZFC we usually define an ‘ordinal’ to be a set with certain special properties, define a “finite set” to be one that doesn’t admit a bijection with a proper subset, and define a “natural number” to be a finite ordinal. For example:

and so on. The set of natural numbers is the smallest infinite ordinal. This trick is due to von Neumann.

In their Principia Mathematica, I believe Russell and Whitehead took essentially the approach you mention, e.g. defining 2 to be the class of all 2-element sets. Russell, of course, was the one who noticed the paradoxes that arise if you discuss the “set of all sets”. So, he knew you couldn’t talk about the “set of all 2-element sets”.

To get around such problems, they used a “hierarchy of types” which allows one to talk about “sets of the first type”, “the set of all sets of the first type” (which is a set of the second type), “the set of all sets of the second type”, and so on.

This is not the most efficient approach to mathematics, so it’s rarely used today. On page 379 of Volume I of the Principia, Russell and Whitehead wrote:

From this proposition it will follow, when arithmetical addition has been defined, that 1+1=2.

The proof is actually completed on page 86 of Volume II (first edition). The Principia is widely (and unfairly) mocked for taking hundreds of pages to prove 1+1=2.

We could copy Russell and Whitehead’s approach more efficiently today using ZFC together with the axiom of universes, where the “set of all sets in the first universe” lives in the second universe, etc. But most people working on set-theoretic foundations prefer to define natural numbers as finite ordinals. It’s quicker, and induction over ordinals is fundamental in set theory, so it’s very nice that the ordinary principle of mathematical induction is a special case.

I’m no logician, but I think that the Gödel sentence is true in any model of PA, provided PA is a consistent set of axioms. It can be stated as G = “G is not provable with PA”. Its negation is “G is provable with PA”. If the negation was true, both G and its negation would be provable with PA. So if PA is consistent, G is true, independently of any choice of model.

Your argument is a fine informal argument that G is “true” if PA is consistent.

But this is different than whether G holds in all models of PA. A powerful theorem of Gödel shows that a sentence holds in all models of a theory in first-order logic if and only if it’s provable in that theory. Since G is not provable in PA, there are models of PA in which it is doesn’t hold.

What are these models like? They are nonstandard models in which G has a proof that takes a nonstandard number of steps!

Remember, to say “G is not provable in PA” in the language of arithmetic, we need to concoct a predicate P(n) that means “n is the Gödel number of a proof of G in PA”, and then say “there does not exist n such that P(n)”.

But in a model of arithmetic, n ranges over all natural numbers in this model, even nonstandard ones!

The existence of these weird nonstandard numbers n that make P(n) hold will not shake your conviction that G is true. But they mean it doesn’t hold in all models of PA.

John wrote “The only way to build a machine that computes n+1 in one model and n+2 in another to build a machine that doesn’t halt in a standard amount of time in either model. It only halts after a nonstandard amount of time. In one model, it halts and outputs n+1. In another, it halts and outputs n+2.”

This is the bit that’s shocking to me. There’s a long history of considering Turing computations which are allowed “transfinite” steps/storage, but they look like a single extension of the capability and so still feel relatively intuitive. The idea that there are choices of non-standard versions, and hence different extensions of behaviour, doesn’t sit with my expectations.

Good! I’m glad you’re shocked. I was disappointed when I showed this result to my former student Mike Stay and he just laughed and said it made sense.

But there’s a reason: Mike and his other advisor Cristian Calude worked a lot on Chaitin’s constant Ω which—very roughly speaking—is the probability that a Turing machine will halt.

There’s no algorithm for computing all the digits of this number, and indeed most of its digits depend on which model of arithmetic you’re in, since whether an individual Turing machine halts is, in general, a question whose answer cannot be settled by a proof in Peano arithmetic—and thus, by Gödel’s completeness theorem, this question has different answers in different models of Peano arithmetic!

So, Mike was not surprised.

By the way, Calude has managed to compute first 64 binary digits of Ω:

In many subjects people often try to define “truth” by a correspondence theory, i.e. a sentence is true if matches the external reality. This theory is full of problems, but in mathematics we have the additional problem that there’s no obvious external reality that the sentences should match: we can do physical experiments to test some mathematical statements, but not most. So, while mathematicians informally speak of “truth” just as other scientists, when they get careful they speak of “provability” (where a statement can be derived from the axioms of some theory using some deduction rules) or “validity” (where a statement is satisfied in all models of some theory). Gödel’s completeness theorem says that in first-order logic, provability is equivalent to validity. This is very nice.

Right, mathematical truth is a sort of bounded truth where the boundary is defined by a particular mathematical model which must be declared alongside any “true” statement. Of course, the fact that we use “true” and “truth” across models and in everyday conversation, means that there is a more general concept to which those terms are attached. The real hard question is whether cross-model-truth, or universal truth, is real or an illusion. And this question is self-referring as “real” and “truth” seem to be related concepts.

The completeness theorem for “Tarskian truth” is very nice, but there are some obvious objections to interpreting it as the adequacy of first-order logic with respect to Truth with a capital T. Indeed, the interpretation of each quantifier is just the identical “meta” quantifier, e.g. is “and”, is “or” and so on.

This seems to beg the question to a certain extent, since we are only observing that the rules which capture our informal definition of truth indeed are able to distinguish between true and false statements. The French logician Girard has (perhaps overly) strongly objected to this approach to meaning in logic on these grounds, preferring instead to go the structuralist route and try to understand the combinatorics of provable statements. Wikipedia cites Michael Dummett as a source of dissent wrt to the Tarskian semantics of truth.

It’s perhaps then surprising that the completeness theorem is not a complete tautology. Indeed a form of the Axiom of Choice is required to prove it! This requirement disappears if we require the “truth tables” of our formulas to take values in arbitrary boolean algebras instead of the algebra {0,1}. Somehow, performing this step makes the truth or falsity of a statement “definite”, as a true statement can now only be assigned to 1 and a false statement to 0, creating a nice binary distinction we usually associate with mathematical truth.

I am thinking that if there is a generalization of function in the real number, with the usual function composition (like relation between reals instead of binaries), then the Godel theorem can give (in this case) the same results: the values of some functions cannot be given.
If it is all true, then each graph (with nodes and directed edges) of functions (like neural networks) can be non consistent if it is large enough, because of some functions have not a single output.
But what is the extension that give a computable function? It is sufficient to choose an output result between the possible outputs.

John Baez – I love it when you try to explain new advances and subtleties in math and physics to lay folk! Perhaps my main difficulty in understanding this discussion is the distinction between “proof” in a theory and “satisfaction” in a model. Is it roughly correct to say that models are constructions within a larger theory; for instance, the von Neumann construction of the PA theory by iteratively nesting sets (starting with “{ }”) is a model within ZFC set theory, no? If so, then isn’t saying that a statement P holds (is satisfied) in that model of PA equivalent to saying that ZFC contains a proof of P with the model as hypotheses? Of course, I guess that if we define the natural numbers as a model within a larger and more complicated theory then we have just begged the question.

Perhaps my main difficulty in understanding this discussion is the distinction between “proof” in a theory and “satisfaction” in a model.

Great question! This is fundamental to logic, yet I didn’t explain it, and I find that currently the explanation on Wikipedia is terrible.

A “proof of a sentence S in a theory T” is rather easy to understand if you’ve done some very careful proofs or can at least imagine them. Roughly, a sentence is a string of symbols that’s “well-formed” according to some specific list of rules, like

but not

A theory is a set of sentences called axioms. A proof of the sentence S in the theory T is a list of sentences where each sentence is either an axiom or is obtained from previous sentences using certain rules (called deduction rules), and the last sentence is S.

The harder thing to understand is “satisfaction of a sentence S in a model M of the theory T”.

Is it roughly correct to say that models are constructions within a larger theory; for instance, the von Neumann construction of the PA theory by iteratively nesting sets (starting with “{ }”) is a model within ZFC set theory, no?

Yes. Spelling it out in utter precision is a bit lengthy (which is why I didn’t do it). But you seem to have the rough idea of what a model is. You’re doing a good job of rapidly sketching “the standard model of Peano arithmetic in ZFC set theory”. To be more complete, you’d also need to explain how the arithmetic operations (e.g. addition and multiplication) are defined.

If so, then isn’t saying that a statement P holds (is satisfied) in that model of PA equivalent to saying that ZFC contains a proof of P with the model as hypotheses?

Hmm, this is an interesting twist. You seem to be trying to reduce “satisfaction” to “provability”. That’s not how people usually think of it! Usually people try to keep satisfaction and provability quite distinct when defining them, and later discuss ways in which the two concepts are related.

You’ve given me a lot to think about here, but let me start by saying what’s a bit unusual about your approach. You are thinking of a model of PA as something defined within set theory, and indeed that’s true. But you are being very careful to specify exactly which axioms of set theory we’re going to use, and people often don’t do that. In other words, you’re trying to be very formal not only about the original theory T (in this case PA) but also the “metatheory” T’ (in this case ZFC) that you are using to build models of T. People often start out being more relaxed and informal about the metatheory, and simply ‘set theory’ instead of ZFC. And then, instead of worrying about whether statements are “provable in the metatheory”, as you’re doing, they simply talk about whether statements are true.

This is getting a bit tricky, so let me tell you how I’d explain models and satisfaction if you hadn’t distracted me.

Being pretty informal and sketchy, I might say that a model M of PA consists of a set together with elements and functions

and

that obey all the axioms of PA.

Note the difference: I’m just saying the axioms are true in the model M, not provably true.

And then, I’d say that a sentence S is satisfied in this model M if, when we interpret S a statement about it comes out being true.

(I hope it’s obvious how we interpret a sentence like as a claim about the model. This takes some work to explain carefully, but it’s sort of boring.)

Notice, when talking about sentences in PA I’m being careful to avoid talking about whether they’re ‘true’, and I’m carefully distinguishing between ‘provability’ and ‘satisfaction’. But when I start talking about a model, which is a bunch of sets and functions, I slack off and freely talk about the ‘truth’ of statements. In other words, at that stage I act like a normal mathematician, not a nitpicky logician.

This may seem odd. To avoid it, we would need to choose some axioms of set theory, like ZFC, and then either talk about provability in ZFC, or satisfaction in some model of ZFC. In the latter case, we’ve just pushed the problem back one notch.

You’re suggesting the former case. And this is an alternative worth studying, and people actually do study it. But you have to be a bit careful. While normal matheamticians act like every statement is either ‘true’ or ‘false’ (we may not know which), not every sentence in ZFC is provable or disprovable. So, if we take your nitpicky approach, we’d have to say that for some sentences S, neither S nor its negation are satisfied in the model M. And that’s not something most mathematicians want to say. They’d prefer to say “either S or its negation is satisfied in M, but we can’t decide which using ZFC”.

I guess you could say that most mathematicians like to use the notion of “truth”, and the principle of excluded middle, in their informal reasoning. They then reason informally about formal systems: that’s the idea of “metamathematics”.

John Baez says “They then reason informally about formal systems: that’s the idea of ‘metamathematics’.” So, mathematicians (and logicians?) have a strong sense that they can “know” things informally (such as the truth of G) that cannot be proved in a formal system. That is, our brains are somehow more than just wet, squishy Turing machines (as argued by Penrose). Although, I doubt that you want this discussion to devolve into that debate!

So, mathematicians (and logicians?) have a strong sense that they can “know” things informally (such as the truth of G) that cannot be proved in a formal system.

No, I really don’t think most mathematicians or logicians have this strong sense. It’s a controversial topic, and as you note it’s a controversy I don’t want to talk about here. It’s pretty irrelevant.

I haven’t penetrated into the more advanced aspects of model theory, so I don’t really know how the true sophisticates think about it. My impression is this: in introductory texts people often study models in the context of ‘unformalized’ set theory. Here you feel free to talk about statements being ‘true’ or ‘false’, rather than ‘provable’ or ‘disprovable’ starting from some explicitly stated axioms.

More advanced work probably says ‘we’re using ZFC’, or whatever. But in what I’ve read, people are often willing to relax — at least some of the time — and use the words ‘true’ and ‘false’. It’s very convenient. So even robots, when they start doing math, will probably do this.

I think the subjectivity enters the picture with order. Kevin Knuth has studied this extensively and, as he has shown, the induction postulate, i.e. “the ordinary principle of mathematical induction,” is a special case of additive measures and additivity in measure theory is a consequence of order and symmetry: http://fqxi.org/data/essay-contest-files/Knuth_knuth—fqxi-essay—.pdf.

Knuth makes the argument that we, as subjects, impose the order. In a sense one could say that the symmetries are objective in that we, as subjects, seem to observe them in the “reality” we seem to be embedded in; however, in both instances I would place emphasis on the word “seem.”

And of course this makes sense because it would seem (there’s that word again) that the order imposed would certainly affect if not infect “what it means for a Turing machine to halt, and what function a Turing machine computes when it does halt.”

I enjoyed reading your post; I have two comments; the first one was also posted on Joel Hamkins’ blogpost.

The result proved by Hamkins is an immediate consequence of putting the compactness theorem for first order logic together with the existence of an effectively given infinite sequence of sentences such that no nontrivial propositional combination of them can be proved in PA; the latter result was proved in the early 1960s, independently by Mostowski and Kripke. Rasmus Blanck has recently written a survey paper on this topic.

However, it should be pointed out that Hamkins’ proof of the existence of the aforementioned effective sequence of sentences is new and perspicuous.

Hugh Woodin established a more dramatic result along the same lines, which is much harder to prove, a result that was recently extended in the joint work of Blanck and myself [Marginalia on a Theorem of Woodin, to appear in the Journal of Symbolic Logic; also available on researchgate]; Vika Gitman has just written a nice blogpost about this work:

“So, conceivably, the concept of ‘standard’ natural number, and the concept of ‘standard’ model of PA, are more subjective than most mathematicians think. Perhaps some of my ‘standard’ natural numbers are nonstandard for you!”

The ‘meaning’ of the new predicate “is standard” in Nelson’s IST has actually been discussed in the context of fragments of IST (based on higher-type PA). Surprisingly, one can intuitively read ‘is standard’ as ‘is computationally relevant’, going back to ideas of Lifschitz, Berger, and Schwichtenberg.

Excellent! I’ve long been curious about whether Nelson’s ideas on nonstandardness can be pushed forward. I’ll have to read these.

What I’d really like is a framework where it could be proved that in certain models, certain specific very large natural numbers are nonstandard. For example, the number described by Yudkowsky in this post, which is constructed using the hypothesis that ZFC plus the axiom that there is an I0 rank-into-rank cardinal is consistent. This particular number is ‘computable’ (in theory), so maybe it’s hopeless to prove it’s ‘nonstandard’. Maybe a number constructed with the help of the Busy Beaver function might be a better candidate. But I don’t have any idea how to do it.

in certain models, certain specific very large natural numbers are nonstandard

Wouldn’t that imply that there are other models in which those numbers don’t exist (or at least, can’t be counted to)? If so, I don’t see how you can reasonably call them (in our informal, intuitive language) “specific natural numbers”.

So I can offer the following observation regarding nonstandard numbers:

Take a very weak system of arithmetic, like Buss’ system . In particular does not prove the totallity of the exponential function, i.e. that .

Now, in the system one can define the length function , which is essentially the rounded-off logarithm. However, cannot prove that the length function is unbounded, i.e. in certain models of the length function is bounded above.

In certain nonstandard models of , one thus has that for a nonstandard number , the length can be standard. Hence, there are standard numbers which become nonstandard when taking the exponential.

There is nothing special about bounded arithmetic here:

One has the same situation for Hilbert’s PRA and the Ackermann function.

1) The Ackermann function is not primitive recursive, and hence PRA cannot prove this function’s totality.

2) The inverse Ackermann function is primitive recursive, but PRA cannot prove that it is unbounded. Hence, there are models where is bounded above.

3) In certain nonstandard models of PRA, one thus has that is standard for certain nonstandard .

Wouldn’t that imply that there are other models in which those numbers don’t exist (or at least, can’t be counted to)?

Indeed it’s a huge challenge to accomplish what I’m dreaming of. Sam Sanders explained some great progress towards doing this… but maybe I should explain less technically what the rough idea is here.

Nelson has argued that perhaps a number like is nonstandard. What he means is that you cannot actually write this number as

Of course if you believe is standard, you will laugh and say “Of course you can do it! Just write 1’s.” But he will respond: “Okay, show me how it’s done.” And you may become very angry and say he’s being unreasonable, but you will not be able to show him how it’s done.

The big challenge is to formalize this idea somehow. Sam Sanders is talking about systems where you can do it. Needless to say, these systems will seem deficient or defective to anyone who is convinced that when is standard, so is

The may seem a little less crazy if instead of using we use a function that grows much faster. That’s why in my earlier comment I chose a function that’s about the fastest-growing function people have been able to invent, at least within the class of recursive functions. This function only exists if you believe the existence of a certain large cardinal is consistent with ZFC. This type of cardinal, called an I0 rank-into-rank cardinal, is about the biggest anyone has been able to dream up.

(When I say “about the fastest” and “about the biggest”, I mean this: I can easily make functions that grow faster or cardinals that are bigger, but I only know how to do it starting with the ones given and doing a silly trick. It’s sort of like how if you say “a googol”, I can say “a googol and one”.)

Thanks for expressing this nontechnically, John. It did seem crazy at first – how can a fixed integer become so qualitatively different that it deserves to be labeled as a different kind of thing, just by virtue of being large?

I imagine though that if a number gets big enough, it may become impossible to perform certain operations on it, or to reason in certain ways about it, or compare it with other numbers, or even describe it. Maybe at some threshold these limitations become so pronounced that categorizing numbers above the threshold differently makes sense.

Do considerations like this have anything to do w/ what you’re getting at?

As pointed out in the post, Hamkins’ proof involves Turing machines running for a nonstandard number of steps.

As it happens, this notion is useful in NSA (Nelson’s IST) as follows:

Call a total Turing machine “standard total” if for standard input it halts after standard many steps with standard output. Call it “standard partial” otherwise.

One can then prove theorems about “standard partial” objects, but this is more than a curiosum: From such theorems, one can extract relative computability results from fragments of IST using the term extract algo in [1].

Reverse math results concerning the Gandy-Hyland functional
(not involving NSA) have been obtained in this way in [2].

Thanks for these references! I’ve been curious about nonstandard models of arithmetic for a while – perhaps because Edward Nelson was another student of my advisor Irving Segal, and I met my wife through someone who worked with Yesenin-Volpin. But I haven’t gotten around to reading the literature about these models. Your papers may be a good way for me to get started.

I don’t know what a ‘proper’ definition is. But there’s no definition of natural numbers that eliminates the ambiguities due to the existence of nonstandard models. If we define natural numbers with the help of set theory (e.g. as finite ordinals), then we push the ambiguity one step back, into nonstandard models of set theory! But it’s still there.

A similar question came up on Google+. So far here on Azimuth we’ve been talking about first-order logic, where you can quantify over variables but not predicates. Mike Stay asked about arithmetic in second-order logic:

So there’s a “feeling” amongst mathematicians that we “know” what standard natural numbers are, we’re just unable to axiomatise it in a way that includes only those. For set theory, do we have a similar “feeling” for what the model ought to contain and what’s non-standard? (I get the feeling not.)

As you guessed, the “feeling” that there’s a unique best model of set theory is weaker than the corresponding feeling for arithmetic. But the people who really deserve to have strong feelings about this — professional set theorists who spend their lives studying these issues — tend to have this feeling. And they’re making some remarkable discoveries that partially, but only partially, confirm this feeling.

It’s not clear where they are heading. It’s as if they’re slowly digging their way toward something big — but it’s still mostly buried, so we don’t know what it is.

There are lots of theorems involved here, many quite technical, but luckily Harvey Friedman, one of the world’s experts on this material, has written a very charming summary of what’s going on, over on the n-Category Café, where we are having another conversation about the foundations of mathematics.

I’ll quote it here.

Let me try to clarify the situation with regard to this crucially important matter for foundations of mathematics.

Let me try to clarify the situation with regard to this crucially important matter for foundations of mathematics.

1) From the standard set theoretic point of view, we have the so called preferred model of set theory normally written as , where is membership.

2) Set theorists generally believe, or like to wear the hat, that has an objective reality, and it is NOT to be thought of as some special construction within any more general framework. We can debate if this is the case, and if it is thought to be not the case, then is this “more general framework” really philosophically coherent, or is it a mirage, etcetera…… Not the topic of this post.

3) Cantor raised his famous continuum hypothesis, usually written CH. Set theorists and others, notably Hilbert, tried to resolve this question within ZFC, and also with ZFC extended by strongly inaccessible cardinals, and measurable cardinals, and large cardinal hypotheses generally, and not only failed, but actually provably failed. In what sense?

4) In the important sense that people found natural constructions of models of ZFC and more, other than , in which they could prove that CH holds, and others in which they could prove that CH fails.

5) There is a little bit of a fib in 4). Goedel constructed a very natural model of ZFC in which he showed that CH holds. Cohen constructed a very natural collection of continuumly many models of ZFC in which CH fails. Some refinements pretty naturally construct a countable family of countable models of ZFC in which CH fails. If algebraic models (Boolean valued models) are allowed, then Cohen did construct a single individual natural model of ZFC in which CH fails.

6) It is apparently impossible to construct a single natural model of ZFC in which CH fails, like Goedel did for CH holding (in the usual sense, not allowing algebraic model). HOWEVER, the family of models of ZFC in which CH fails are ESSENTIALLY ALL EQUIVALENT TO EACH OTHER, AND SO THERE IS A GOOD SENSE IN WHICH COHEN ACTUALLY GAVE “ONE” NATURAL MODEL OF ZFC IN WHICH CH FAILS.

7) This pattern repeats itself for ALL so called intensely set theoretic statements that we have studied. Namely, there is a natural single model of ZFC in which they hold, and a natural (essentially) single model of ZFC in which they fail. Sometimes we can eliminate “essentially”, but not in the case of CH.

8) And people have argued in many situations in 7, which model or models do they accept as “correct” or “like the real one ”, etcetera.

9) I am painting in 1-8 a philosophically coherent picture of what generally is going on, but there are some subtle points known to experts that I have glossed over, and I would like them not to think I am cheating or that I am stupid.

10) Now let’s come to statements that are not set theoretic, and in fact, reasonably concrete. This notion is subject to analysis in standard f.o.m. normally in terms of the standard complexity measures for mathematical statements. I don’t want to get into this crucially important part of standard f.o.m. in this posting.

11) Suffice it to say this. First of all, so called arithmetic sentences are way way down lower than the intensely set theoretic statements referred to above, especially CH.

12) The crucial point is this. IN ALL OF THE NATURAL CONSTRUCTIONS WE KNOW OF MODELS OF SET THEORY, AND ESPECIALLY THOSE THAT WE VIEW AS CANDIDATES FOR THE REAL , WE KNOW A PRIORI THAT THEY ALL SATISFY THE EXACT SAME NON SET THEORETIC STATEMENTS. IN PARTICULAR, WE KNOW THAT THEY ALL SATISFY THE SAME ARITHMETIC STATEMENTS.

13) So unlike the situation with CH, in dealing with arithmetic statements that might not be decided in ZFC, we are going to have to work with provability with extensions of ZFC, and not, like in the case of CH, go grab and lobby for some model of ZFC being preferred in some way.

14) OK, so now we are looking at an arithmetic sentence, and its provability in various set theories (or, speaking of the devil, maybe in various alternative foundational frameworks, see below).

15) So, like CH, maybe we are going to have the following situation. Alice’s set theory proves arithmetic sentence A is true, Bob’s set theory proves A is false. And then people “decide” which, if any, has a correct set theory.

16) Well, here is what is actually going on. FOR ANY TWO SET THEORIES ANYBODY THINKS ARE NATURAL, WE (GENERALLY) KNOW THAT EITHER EVERY ARITHMETIC SENTENCE PROVABLE IN THE FIRST IS PROVABLE IN THE SECOND, OR EVERY ARITHMETIC SENTENCE PROVABLE IN THE SECOND IS PROVABLE IN THE FIRST.

17) So the only actual criteria for choosing set theories with regard to arithmetic sentences (and more) that anybody can do anything with, is simply: which proves more of them – mine or yours?

18) Therefore, given the path that we are on, WE ARE NEVER GOING TO DIFFER IN OUR OPINION OF THE TRUTH VALUE OF ANY ARITHMETICAL SENTENCES. WE WILL ONLY DIFFER AS TO WHETHER WE SHOULD ACCEPT THEM AS HAVING BEEN SETTLED.

19) This phenom is intimately connected with the matter of INTERPRETATIONS BETWEEN THEORIES. In general, any interpretation of one natural set theory in another also immediately proves that every arithmetic sentences provable in the first is provable in the second.

20) We have found out that for any two natural set theories, either the first is interpretable in the second, or the second is interpretable in the first.

21) In fact, there is a stronger observed fundamental phenom. For any two natural set theories, either the first is interpretable in the second, or the first proves the consistency of the second.

22) FURTHERMORE, seriously alternative foundational frameworks are not going to change this situation at all, as people are and want to continue establishing interpretations between them and roughly ZFC. So whatever good alternative foundational frameworks are, they are not coming to the rescue here.

23) Another difference between set theoretic statements like CH and arithmetic sentences.

24) Although there are tons of very natural set theoretic statements known to be independent of ZFC, there isn’t a single natural mathematical arithmetic statement known to be independent of ZFC. (For finite set theory, there are, Goodstein, Paris/Harrington, and pretty much superseded by the Adjacent Ramsey Theorem, and some others). None for ZFC except for one researcher, with the very latest being written up by him/her because the naturalness looks perfect.

This is all very fascinating, and it compels me to immediately ask two (possibly very naïve) questions.

If I correctly understood the gist of this 25-point summary, for each “intensely set theoretic” statement which is undecidable in ZFC it has always happened so far that, up to some notion of “essential equivalence”, there are just two “natural” choices of a candidate for the phantomatic “real” model : one in which said statement is true, and one where it is false. And in general, the experts don’t agree on whether the statement should be held to be true or false.

On the other hand, it’s always turned out to be the case, so far, that either one of the models is interpretable in the other or vice versa, hence that one of the two models proves more arithmetic statements than the other. (I’m not sure I got the relevant points (18)-(21) right, though, so forgive me if the questions below won’t make sense.)

So of course this pattern suggests that, once we’ve quotiented out essential equivalence, all such natural set-theoretic models will form a total ordering, and that by following this ordering we will be pointed in the direction of the Holy Grail, . I suppose here the expert all agree we want to go towards greater arithmetic proof strength, right? At least, for the purpose of deciding which set theory is the most natural and desirable as foundations.

Now let’s assume that this trend will at some point actually be made into a theorem. (In particular, each of the above words in double quotes, except for “real”, will now have a precise, settled meaning.) My first question is, would this suffice to determine up to essential equivalence? By which I mean, is there some understood way to “take the (co?)limit” of this ordered class of equivalence classes of models, or would this be a whole other new problem in need of a separate solution?

My second question is: according to this arithmetic proof-strength criterion, how should we settle CH? By choosing Gödel’s model, or Cohen’s?

If my questions reveal some misunderstanding, I would greatly appreciate to be corrected!

I kind of doubt Harvey Friedman is reading this thread anymore, so I’ll give my own impression based on his 25-point comment above and other sources.

First question: I believe that logicians are empirically discovering an ordering of interesting additional axioms for arithmetic, where the “better” axiom proves all the arithmetic statements that the “worse” one does. The ordering of these axioms, called “proof-theoretic strength”, can be measured using countable ordinals. For a good introduction to this idea, see:

Over in set theory, a vaguely analogous trend seems to be emerging, where large cardinal axioms—axioms asserting the existence of various kinds of large cardinals—let us prove new things about set theory and analysis. These axioms seem to be approximately linearly ordered, but this is more mysterious to me, since while cardinals are well-ordered, I don’t see how that puts a linear ordering on large cardinal axioms.

Now let’s assume that this trend will at some point actually be made into a theorem. (In particular, each of the above words in double quotes, except for “real”, will now have a precise, settled meaning.) My first question is, would this suffice to determine up to essential equivalence? By which I mean, is there some understood way to “take the (co?)limit” of this ordered class of equivalence classes of models, or would this be a whole other new problem in need of a separate solution?

I believe this is a whole other new problem in need of a separate solution. I believe people aren’t ready for this problem yet. (I’d love to be wrong—I’ve wondered about this.)

Second question: it seems that the continuum falls outside the general trend: neither the continuum hypothesis nor its negation has greater arithmetic proof-theoretic strength. A closely related fact is that neither the continuum hypothesis nor its negation imply, or are implied by, known large cardinal axioms. Joel David Hamkins has some papers on this. This seems to be a less technical one:

John Baez said:
“I don’t know what a ‘proper’ definition is. But there’s no definition of natural numbers that eliminates the ambiguities due to the existence of nonstandard models. If we define natural numbers with the help of set theory (e.g. as finite ordinals), then we push the ambiguity one step back, into nonstandard models of set theory!”

I’m not sure I understand how von Neumann construction of natural numbers ‘pushes ambiguity on step back’. Is it because it relies on axiom of infinity?

The von Neumann construction of natural numbers relies on set theory, and any axioms of set theory, for example ZFC, admit many nonstandard models. In a nonstandard model of set theory, the “standard model” of arithmetic can look quite strange.

I am very happy to discover that Sam Sanders has investigated what these “standard models” can be like:

Gödel’s incompleteness theorems demonstrated once and for all that our knowledge of a sufficiently rich structure is highly dependent on the choice of higher order formal frameworks through which the structure is viewed.

By the way, Steve, both in this question and your earlier question we get into the issue of studying one logical theory inside another, then studying that inside yet another, etc. This can get a bit confusing, but it’s important to think about.

A follow up on John’s comment about “studying one logical theory inside another”: logicians consider the standard model of arithmetic to be defined and studied within a background theory of sets. In the G+ counterpart to this discussion Joel Hamkins mentioned a paper of mine that probes the class of models that arise as the standard model of arithmetic within a model of set theory; the paper is entited “Standard models of arithmetic”, and it appears on p.55-64 of a festschrift volume available through the following link:http://flov.gu.se/digitalAssets/1497/1497175_ideesfixes.pdf

Ali, does your paper, or similar results, provide the formal justification for John Baez’s statement that. “… there’s no definition of natural numbers that eliminates the ambiguities due to the existence of nonstandard models.”? Or, is John’s remark simply a judgement born of experience.

By the way, with my background (one course in logic, one serious attempt to read Cohen’s book on the continuum hypothesis, one topology professor who got sidetracked with a new enthusiasm for nonstandard analysis, and a lifelong curiosity about these things) I’m afraid that I am in no position to read technical papers.

I’ll be interested to hear what Ali has to say about your question, Steve. He’s thought about it much harder than me! But I can’t resist saying a bit of basic stuff myself.

Or, is John’s remark simply a judgement born of experience?

I’d say it’s a judgement born of theorems. Let me list the main ones, even though you probably know them already:

• Tarski’s theorem on the arithmetical undefinability of arithmetical truth: we can encode sentences in the language of arithmetic as numbers (“Gödel numbers”). But there is no way to take arithmetic and include a predicate such that is true if and only if the th sentence is true, without making the system inconsistent. This result applies not only to arithmetic but any sufficiently strong formal system. It’s not limited to first-order logic. The proof is quite short.

This theorem puts limits on our ability to define “truth” and thus on our ability to say what we mean by “the standard model” of the natural numbers, which is presumably the model where only “true” sentences are satisfied.

• Gödel’s first incompleteness theorem: any finite set of axioms extending Peano arithmetic is either inconsistent or incomplete. This also applies to infinite sets of axioms that are recursive, i.e., can be listed by a computer program.

This theorem is usually stated only for first-order logic. In first-order logic, whenever an axiom system is incomplete it has more than one model. Whenever it is inconsistent it has no models. Thus, there is no way to take Peano arithmetic and add a recursive set of axioms to get an axiom system that has exactly one model. This again puts limits on our ability to pick out “the standard model” of the natural numbers.

Note that if we could choose as our axioms every statement about arithmetic that is “true”, those axioms would have only models that are “elementarily equivalent” to the sought-for “standard model” — meaning, models in which all the same statements hold. Unfortunately, there are infinitely many different models of this sort. “Elementarily equivalent” does not imply “isomorphic”.

Furthermore, it’s known that the set of “true” sentences cannot be recursive. This follows from Tarski’s theorem on the undefinability of truth. So we can’t in practice work with a first-order axiom system containing all “true” sentences of arithmetic.

What about second-order logic? One of its great charms is that we can write down a version of the Peano axioms in second-order logic that has a unique model! This sounds great. But second-order logic has its own problems, which act to dramatically undermine the promise of this result. As I explained elsewhere in this discussion, to really understand what a model of a second-order theory is, we need to use a first-order theory, and we’re back in the frying pan.

• Gödel’s second incompleteness theorem: any finite set of axioms extending Peano arithmetic cannot prove its own consistency unless it is inconsistent. This also applies to infinite sets of axioms that are “recursive”, i.e., can be listed by a computer program.

This theorem is usually stated only for first-order logic. As mentioned, in first-order logic, whenever an axiom system is inconsistent it has no models. Thus, in any sufficiently strong axiom system where a computer program can list the axioms, we cannot prove the system has any model at all… unless it doesn’t. This puts limits on how sure we can be that a “standard model” even exists!

I’m sure there are other results that I don’t know, that flesh out this picture. But it’s already clear that if there really is a unique thing deserving to be called “the standard model” of the natural numbers, there are strong limits on our ability to access it.

On the other hand, many expert logicians are still convinced that there is such a thing as “the standard model” of the natural numbers. For that side of the argument, read this comment by Harvey Friedman. He knows a lot more logic than I do. I think our disagreements are not about theorems, but about what’s the right attitude to take to them.

I have some comments about nonstandard models of arithmetic and other theories. It ideally should go on your Azimuth, but that has gotten so involved structurally, that I just don’t know a good place to put it. Perhaps you can copy this and stick it in there somewhere.

The “standard” view of nonstandard models has not been very well represented on the Azimuth thread. Here is the hard line version.

1) As far as PA (first order Peano arithmetic) is concerned. The “standard” view is that there is absolutely no such thing as a nonstandard integer. The phrase arose for purely technical reasons connected with PA Incompleteness and technical tools and technical questions and technical dreams that have been unrealized.

2) A nonstandard integer is not anything at all, but a point in a model of PA infinitely many predecessors in that model. Thus nonstandard integers live only in so called nonstandard models of PA. These are the models of PA that are not of order type omega. There is only one model of PA of order type omega up to isomorphism.

3) To date, there has not been any reasonable preferred nonstandard model of PA ever presented – but I am not quite sure about something decades ago, which I will mention below. Hopefully somebody reading this can clarify.

4) There is a reasonably preferred and interesting category of closely related such. These are the ultra powers of the standard model via any non principal ultrafilter on omega.

5) However, this construction is not very satisfactory for two reasons. One is that these ultra powers are not isomorphic to each other (I don’t know how much the isomorphism types of these have been studied). Secondly, all of these ultra powers satisfy the same sentences as the standard model of PA.

6) So this renders these nonstandard models prima facie useless for independence results from PA. In contrast, in set theory, one does have a lot of interesting ways of constructing models which differ greatly in what sentences hold and fail, thereby getting lots of interesting Incompleteness from ZFC.

7) There have been some people, including Nelson and later Sanders, and others, who are attracted to the idea that the standard/nonstandard distinction should be viewed as fundamental, and can provide an alternative f.o.m. that has some degree of serious radicality.

8) As I said before, I am a foundational whore, and have played with this also. As usual, whores don’t like to endorse anything.

9) However, this approach of taking the standard/nonstandard integer idea seriously, has the major problem of losing one of the great impressive features of standard f.o.m., adding greatly to its coherence and robustness. And that is the strong internal categoricity properties of standard f.o.m.

10) There was a development I think attributed to Kochen and Kripke, maybe late 1970s, which purported to give an at least somewhat preferred model of PA plus the negation of the 1-consistency of PA. Thus it would satisfy a false Pi-0-2 sentence. I think it was by a restricted ultra power construction, where the ultra filter only applies to some sets. I’m not sure how preferred this model was, and I don’t remember the details.

11) In any case, it doesn’t come close to fulfilling the technical dream of being able to manipulate models of PA to get independence results with anything like the flexibility and power that one has for ZFC.

12) As a technical tool, I have been using models of ZFC with nonstandard integers for almost 50 years. I have to, in order to get perfectly natural arithmetic independence results from ZFC.

13) Sanders also emphasizes his use of nonstandardism as a technical tool in addition to being something allegedly foundational.

As I said before, I am a foundational whore, and have played with this also. As usual, whores don’t like to endorse anything.

I think it’s worthwhile to read back in the n-Category Café discussion to see what he means here. He had written:

I am a philosophical and foundational whore. I am interested in obtaining substantial results that are suggested by the relevant foundational debates. They are designed to attack or defend any even unreasonable point of view that lends itself to such results.

2) A nonstandard integer is not anything at all, but a point in a model of PA that is of order type omega. There is only one model of PA of order type omega up to isomorphism.

I meant

2) A nonstandard integer is not anything at all, but a point in a model of PA infinitely many predecessors in that model. Thus nonstandard integers live only in so called nonstandard models of PA. These are the models of PA that are not of order type omega. There is only one model of PA of order type omega up to isomorphism.

Are there any known examples of such theorems where, by arguments outside PA itself, it can be shown that can only be satisfied by a nonstandard , whatever model of PA we choose?

I’m guessing no, so I’ll be pleasantly shocked if the answer is yes.

We can’t pin down “nonstandard” within PA itself, but if we could show within some larger context that, for example, is only possible for greater than the proof number for a proof of the Gödel sentence, that would sound like a good argument that must be nonstandard.

The answer to question is no, as you guessed. The quick explanation is that:

1) The soundness and completeness theorems say that a sentence in first-order logic is provable in some theory T iff it holds in all models of T, and

2) PA has a standard model, so

3) if you can prove in PA, there exists an element in any model, including the standard model, for which holds.

A bit more lengthily:

We can define and study the concept of ‘model’ within some formal version of set theory. Let’s take ZFC to be specific. Inside ZFC we can say that the standard model of PA is the one whose order type is the smallest infinite ordinal. In ZFC we can prove such a model of PA exists and is unique up to isomorphism.

(Note: this just pushes back all the ambiguities about “what things really mean” from PA to ZFC. Just as we can study models of PA, we can study models of ZFC! And these can look strange. For example, there are countable models of ZFC, even though in ZFC you can prove there are uncountable sets. But never mind! We will work inside ZFC, not peer outside.

Most logicians would not emphasize this, so I feel a bit silly doing so, but I want to make it clear that there’s no free lunch: we’ve got a nice definition of the standard model of PA inside ZFC, but ZFC itself has many models.)

In ZFC we can prove the soundness and completeness theorems, which imply that a sentence in PA is provable iff it holds in all models. Thus, any sentence provable in PA must hold in the standard model.

Just an observation: from my very limited exposure to the field it seems to me that in order to imbue their scribblings with meaning and interest, logicians must discuss their results using meta-mathematical reasoning that sounds suspiciously like the normal sort of reasoning that other mathematicians routinely employ (I once found myself scratching my head when a logician tried to explain how his formal argument proved the validity of mathematical induction by using mathematical induction in his meta-mathematical explanations). For that reason I stopped thinking of logic as providing a foundation for mathematics and started thinking of it as just another field of mathematics. A very interesting field, to be sure!

Yes, mathematical logic is just a branch of mathematics: it doesn’t actually “support” or “hold up” mathematics, as the term “foundations” suggests. Mathematics would not collapse if everyone suddenly forgot about ZFC. In fact, almost all mathematicians, forced at gunpoint to recite the ZFC axioms, would wind up pleading for the lives. There’s a reason: none of the grad students at the universities I’ve been to are required to take any courses on logic or set theory. And yet they seem to do pretty well.

One thing mathematical logic really does is systematize mathematical reasoning in a way that lets us study it using mathematical tools. So, the term “metamathematics” seems more appropriate than “foundations of mathematics”.

“… they should also learn more category theory …” – when I was in school category theory was fairly new and the prevailing attitude (at least, my attitude formed of ignorance) was that it was abstract nonsense! You have convinced me otherwise!

But the whole idea of mathematicians complaining about too much abstraction is a bit laughable. When mathematicians complain that category theory is “abstract”, I tell them that’s like swimmers complaining that water is wet. If they don’t like being wet, why the hell are they swimming?

And I like the chutzpah of my friend Steve Lack. He argues, very convincingly, that the problem with most mathematics is that it’s not abstract enough: you have to do all sorts of grungy calculations when ideally you’d just be able to see that things are true.

Ideally Peano Arithmetic would be complete and all theorems would have proofs in one page.

Hard to see how this kind of wishful thinking helps understand anything.

Unless you think this is practicable, in which case, let me know when you can “just see” any of several intrinscally quantitative statements, like the prime number theorem, or that Margulis constructions really are expanders, or that the largest clique in a random graph on n vertices almost surely has size (2+o(1))log_2(n).

When mathematicians complain about things being too abstract, they usually mean something else: somebody is making them learn a machine/ a setup without properly motivating it. Or worse: the abstract setup abstracts the wrong things, or doesn’t quite do it right for some other reason.

I’m less charitable: when mathematicians say a concept is “too abstract”, I assume it means they don’t understand it yet but are embarrassed to say “I don’t understand it yet”.

This is based on my own self-perception.

If so, refusing to study mathematics that is “too abstract” is refusing to study mathematics one doesn’t understand yet — a rather dangerous stance.

(Seriously, I guess there are different reasons for not understanding things yet, and “too abstract” points to a different group than “too complicated”. As you said, we tend to feel something is “too abstract” if we hit a definition that seems unmotivated.)

On the other hand, I (and other math logic people) use the abstract/concrete distinction for mathematical statements measured according to a precise linearly ordered complexity hierarchy. There are theorems in math logic showing that as you move up, you get statements that are not provably equivalent to any ones lower down.

Also there are some major phase transitions involving the nature of Incompleteness from ZFC. This makes Concrete Mathematical Incompleteness a radically different subject than Set Theoretic Mathematical Incompleteness. I live off of that radical difference.

Mathematical logic is the branch of mathematics that arose out of the mathematical tools that were used to make the great advances in the foundations of mathematics. These basic tools that were so effective for f.o.m. became investigated for their own sake, usually without any clear foundational purpose. However, many of these tools and a considerable amount of the ensuing mathematical logic needs to become second nature if the subsequent challenges in f.o.m. are to be properly met.

It is true that the standard “rulebook” for mathematical rigor – standard f.o.m. – is almost always only in the background. This is because anybody capable of being a professional (pure) mathematician must absorb what must go into a rigorous proof as part of their second nature blood, in order to function reasonably. For this, they do not need to consult the “rulebook”. The “rulebook” has been implicitly absorbed by their teachers and the general mathematical atmosphere, since the “rulebook” is so so so intuitive and simple.

Nevertheless, the great importance of having this “rulebook” in the archives cannot be underestimated. There were periods of great confusion, reaching a head around 1800, before the great pieces of the “rulebook” were starting to emerge and work together properly during the 1800’s and early 1900’s.

Of course, the great David Hilbert needed a “rulebook” to even formulate his doomed program, (more or less) destroyed by Goedel. And the great further advances in Incompleteness depend on the “rulebook”. If you want to prove that something or other cannot be proved by rigorous mathematical argument, you are going to have to have a “rulebook”, even if you and your friends are not personally consulting it.

Of course, the great great great foundational issue is whether the rulebook allows us to do the interesting math that we are interested in. Experience shows that Incompleteness is most disturbing to people when it involves mathematical statements that people are strongly compelled to consider “matters of fact”. E.g., you can argue that the continuum hypothesis is not, is not clearly, matter of fact. That what sets of reals are is maybe subject to interpretation – i.e., relative to what means you have for constructing arbitrary sets of reals. Evidence for this kind of point of view might be that if you only consider Borel measurable sets of reals, then the continuum hypothesis is a theorem, with no foundational difficulties.

However, if the incompleteness involves only, say, finite sets of rational numbers, which ultimately means through encodings, only natural numbers, then it gets harder to question the matter of factness, and incompleteness becomes more serious.

Even clearer still would be say, sets of 1000-tuples of rational numbers of height at most 2^1000. There are only finitely many of these, and it gets hard to deny the matter of factness.

Now an interesting mathematical statement, or any statement, in this realm, is definitely going to be provable or refutable in ZFC by enumeration of all of the possibilities.

However, such a proof by enumeration is not going to literally be given in ZFC with any reasonable size. Reasonable size like the proof of FLT.

So such a result would cast serious doubt about the adequacy of ZFC at a new matter-of-factness level.

My small problem with this result is that if x is a non-standard element in a model of PA, then the interval [0,x] has an order-type which is not even an ordinal. If I well understand, one has a copy of N, followed by a set of copies of Z, which is densely ordered, followed by those elements in the actual copy of Z which are less or equal then x. And these are the time moments of the Turing Machine, needed to compute f(n) for standard n. So a fundamental property in computations, which is their Sequentiality, gets completely lost. Maybe I would prefer machines which run over ordinal times, instead of that. In this case I could define some behavior for a limit-ordinal step, although this case is also difficult, as we have the problem of the most natural definition…

So in conclusion we have a very nice result, but which is much more important for arithmetic as for computability.

As noted by me in an earlier comment, the concept of “Turing machine running for nonstandard many steps” does have some import, even for ‘standard’ mathematics and computability:

As it happens, this notion is useful in NSA (Nelson’s IST) as follows:

Call a total Turing machine “standard total” if for standard input it halts after standard many steps with standard output. Call it “standard partial” otherwise.

One can then prove theorems about “standard partial” objects, but this is more than a curiosum: From such theorems, one can extract relative computability results from fragments of IST using the term extract algo in [1].

Reverse math results concerning the Gandy-Hyland functional
(not involving NSA) have been obtained in this way in [2].

So the classical TM gives the result f(n) (for n standard) after proceeding steps indexed by the set [0, x], where x is a non-standard element in a model of PA. My Little problem with this concept is the order type of the set [0, x]. This “interval” consists of a copy of N, followed by densely ordered copies of Z, followed by those elements in the actual copy of Z, which are less than or equal x. This means that an essential property of “computations”, which is their sequentiality, gets completely lost. Apparently this fact makes the result less important for the theory of computability. But it is however nice and important for non-standard arithmetic.

How To Write Math Here:

You need the word 'latex' right after the first dollar sign, and it needs a space after it. Double dollar signs don't work, and other limitations apply, some described here. You can't preview comments here, but I'm happy to fix errors.