Wednesday, October 19, 2005

As mentioned a while back, this is an article on "Alternative Logic" I wrote for K5 a while back. I've made some tiny changes to it, in particular some claims that probably reflect my frustration with understanding some logic texts more than anything else. It's intended as a "pop science" level article on alternatives to classical logic. I'm in no way an expert on logic but I think it's mostly correct. I do have some worries about my explanation of an intuitionistic proof but I provide links for people who want a more accurate picture. I just wrote this for fun.

Introduction

In a previous installment you've seen some of the alternatives to numbers that exist. But can we do something even more radical? How about going to the very foundations of mathematics itself and replacing logic itself? That might sound a little unusual, I mean logic's logic, you can't choose what you want to be true can you? Well actually there are plenty of alternatives to familiar classical logic, so many that I can't hope to do justice to them in one article. Nonetheless, I can still try to throw out a few teasers and hope you are motivated to check out the references and links to see what I'm really talking about.

Before talking about alternative logics I must talk a little about classical logic, the familiar logic used by mathematicians, philosophers and others. What do we mean by logic? In the old days it simply meant reasoning correctly. That was pretty vague. These days the subject has been formalized and there are two aspects to consider.

On the one hand consider the sentences "All men have less than 3 legs" and "John is a man". From this we can conclude "John has less than 3 legs". Note the very important fact that we don't really need to know what men or legs are to make that deduction. If we were told "All X are P" and "x is an X" then we know "x is P" regardless of what x, X and P are. So we can think of deductions as mechanical operations on sentences. In fact we can consider sentences to be nothing more than strings of characters and proofs to be strings of sentences where each sentence follows from some previous ones by application of a rule. These rules are called deduction rules. Of course we also need some starting points to which we can start applying rules. Those are called axioms. An example deduction rule is the one that says if you have derived A and have derived "A implies B" then you can derive B. In fact that rule has a name, modus ponens. If you can derive a statement from the axioms then the derivation is called a proof. Sometimes, however, you may hypothesise something without being sure of it. You can still apply the deduction rules to get a derivation but it's not a proof until you can eliminate the hypothesis. In fact some systems of logic have very convenient rules for eliminating hypotheses.

On the other hand - what does it mean for a sentence to be true? Well for a sentence like "Fred has red hair" you can find out by checking directly. But what about compound sentences like "Fred has red hair and Jane has black hair". Here are two propositions conjoined by "and". What you can do is test each sentence individually and assign a truth value: 0 if it's false and 1 if true. In fact we can define a function v (called a valuation), acting on sentences, that maps them to 1 if true and 0 if false. We can then use the binary operation, &, defined by the following table, to determine the truth value of the conjunction:

& |0 1

--+----- eg. 1 & 0 = 0

0 | 0 0

1 | 0 1

| | 0 1

--+----- eg. 0 | 1 = 1

0 | 0 1

1 | 1 1

~ | 0 1

--+----- eg. ~0 = 1

. | 1 0

If the result is 1 then the conjunction of the two sentences is true. Formally we're saying v(A and B)=v(A) & v(B). This is taking a truth theoretical view ie. that the truth of "A and B" depends on the truth value of A and the truth value of B. Similary we test the truth of "A or B" using the | operator so v(A or B)=v(A) | v(B) and v(not A)=~v(A). These operations are examples of logical connectives and the functions &, | and ~ are collectively known as boolean operations.

There is one more logical connective to mention: implication. v(A implies B) is v(A) -> v(B) where A -> B is defined to be ~(A & ~B). In other words to tell whether or not "A implies B" is true we look at the truth of A and B and the result is false only if A is true and B is false. Note this means that if we can find any inconsistency in our logic, ie. any proof of something provably false, then we can prove anything, because a false statement implies anything and using modus ponens we can then deduce anything we want. So it just takes one little inconsistemcy to bring the whole system crashing down.

By the way, sometimes I'll be a bit lazy and, for example, use & and 'and' interchangeably. Really they are different things: one combines sentences and the other combines truth values, but it should be clear what is going on from the context.

I should also mention the quantifiersfor all and there exist. They also have a truth theoretical interpretation but it's quite technical so I'll leave it out.

So we have two sides to logic: the syntactical side which involves proving things by the application of mechanical rules applied to sentences and the semantic side which is about how you decide if a sentence is actually true or valid. But does proving something make it true? Well if this is the case for a type of logic it's said to be 'sound'. And if everything true is also provable it's said to be 'complete'. When logicians consider alternative ways of doing logic soundness and completeness are among the first things they look for. And while there is good reason for sticking with logics that are sound, it is well known that there are many logical systems that are incomplete. However I'll leave discussion of Gödel's incompleteness theorems for another time.

Note. My notation is a little nonstandard - I am limiting myself to a subset of HTML that I think is viewable by most browsers. If your browser is text only you may be having trouble with even this limited subset.

Multivalued Logic

How can we modify logic then? Well there's an obvious thing to do to start with. I mentioned the truth values 0 and 1. How about throwing in some alternative values? 1/2 seems like an obvious choice. This is exactly what Lukasiewicz did in around 1917. One of his motivations was that he was able to sidestep Russell's paradox with it. But if you were following what I was saying above then that's just half the story - the semantic side. So in the thirties an axiomatization (ie. a set of axioms and deduction rules) for this system was also devised introducing new connectives corresponding to necessity and possibility. We can define new logical operators called M and L with M(1/2)=1 and L(1/2)=0. If X is a proposition MX can be interpreted as meaning "maybe X" and LX as "necessarily X".

But there are some problems with this kind of logic. Lukasiewicz defined ~x=1-x so ~(1/2)=1/2. He also defined 1/2 | 1/2 as 1/2. So we find that for some X, X | ~X is 1/2. To many people that simply doesn't feel right, they argue that X | ~X should have a truth value of 1.

Pressing on regardless we can generalize even further to allowing truth values to take any number between zero and one. One way to to this is with Fuzzy Logic but another approach is via Post logics. If you really want to you can even consider probability theory as a branch of logic where probabilities are real-valued truth values.

Relevance Logic

Now consider what I said above about implication. I basically said that implication has a truth theoretical interpretation - that you could determine the truth value of "A implies B" by knowing the truth values of A and B. But does that really capture what is meant by "implies"? If "A implies B" really were the same as "not both A and not B" then any false proposition would imply anything you like. For example, my car has air bags. So according to classical logic it's correct to say "if my car had no air bags the moon would be made of cheese". But what does my car have to do with the moon? There seems to be something fishy about a system of logic that allows you to make deductions from premisses that are completely irrelevant. So some logicians take issue with the idea that "A implies B" should be defined truth theoretically. In fact, some logicians argue that it's so obvious that you can't define implication truth theoretcially and that only several years of brainwashing (eg. a university course in mathematics) could convince someone otherwise.

This is where relevance logics come in. Relevance logic (pioneered by Dunn) attempts to define conditions under which A is relevent to B, for example by tagging the explicit use of A in the demonstration of B from A. There is an interesting consequence of saying that A has to be relevant to B to prove B. We might find that even if we have managed to prove one false statement we can't prove some others. This is different to what I said about classical logic above where just one false statement brings the whole system crashing down. So relevance logics can be robust against inconsistency making them what's called paraconsistent. Some logicians consider this to be an advantage. For example, at several points in history physicists have had inconsistent laws of physics (for example just before a scientific revolution) to work with and yet have still reasoned succsfully. But there are also other approaches to logic that deal with this issue.

Intuitionistic Logic

You may remember I ended the Alternative Numbers article with a quote from Kronecker. He had a hard time believing in some of the constructions mathematicians were using. In particular he disliked the Law of the Excluded Middle (LEM) which says that either X or not X. In other words that either a proposition is true or its negation is true. There's no middle ground between these two possibilities. It's often used by mathematicians like this: suppose I'm having a bit of trouble proving X. Instead I can suppose that not X is true and see where that gets us. If I find that it leads to a contradiction then I know something is up because in classical logic it is assumed that you cannot prove a false statement. If I did every step of the derivation correctly then the original hypothesis must have been false. There is no "middle way" where the original hypothesis might have been half true. If it leads to contradiction it must have been false. (This is known in the trade as Reductio ad Absurdum.) But here's the issue for people like Kronecker: X might be a proposition saying that a certain mathematical object exists. Using LEM you might be able to prove it exists without actually showing how to find it. It's a bit like getting something for nothing and mathematicians use it all the time. But Kronecker didn't like it at all, it seemed like cheating. He wasn't alone.

Another mathematician, Brouwer (probably more famous for his fixed point theorem), also had trouble with LEM, and he started devising an alternative form of logic called Intuitionistic logic. Actually Brouwer was dead against formalism of the type that logicians use, but nonetheless someone came along after Brouwer and formalized his logic! It turned out that this led to a particularly elegant type of logic.

Part of the intuitionist's problem is that it's possible to prove "A or B" without having a proof of A or a proof of B. To an intuitionist a proof of "A or B" is a proof, either of A, or of B. In fact - that's just how intuitionists define a proof of "A or B". When you do this it suddenly become a lot harder to 'cheat'.

Consider what is meant by "A implies B". To a classical logician it's just "not (A and not B)". But this just doesn't seem to capture the "intuition" of what implication means. A better idea might be something along these lines: if you know A implies B, then as soon as you know A then you know B. In other words the statement "A implies B" should give you the means to prove B the moment you've proved A. So you can imagine "A implies B" as a recipe for converting a proof of A into a proof of B. In fact this is exactly how intuitionists view implication. A proof of "A implies B" is a function that takes as input a proof and spits out another. If the input is a proof of A then the output should be a proof of B. This is radically different to the classical view. By making implications a type of function we connect logic with set theory and with subjects like lambda calculus. And there's another really cool thing going on: mathematicians use the symbol -> in many different ways. Among them are two: they use it to mean both implication and a function mapping (eg. f:A->B means f maps from set A to set B). In other words, by strange coincidence we've turned a proof into a function and we use the same symbol for both. It turns out there is a whole branch of mathematics that deals with arrows! It's called category theory and it provides a beautiful integration of everything I have mention in the last two paragraphs.

But before you take up intuitionist logic you should be warned. Brouwer had carried out some great mathematical work in his life. But when he became an intuitionist he regarded much of his work as being no longer of value. He had to reject things that other mathematicians thought were obviously true. Eventually other mathematicians came to see Brouwer as a little crazy! On the other hand intuitionist logic is closely tied up with Constructive Mathematics (not to be confused with a similarly named educational fad) which insists that you always construct things you claim exist. This means that it goes hand in hand with computer science where people's programs are usually expected to produce a result - not just say a result exists.

Second and Higher Order Logic

Earlier I briefly mentioned quantifiers. An obvious question about them is this: when I say "for all x", what kinds of x are we talking about? If you answer this question then you know what the basic 'objects' are that your logic deals with. For example, in mathematics we generally deal with sets and when we say "there exist an x such that..." we really mean "there exists a set x such that...". In fact mathematicians generally build everything out of sets using a set of axioms called ZF (and maybe some more axioms too - like the Axiom of Choice).For example, as I mentioned in the Alternative Numbers story, mathematicians build the ordinal integers out of sets.

But what about properties of sets or numbers? How do we talk about those as opposed to the underlying sets themselves? Mathematicians have several different ways to talk about properties. For example, if you want to talk about primality you can just talk about the set of prime numbers. As ZF can talk about sets, by considering primality through the set of primes we have just turned primality into something ZF can talk about. But sometimes there are properties we want to discuss where we can't form a set. For example consider the property of being a set. If we were to convert this into a set then we'd need to have a set of all sets.But ZF doesn't allow this (because of hairy problems like the Russell paradox). So how can we talk about a property like that?

One approach is to modify logic by extending our universe to include properties. This is what is known as Second Order Logic - as opposed to the usual logic that is first order.Using Second Order Logic you can say things like "there is a property such that if two sets have this property then so does their union". You simply can't say this in First Order Logic.But there are some difficult issues associated with First Order Logic. Logicians like to prove things, not just in their logical systems, but about their systems. It's actually very hard to prove anything about Second Order Logic. There are also some other tough issues like deciding when two properties are the same. In fact Quine believed that Second Order Logic isn't even Logic. And even though Second Order Logic appears to have greater expressive power mathematicians actually get along fine with First Order Logic. Every time they make a statement about properties there's always a workaround (sometimes quite inelegant) for expressing something that's just as useful in First Order Logic. By the way, you can also generalize further, so for example in third order logic we can talk about properties of properties.

Modal Logic

When I was talking about multi-valued logic I mentioned the concepts of necessity and possibility. These are called 'modalities' by logicians and a logic that deals with this kind of concept is called a modal logic. There are many different types of modal logic but I'll mention four: alethic logic, deontic logic, epistemic logic and temporal logic. An alethic logic is one like that mentioned above. Operators for necessity and possibility are introduced.These operators are very closely related in that if it is not necessarily the case that X is not true then X is possibly the case and vice versa. We can write this, using the above notation, as MX=~L~X. Let's think about the semantics of this for a moment. What does it mean to say Bush is a Democrat is false not not necessarily false? If it's false, it's false. Well when logicians are thinking about the semantics of modal logic they often work with 'possible worlds'. Although it is true that Bush is a Republican we can imagine a world in which he is a Democrat (well I think I can anyway). Only when something is true in all possible worlds is it considered necessarily true, otherwise it's just plain true (this is not strictly the definition used but it's close enough for now). Can you imagine a world in which 1=2?

Deontic logic deals with what you ought to do. Introduce the symbol O that means 'ought'.Then OX means X ought to be the case. Just like in alethic logic we can consider the 'dual' operator ~O~. ~O~X means It is not true that it ought not to be the case that X. In other words it means X isn't forbidden. Epistemic logic deals with belief and knowledge.For example KX means X is known to be true. KX->X but the converse isn't true. Among other things epistemic logic can be used to study cryptographic protocols in a formal way. Lastly temporal logic introduces time into logic with operators like H so that HX means hitherto X. In temporal logic whenever you ask about the truth of something you need to say at what time you are talking about. HX then means X is was always true up to the time specified. Here's an exercise: what does ~H~ mean in English?

Quantum Logic

Are our logical principles necessarily correct or are they an empirical fact that we have discovered about the universe? Once upon a time Euclidean geometry was seen as necessary. It just seemed obvious that parallel lines were always the same distance from each other. But along came Lobachevsky with the notion that there were different types of geometry and then came Einstein who showed that the universe itself was actually described by one of them. So maybe logic will go the same way. One philosopher who argued this was Hilary Putnam. Just as General Relativity changed geometry he said that Quantum Mechanics changes logic and that in fact we should be using Quantum Logic!

So how do you have to modify logic to make it Quantum? It's difficult to go into the details but I can give a taste of it. Instead of truth values 0 and 1 truth values are subspaces of vector spaces. If you don't know what that means I'll try to give an example. A vector space is a flat space that extends out to infinity and has a chosen point in it called the origin or zero.A subspace is a space contained in that space with the same chosen point. For example a 2D plane can be thought of as a vector space and the lines going through the origin are examples of subspaces. We can now define how the logical connectives work: given a valuation vv(A and B) = means the intersection of v(A) and v(B). v(A or B) means the smallest subspace containing v(A) union v(B). v(not A) means the space perpendicular to v(A) (in particular not(V), where V is the entire vector space, is the set {0} containing just the origin and not({0}) is the whole of V).

So doing logic is now doing geometry. The boolean & operation, for example, is replaced by vector space intersection. If a proposition turns out to have truth value {0} then we can consider it to be false and if the truth value is the whole of V then it can be considered true. But what about all those in-betweens? Well they correspond to those weird situations in quantum mechanics where you simply can't know two things at the same time. For example because of the Heisenberg uncertainty principle it simply doesn't make sense to say "this electron has mass m and momentum p". Quantum Logic perfectly captures these aspects of quantum mechanics.

(Aside for physicists: truth values aren't strictly the subspaces but the operators projecting onto these subspaces. A projection, P, has the property that P2=P. So the eigenvalues of P are 0 and 1. This means that Quantum Logic really is the quantization of logic - we have replaced truth values with non-commuting Hermitian operators whose eigenvalues are the results we want!)

So do physicists use Quantum Logic? Well, no actually. Part of the reason is that it's a victim of its own success. It works so well and fits perfectly with quantum mechanics that it doesn't actually say much that's new. So physicists get along fine without it. But philosophers of science do occasionally make use of it. (By the way, if you thought it was weird that truth values in Quantum Logic were geometrical, then maybe I'd better not mention that one way of defining a valuation for Intuitionistic Logic is through topology!)

Linear Logic and Non-monotonic Logic

Looks like I still have space to briefly mention some other types of logic. There are still some aspects of logic that have the potential to be varied. For example in every system we've looked at above we are able to take propositions and deduce further propositions from them.As you apply the deduction rules there is an ever increasing set of theorems that you have proved. Consider the propositions "all birds can fly" and "a robin is a bird". From these we may conclude "robins can fly". However, if we also have the proposition "a penguin is a bird" we might conclude that penguins can fly. It looks like the proposition "all birds can fly" needs to be amended. Yet it doesn't seem entirely right to completely throw away that proposition - it appears to have some value. One approach is through default reasoning. By default, in the absence of other information, birds fly. Logics that can handle this type of reasoning are called non-monotonic. The name comes from the fact that if you have more information, eg. penguins don't fly, you can end up being able to prove less, eg. you can no longer use the proposition "all birds can fly" in the default way. Default logic is of interest to practitioners of AI as it gives a way to codify some aspects of 'common sense'.

You can even 'lose' theorems in one form of logic - Linear Logic. Suppose you are describing a vending machine. You put in one dollar and get either a CokeTM or a SpriteTM. If you have one dollar you can get one or the other but not both. So the proposition "I have one dollar" can be converted into either "I have a CokeTM" or "I have a SpriteTM" but once it has you can't do it again unless you have another dollar. Linear Logic is something like this and certain types of deduction rule actually 'consume' their premisses as you use them. Linear Logic can be useful for describing systems with limited resources.

Laws of Form

Phew! We're coming near to the end. So to wind down I'm going to end with a very simple logic from the book Laws of Form by G Spencer-Brown. Now most mathematicians are pretty sure that this is the work of a crackpot, nonetheless the first part introduces a fun idea that people have managed to get mileage out of. To introduce this type of logic you really have to break the most fundamental rule of logic - that sentences be strings of characters.In this type of logic there is only one symbol, a circle. (Actually, Spencer-Brown didn't use circles but this notation is equivalent and is in a similar spirit.) It doesn't have to be a precise geometric circle, it could be a squiggly circle. This is known as a distinction.

As I can't use pictures I'll use ASCII art - the net effect being that I'll be using strings of characters anyway.

Here are some circles: () and o. What can you do with a circle? Well you can put one next to the other like this:

o o.

According to Spencer-Brown the rule is that whenever you see that you can replace it with o.I.e.

o o = o.

The other rule is that if you see one circle directly inside another they cancel.I.e.

(o) = .

That's a circle within a circle on the left. (Use your imagination!)Notice there's nothing at all on the right hand side of that '=' sign! Here are some example derivations:

((o)) = o and (((o)(o)o)o) = ((o)o) = (o) = .

Now you can throw in some unknowns and start proving theorems like thatfor any x and y, ((x))=x and yy = y. It turns out that you can represent propositional calculus using these symbols.Have fun doodling!

Final NoteI've had to make many sacrifices in order to make this article of a reasonable length. I've had to be a little vague or sloppy in places. So please use the references and links if you are interested in anything I say. Unfortunately I've probably committed the extra sin, in places, of being incorrect too.

4 comments:

Nice post! There's a typo where you talk about Bush's possibilities and say "not not" instead of "but not" (I think). And your aside about completeness seems to conflate the notion Godel showed didn't hold in the Incompleteness Theorems with the one he showed did hold in the Completeness Theorem. (ie, not every true statement is provable, but every consequence of any premises can in fact be derived from those premises - it's just that no set of premises is enough to get all the truths of arithmetic).

Ah yes...when I wrote that article I was a bit confused about 'completeness' and was trying hard to figure out how the two different notions were the same thing. Later I realised they aren't. Very confusing until someone explicitly points it out - as the books I was looking at failed to do.