Friday, November 24, 2006

For my birthday a few weeks ago (40!) I received an 8" dobsonian telescope from my wife and her family. Of course the weather here in Northern California suddenly took a turn for the worse immediately after I brought it home, but there have still been a few clear nights.

One of the frustrating things about astronomy is that what you can see with an amateur telescope from your back yard in a well lit city is quite different from what you can see from the wilderness with a telescope the size of a truck. Unfortunately, even a great book like The Backyard Astronomer's Guide is dominated by the latter type of picture (like that fantasic picture of the Andromeda galaxy on its cover). So I thought I'd tell it like it really is and give you an idea of what you can see from my location.

How Telescopes Work

Before that, I want to mention something about telescopes. My telescope is a Newtonian reflector and if you look at the pictures on Wikipedia you'll notice that the secondary mirror forms an obstruction that blocks some of the light entering the telescope. A frequently asked question is "why can't you see the obstruction through the scope"? In fact, if you put your hand in front of the aperture the effect on what you see, if the image is in focus, is negligible, just a little darkening. The explanation is very simple, but nobody seems to phrase it the same as me.

The idea is this: a lens, that's in focus, is a device that converts direction into position. Suppose the telescope is oriented along the x-axis. Think of a ray of light as having the equation y=mx+c. Think of m as the ray direction and c as its position. If that ray passes through a lens or mirror and is projected onto a screen, denote the position at which it arrives on the screen by f(m,c). It's clearly a function of m and c. The crucial point is that for a screen at precisely the focal length away from the lens or mirror, f(m,c) is independent of c. That's the raison d'être of a telescope. All rays with gradient m arrive at the same point. So we end up with a bright image because we can collect lots of rays coming from the same direction. But also, any information about the ray's position, ie. c, has been erased. Information about the shape of the obstruction is positional information contained in c. This has been erased, and hence you can't see the obstruction. In practice you really need to consider primary lens (or mirror), plus eyepiece, plus lens of the eye projecting onto the retina, but the principle is the same. For the more formally inclined, the erasure of c information corresponds to a zero in a transfer matrix.

In brief: a lens or mirror focussed on infinity is a position eraser.

And What You Can See Through Them

Andromeda Galaxy

My first two subjects were probably the same as every other amateur astronomer. I started with the Andromeda galaxy. Through binoculars it looks like a faint fuzzy blob. And here's the truth of how it looks through a telescope: it looks like a fuzzy blob, at least from a well lit city. I couldn't make out any kind of structure at all, let alone spiral arms. It didn't even look elliptical, just a circular blob with brightness falling off radially from the centre. I also saw one of its neighbours, M32 or M110. At first I thought that the fuzzy blob was the galaxy and that I needed more magnification to see detail. But now I think that I was seeing just the core of the Andromeda galaxy with the arms filling a wider area but remaining invisible because they're no brighter that the sky in my part of the world.

Orion Nebula

The Orion nebula, on the other hand, was stunning! In outline, it looked remarkably like the picture at wikipedia, but without the colour. When I switched to using a filter (a narrow bandpass Lumicon UHC filter) it became even clearer. Not only could I clearly see the shape of the nebula but I could also see structure all the way through the cloud. It looked even better after I'd been viewing it for a while, probably as my eyes became better adapted to the dark. This was the best thing I've seen through a telescope ever!

Almach

I'll mention the last thing I looked at: γ Andromedae, otherwise known as Almach. It's a binary star system that can't be resolved with the naked eye. I used a wide field of view lens and was disappointed to see that it still looked like a single star. But then I switched eyepieces and I saw it - a beautiful pair of stars, one brighter and yellowy orange and the other deep blue. You read about the colour of stars but it's often a disappointment. A single star on its own tends to look white with a hint of colour because the eye doesn't register colour well in such low light conditions. But when you see two contrasting stars so close together it makes a world of difference. The deep blue of the smaller star was unmistakable and the blue really was astonishingly blue. That must be quite a sight for the γ Andromedans. But I should add that the blue star is in fact a triple star system in its own right, so γ Andromedae is actually a quadruple star system. I was unable to resolve more stars than two (and you can only tell there are four by inference from spectrography, not direct viewing).

I also tried to find the Crab Nebula. I'm 99% sure that I was pointing at the right part of the sky and I did in fact see a faint smudge just at the limits of my perception. I had a jiggle the telescope around a bit just to be sure I wasn't imagining it, but sure enough, it appeared to be attached to the sky, not the scope or my eyes. But it certainly didn't look like the pictures.

Monday, November 20, 2006

I've run 3 miles, the Thanksgiving turkey's in the oven, now I just need to do something impossible before I can have breakfast.

As I've mentioned in the past, sometimes you can write useful Haskell code merely by writing something that type checks successfully. Often there's only one way to write the code to have the correct type. Going one step further: the Curry-Howard isomorphism says that logical propositions corresponds to types. So here's a way to write code: pick a theorem, find the corresponding type, and find a function of that type.

One area that seems fruitful for this approach is modal logic. The axioms of various modal logics correspond to the types of familiar objects in Haskell. For example the distribution axiom:

So I'm looking at the books on my shelf and there's The Logic of Provability by Boolos. It's about a kind of modal logic called provability logic in which □a roughly means "a is provable". One of the axioms of this logic is a theorem known as Löb's theorem.

Before getting onto Löb's Theorem, I should mention Curry's Paradox. (Before today I didn't know Haskell Curry had a paradox associated with his name - though I've met the paradox itself before as it got me into trouble at (high) school once...) It goes like this:

Let S be the proposition "If S is true, Santa Claus exists".

Suppose

S is true.

Then

If S is true, Santa Claus exists.

So, still assuming our hypothesis, we have

S is true and if S is true, Santa Claus exists.

And hence

Santa Claus exists.

In other words, assuming S is true, it follows that Santa Clausexists. In otherwords, we have proved

If S is true then Santa Claus exists

regardless of the hypothesis.But that's just a restatement of S so we have proved

S is true

and hence that

Santa Claus exists.

Fortunately we can't turn this into a rigorous mathematical proof though we can try, and see where it fails. In order to talk about whether or not a proposition is true we have to use some kind of Gödel numbering scheme to turn propositions into numbers and then if a proposition has number g, we need a function True so that True(g)=1 if g is the Gödel number of something true and 0 otherwise. But because of Tarski's proof of the indefinability of truth, we can't do this (to be honest, the argument above should be enough to convince you of this, unless you believe in Santa). On the other hand, we can replace True with Provable, just like in Gödel's incompleteness theorems, because provability is just a statement about deriving strings from strings using rewrite rules. If we do this, the above argument (after some work) turns into a valid proof - in fact, a proof of Löb's theorem. Informally it says that if it is provable that "P is provable implies P" then P is provable. We did something similar above with P="Santa Claus exists". In other words

□(□P→P)→□P.

So I'm going to take that as my theorem from which I'll derive a type. But what should □ become in Haskell? Let's take the easy option, we'll defer that decision until later and assume as little as possible. Let's represent □ by a type that is a Functor. The defining property of a functor corresponds to the theorem □(a→b)→□a→□b.

> loeb :: Functor a => a (a x -> x) -> a x

So now to actually find an implementation of this.

Suppose a is some kind of container. The argument of loeb is a container of functions. They are in fact functions that act on the return type of loeb. So we have a convenient object for these functions to act on, we feed the return value of loeb back into each of the elements of the argument in turn. Haskell, being a lazy language, doesn't mind that sort of thing. So here's a possible implementation:

> loeb x = fmap (\a -> a (loeb x)) x

Informally you can think of it like this: the parts are all functions of the whole and loeb resolves the circularity. Anyway, when I wrote that, I had no idea what loeb did.

So here's one of the first examples I wrote to find out:

> test1 = [length,\x -> x!!0]

loeb test is [2,2]. We have set the second element to equal the first one and the first one is the length of the list. Even though element 1 depends on element 0 which in turn depends on the size of the entire list containing both of them, this evaluates fine. Note the neat way that the second element refers to something outside of itself, the previous element in the list. To me this suggests the way cells in a spreadsheet refer to other cells. So with that in mind, here is a definition I found on the web. (I'm sorry, I want to credit the author but I can't find the web site again):

With these definitions we can add, multiply and negate Num valuedfunctions. For example:

> f x = x*x> g x = 2*x+1

> test2 = (f+g) 3

Armed with that we can define something ambitious like the following:

> test3 = [ (!!5), 3, (!!0)+(!!1), (!!2)*2, sum . take 3, 17]

Think of it as a spreadsheet with (!!n) being a reference to cell number n. Note the way it has forward and backward references. And what kind of spreadsheet would it be without the sum function? To evaluate the spreadsheet we just use loeb test. So loeb is the spreadsheet evaluation function.

Now don't get the wrong idea. I'm not claiming that there's a deep connection betwen Löb's theorem and spreadsheet evaluation (though there is a vague conceptual similarity as both rely on a notion of borrowing against the future). Provability logic (as defined by Boolos) is classical, not intuitionistic. But I definitely found it interesting the way I was led directly to this function by some abstract nonsense.

Anyway, happy Thanksgiving!

PS There are other uses for loeb too. Check out this implementation of factorial which shows loeb to be usable as a curious monadic variant of the Y combinator:

Saturday, November 18, 2006

A while back I mentioned that I recently found out that Heaviside was responsible for a bunch of mathematical techniques I've known since my training for the Cambridge entrance exam. I decided to read more about Heaviside and I've just finished this book on the Victorian mathematical physicist, Oliver Heaviside. There's a bit of information about Heaviside on the web, but I thought I'd mention two highlights from this book that may hint at why he was a genius ahead of his time.

Operational Calculus and Distortionless Transmission

There's an example of Heaviside style operational calculus in the link I posted to above. One of the reasons I became interested in this subject again is that I was getting into electronics and I wanted to simplify computations of properties of simple linear circuits. I had this crazy idea that capacitors and inductors could be treated as resistors whose resistance is differential operator valued. Turns out that this wasn't an original move. This is exactly what Heaviside did well over 100 years ago and it was the secret weapon he used for much of his work. He could solve a wide array of ordinary and partial differential equations with ease. Very briefly, his idea was to write the differential operator d/dx as the symbol p and then treat p much like a conventional algebraic variable. He turned differential equations into ordinary algebraic equations.

A great example of this was when he studied the electrical signal that emerges from a long cable as a function of what was sent into the other end. If W is the outgoing signal, and V is the incoming signal, he showed that in his model, W = √(A+Bp)/√(C+Dp) V, for some constants A, B, C and D, that depend on the properties of the cable. At first sight this is meaningless - what is the meaning of the square root of a differential operator? Heaviside had ways to deal with these things, but that's not what he did here. He noticed that if he picked A, B, C and D such that A/B=C/D then he could cancel the p from top and bottom. The net effect was that if this condition held, the signal emerging was the same as the signal entering (apart from a time delay). In physical terms this meant that adding inductance to a long cable would allow it to carry the signal without distorting it. His contemporaries had been declaring long-distance telegraphy impossible because inductance would distort the signal, but here was Heaviside suggesting that inductors be added. The British Post Office ignored Heaviside's claims and it was left to a physicist in the US to put his ideas into practice - ideas that today formed the backbone for the nascent global telecommunications industry. Heaviside couldn't even get much of his work published because mathematicians like Burnside (boo! hiss!) rejected it as unrigorous. Needless to say, Heaviside died a bitter neglected old man...

Foreshadowings of Special Relativity

I'm fascinated by some of the theoretical clues about relativity that were appearing before Einstein. There were obvious results like the Michelson-Morley experiment and the Lorentz-Fitzgerald contraction proposed to explain it. But there were clues in other places too. HG Wells, in The Time Machine, said "There is no difference between Time and any of the three dimensions of Space except that our consciousness moves along it", so we already have a popularisation of the idea of a symmetry between space and time. Heaviside spent much of his time working with Maxwell's equations (which should really be called Heaviside's equations) which inherently has Lorentz group symmetry. This means that any physical predictions made from Maxwell's equations must also have Lorentz invariance. As nobody had explicitly recognised this as a symmetry of nature at the time, it meant for some unusual seeming results. For example, at the time Heaviside was working, the notion that the electromagnetic field stored energy was becoming popular. Heaviside compared the field of a static charge and a moving charge and noticed that for the same charge, the latter stored more energy. This meant that to accelerate a charge required putting extra energy into it which would go into the field. In other words, a charge should feel like it has more mass than it has. The apparent mass contained a familiar 1/√(1-v²/c²) factor and so he noticed that this mass increase grew as the charge's velocty approached that of light. In particular he noticed that the mass would become infinite at the speed of light, exactly as predicted by Special Relativity. Heaviside was never deterred by anything as trivial as an infinity so he went on to study the properties of superluminal particles and predicted and derived the properties of what should be called Heaviside radiation.

(BTW When Heaviside tried to study the geometry of the field around a moving spherical charge he initially made a few mistakes that were eventually fixed by someone else using Heaviside's own methods correctly. One thing that was noted was that the spherical symmetry was flattened. Yet another hint of Lorentz-Fitzgerald contraction.)

I'd love to also say something about Heaviside's battles with Preece and Tait because they are highly entertaining. But instead, I just recommend reading the book for yourself.

Sunday, November 12, 2006

I've decided that the Yoneda lemma is the hardest trivial thing in mathematics, though I find it's made easier if I think about it in terms of reverse engineering machines. So, suppose you have some mysterious machine. You know it's a pure functional Haskell machine (of course) with no funny stuff (no overlapping or incoherent instances or anything like that [1]).

The machine works like this: for some fixed type A, whenever you give it a function of type A -> B it gives you back an object of type B. You can choose B to be whatever type you like, it always works. Is it possible to reproduce the machine exactly after testing it just a finite number of times? Sounds impossible at first, it seems the machine could do just about anything.

Think about how this machine could work. You can choose B freely, and whatever B you choose, it needs to come up with an object in B. There is no way to do this uniformly in Haskell without doing funny stuff. (I'm ruling undefined to be funny stuff too.) So how could this machine possibly generate a B? There's only one possible way, it must use the function of type A -> B to generate it. So that's how it works. It has an object a of type A and when you hand it an f it returns f a. You should also be able to convince yourself that there's no way it could vary the a depending on what f you give it. (Try writing a function that does!) Having narrowed the machine's principle down, it's now easy to figure out what a the machine is using. Just hand it id and it'll hand you back a. So in one trial you can deduce exactly what the machine does (at least up to functional equivalence).

We can specify this formally. The machine is of type: forall b . (a -> b) -> b. The process of extracting the a from the machine, by giving it the identity, can be described by this function:

> uncheck1 :: (forall b . (a -> b) -> b) -> a> uncheck1 t = t id

Given the output of the uncheck1 function, we can emulate the machine as follows:

> check1 :: a -> (forall b . (a -> b) -> b)> check1 a f = f a

You're probably wondering why the functions are called these names. See footnote [2] for that. I'll leave it to you to prove that check1 and uncheck1 are inverses to each other.

But now there's another machine to consider. This one takes as input a function A -> B and gives you back, not just one B but a whole list full of them. Maybe you're already guessing how it works. If it's generating a bunch of objects of type B then it must surely have a bunch of A's and it must be applying your function f to each one. In other words, the machine's behaviour must be something like this

That has the correct type signature but it doesn't seem to have the same form as machine2. However, with a tiny bit of work we can show it's functionally equivalent to one that does. In fact we can just plug machine2' into uncheck2 and it will give us a list of A's that can be used in machine2. Instead of reverse we could use any function [a] -> [a] and we'd still get a sensible result out of check2. The reason is that if f is of type forall a.[a] -> [a] then f $ map g a equals map g $ f a. (This is a Theorem for Free!.) So we can rewrite machine2' as

which looks just like our machine2. So however we munge up our list to make our machine unlike machine2 we can always 'commute' the munging to the right so it acts on the internal list of A's, converting into a machine like machine2.

One last example:

This time we hand our machine a A -> B and it gives us back another function, but this one is of the type C -> B, for some fixed C. It modifies the 'front end' of the input function so it can take a different argument. How could that possibly work? There's one obvious way: internally the machine is storing a function C -> A and when you hand it your function it returns the composition with the function it's storing.

uncheck3 pulls extracts the internally represented c -> a and check3 makes a functionally equivalent machine out of one.

So...I hope you're seeing the pattern. To make it easier, I'll define some functors:

> data I a = I a> instance Functor I where> fmap f (I a) = I (f a)

> instance Functor ((->) a) where> fmap f = (.) f

Now all three example machines have the same form. For some functor f they map a function A -> B to an object of type f B and we deduce that internally they contain an f A. We can now write out versions of check and uncheck that work for all three machines:

The above examples follow when we consider the functors I, [] and ((->) c) (for various values of c) respectively.

Yoneda's lemma is essentially the statement that check and uncheck are mutual inverses. So if you understand my examples, then you're most of the way towards grokking the lemma.

At this point I should add some details. We're working in the category of Haskell types and functions Hask. Expanding out the category theoretical definition of a natural transformation, t between two functors f and g in Hask gives t . fmap f == fmap g . t. In this category, natural transformations correspond to polymorphic functions between functors with no funny stuff so this equality actually comes for free. (To be honest, I haven't seen a precise statement of this, but it's essentially what Theorems for Free! is about.) Yoneda's lemma actually says that for all functors f there is an isomorphism between the set of natural transformations of the type forall b . (a -> b) -> f b and the set of instances of f a. So now I can give proofs:

So that's it, Yoneda's lemma. It's trivial because the isomorphism is implemented by functions whose implementations are a couple of characters long. But it's hard because it took me ages to figure out what it was even about. I actually started with examples outside of Haskell. But Haskell has this weird property that polymorphic functions, with minor restrictions, are natural transformations. (I think this is the deepest mathematical fact about Haskell I've come across.) And as a result, Hask is an excellent category in which to learn about Yoneda's lemma.

I also recommend What's the Yoneda Lemma all about? by Tom Leinster. His presheaf example is the one at which these ideas started making sense to me - but that's because I've spent a lot of time playing with Čech cohomology on Riemann surfaces, so it might not work for everyone. This comment is also worth some thought. In fact, is the Yoneda lemma itself a Theorem for Free?

I haven't said anything about the deeper meaning of the Yoneda lemma. That might have something to do with the fact that I'm only just getting the hang of it myself...

NB Everything I've said is modulo the equivalence of natural transformations and polymorphic unfunny functions. I may have got this wrong. If so, someone please correct me as I'm sure everything I say here will still hold after some minor edits :-)[1] Consider the following compiled using GHC with -fallow-overlapping-instances -fglasgow-exts:

> class Test a where> f :: a -> a

> instance Test a where> f = id

> instance Test Int where> f x = x+1

f is the identity for everything except for objects of type Int. This is an example of what I call "funny stuff".

[2] The accent on this letter 'č' is called a caron or háček. The book from which I learned about the Yoneda lemma used the caron to indicate the function I call check. I called it that because the TeX command to produce this symbol is \check. This is a multilayered pun, presumably by Knuth. It could just be that 'check' is an anglicised abbreviation for háček. But it's also a characterisically Czech accent so it's probably also an easier (for English speakers) spelling for 'Czech'. And I think it's also a pun on Čech. The caron is used on an H to represent Čech cohomology and so it's also called the 'Čech' accent. (I hope you can read those characters on a PC, I wrote this on a Mac.)

Saturday, November 04, 2006

...monad, of course. I keep telling myself that I'll write a post on topology or history of mathematics or something and then I end up saying something about Haskell monads. It seems to have happened again.

Anyway, it suddenly dawned on me that there's a monad to be wrung out of any recursive data structure. Roughly it goes like this: take a tree-like type, Tree a with some leaves are of type a. Visualise, if you will, an object of type Tree (Tree a), ie. a tree with leaves are themselves trees. You can graft the leaves that are trees directly into the parent tree giving you a Tree a. So grafting gives you a map Tree (Tree a) -> Tree a. That should remind you of part of the definition of a monad - this is the function called join in Haskell, or μ by category theorists. Note, for example, that grafting grandchildren into children before grafting the children into their parents gives the same result as grafting the children before the grandchildren. These, and other similar observations, give the usual monad laws.

Think of V b as a variable whose name is of type b. These are the points at which we'll graft in the subtrees. In this case, grafting will be the well known operation of substituting the value of a variable for the variable.

To apply >>= f you recursively apply it to the children except for V b where you actually get to apply f. So now we can define an 'enviroment' mapping variable names to values and construct an example tree e:

> env "a" = I 1> env "b" = V "a" :+ I 2> env "c" = V "a" :+ V "b"

> e = V "c" :+ I 3

It should be easy to see that e >>= env substitutes the value of c for V "c". But it's more fun to write this as the slightly bizarre looking:

> subst1 env e = do> a <- e> env a

> test1 = subst1 env e

We can read "a <- e" in English as "drill down into e setting a to the name of each variable in turn". So this is a backtracking monad like [] with a taking on multiple values. But notice how it only does one level of substitution. We can do two levels like this

> subst2 env e = do> a <- e> b <- env a> env b

> test2 = subst2 env e

With each line we drill down further into e. But of course we really want to go all the way:

> subst3 env e = do> a <- e> subst3 env (env a)

> test3 = subst3 env e

And so now I can confess about why I wrote this post. subst3 looks like a non-terminating loop with subst3 simply calling itself without ever apparently checking for termination, and yet it terminates with the correct result. Well I found it mildly amusing anyway.

Anyway, this >>= is a generalised fold meaning there's an F-algebra lurking around. Armed with the words F-algebra, monad and tree I was able to start Googling and I found this which points out that the connection between trees and monads is "well known".

Oh well, I've got this far, I may as well finish the job. That paper is about dualising the above construction so I'll implement it. This time, instead of considering trees where each node is optionally replaced by a label, we now have trees where every node has a label as well as a subtree. We call these labels decorations. In this case I think I'll work with trees are hierarchical descriptions of the parts of something (in a vain attempt to pretend this stuff is actually practical :-)

We'll consider parts of a plane. We'll decorate the parts with their price. The function total computes the total price of a tree of parts. cobind now extends total so that every part is (unfortunately, inefficiently) redecorated with its total price including that of its subparts. If you're having trouble visualising comonads, this seems like a nice elementary example to think about.

Thursday, November 02, 2006

I just thought I'd quickly mention that Melvyn Bragg has again chosen mathematics as the subject matter for his Radio 4 programme In Our Time. This time it's on the Poincaré conjecture. Again I'm amazed. Given that this is a series about all aspects of human culture, I think mathematics is actually being well represented. Downloadable as an mp3 from the BBC web site or through iTunes as a podcast.