Sam Hardwick's web journal

Tag: mathematics

My friend Vadim Kulikov defended his PhD thesis in mathematics this past Saturday. I hadn’t written anything down, but I’d had some thoughts circling around my head as to what remark I could make at the post-party (quite a big deal in Finnish academia). I did speak, and although it went on for longer than is ideal (even though I didn’t say everything I might have), it was so well received that I decided to give as good an account of it here as my recollection permits. I have made some improvements and additions, but I might also have forgotten something, so remind me if you were there and noticed an omission.

I was inspired to say something by the last heading in the Introduction in Vadim’s thesis, which all math students here should definitely read. It’s very motivational, describing the sense of fulfillment at not only having achieved something worth achieving, but also at having gained a truly deep understanding of something, in this case certain mathematical objects and ideas.

I got to thinking about what it is that has made Vadim progress faster and achieve more than most of his peers. Some things are obvious and indispensable: natural talent, a strong ability to work and concentrate, a deep love of mathematics and understanding, and some luck in having a suitable advisor. I believe these are sufficient to make a great student of mathematics, but something more is required to make a mathematician.

There is a concept in Zen Buddhism, shoshin, meaning “beginner’s mind”. The saying is that “the beginner sees many things, the expert sees few things”. The beginner’s mind is empty and without preconceptions, so when the beginner encounters a thing, his mind is not filled with the few things he has been taught to think about it, but with the totality of it.

For example, when appreciating a painting, the beginner sees a mass of brush strokes, a form, he might understand what the picture depicts – all in all, he is unguided and confused. As he gains knowledge, he starts to become an expert. He might identify the style of the painting and even the painter’s identity. He understands the use of various elements in the painting to signify ideas. He might know about the historical period in which the painting was created, and place the elements in that context. In brief, he gains the ability to see a painting, not be confused, have 5-10 thoughts about it and move on to the next one.

But at the highest levels of understanding, mastery, the Zen way is to have the beginner’s mind. The Zen master sees the mass of brush strokes. His mind is primed with every level of expertise, but it doesn’t force fixed ideas about what the painting is. It is full, yet empty.

There is actually some recognition of this in mathematics education at the university level. In many cases there is a simple way to solve an exercise if the student is aware of some higher-level theorem, without doing all the “boring” technical work you must do if you don’t know about the theorem. If a student presents such a solution, the lecturer will usually say “It’s nice that you know about this, but it would be better for you to do the problem without using this theorem, because it’s important to get to understand the internal details of these things.”

Vadim has a measure of this characteristic naturally, and I believe it is very valuable in doing creative work. One example of this is from when Vadim really had the beginner’s mind because he actually was a beginner. Vadim had come to our lukio (which the English might call a “sixth form college” and Americans might call “high school”) as a first year student, and I was a second year student. Vadim had already gotten enthusiastic about mathematics, but at his previous school there hadn’t been many other pupils with that interest, so he was happy to find a number of such people at our school. He was very eager to find problems to solve, so I told him to try to prove something during his next class; that every even number greater than two can be expressed as the sum of two prime numbers.

Well, as the mathematicians here have noticed, this is a famous open problem called Goldbach’s conjecture, so giving it to Vadim to solve was really just a practical joke on my part. “Someone’s too enthusiastic, let’s try to blunt his enthusiasm a bit.” Anyway, after the next class I asked Vadim how he’d gotten on and he said “I think I’ve almost solved it – I got the other direction, that when you add two primes you always get an even number.” I asked him, “What about 2 + 3?” “Oh, I forgot about that!”

When I revealed that the task was hopeless from the start, Vadim was not actually angry at me, or even all that deflated. To a beginner, all problems are open problems. Vadim even continued to think about the problem for some time, attacking it with whatever methods he knew about at that point.

So with the beginner’s mind there comes a certain fearlessness about open problems and unknown things. Vadim kept this more or less intact during his studies. In mathematics it’s important to have the “complete simultaneous understanding” of the (Zen) master and the open, fearless mind of the beginner, because you have to be able to transplant ideas from one part of mathematics to another part, understanding the internal details well enough to know what needs to be changed and what doesn’t. If you “see many things” like a beginner, you can have surprising ideas.

However, once he had begun work as a graduate student with Tapani [Hyttinen] and Sy Friedman, Vadim began to tell me about certain frustrations he was experiencing. Working with experienced mathematicians in their domain of expertise, it would always be they who had significant flashes of insight. Vadim could redeem himself by working out technical details, but time and again it felt like open problems could only be solved by these “oracles”. Instead of many things he was beginning to see only one thing, “this is an open problem so I can’t solve it”. In this way, the joy of mathematics and the creative spirit of the beginner’s mind was beginning to suffer.

So it was a great relief to Vadim (and a great pleasure for me to hear) when he had a major breakthrough completely on his own, and produced an idea that resolved an open problem. The angst of the open problem was swept away, and he could once again look at things with fresh eyes. So I’m happy he has passed through expertise to mastery of this field, retained shoshin, and hope that he’s able to keep it in other parts of life as well, leading to a fruitful life of many creative discoveries.

Inspired by tjic, a somewhat strange bedfellow, I’ve uncharacteristically decided to set myself a goal/resolution for 2011. I’ve set my sights a good deal lower than that fellow’s overcome-everything mindset (make money, build things, lose weight, learn instrument), but I’ll nevertheless be happy if I succeed. Heck, knowing me, I’ll most likely be happy even if I don’t! The goal is to make and publicly display 6 pictures a week for all of 2011. “Display” means “post here”, obviously. Turn off your RSS readers, gentlemen! I will probably do a post a week with the whole bunch. I hope to draw considerably more than 6 a week, but that’s what I aim to dare show and be bothered to scan and upload.

6 is probably more than I’ve done in a whole year most years. At the age of around 14-15 I had a period of sketching a fair bit, and again around 16-17 (I’m 25 now), so I’m very unpracticed and shoddy. In the latter part of 2010 I did some quick watercolours. But I do like drawing, a lot, and I hope this will improve my skill and make it even more fun. Perhaps more importantly, it would also be nice to get a habit of openness and willingness to reveal weaknesses in order to improve them.

My greatest weaknesses in drawing currently are probably in shading, perspective (I can’t think spatially for shit) and patience. I’ll probably start out by not attempting those, and attack them with more practice later.

Before I move on to show-and-tell, an anecdote about spatial thinking. This Thursday I was thinking about this Project Euler problem (by the way, skip this paragraph if you don’t want to be spoiled vis-a-vis the problem). I wanted to figure out a way to determine the shortest path from one corner of a cuboid to the direct opposite, using only the sides (if you prefer: in a room whose ceiling, floor and walls are rectangles, the shortest path for an ant to crawl from a corner on the floor to the opposite corner on the ceiling). My first idea was to find a suitable variable and differentiate: have the ant travel along the floor to some point x on the edge between the floor and an opposite wall, and then along the wall directly to the corner. x now determines the length of the path, so differentiate and find the zero of that function, and there you have it. Of course there are other ways to choose the path as well, so for each you have to derive the appropriate functions and do the same thing, and find the minimum from all those possibilities. The manipulations become rather hairy and fiddly… In a bar that evening I mentioned this to someone who almost immediately suggested that I could just fold the cuboid open and draw the paths as direct lines. Like this:

The blue cross is the starting point, the red crosses are points that map to the target corner, the three green paths are the possible minimal ones (except that with this diagram you can immediately see that it's never best to go across to a corner and straight up because it isn't a straight line).

It’s all right triangles, easy peasy. This was so simple it took me a while to accept that this was really correct! This is what I mean by terrible spatial thinking ability.
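In code the unfolding trick is a one-liner per unfolding – here’s a quick Python sketch (the function name is mine, and this is just the idea, not a full Project Euler solution):

```python
import math

def surface_shortest_path(a, b, c):
    """Shortest path along the surface between opposite corners of an
    a*b*c cuboid: unfold two adjacent faces flat, take the straight
    line, and try each of the three possible unfoldings."""
    return min(
        math.sqrt((a + b) ** 2 + c ** 2),
        math.sqrt((a + c) ** 2 + b ** 2),
        math.sqrt((b + c) ** 2 + a ** 2),
    )
```

For a unit cube every unfolding gives √5 ≈ 2.236, noticeably shorter than the “across the floor diagonal, then straight up the wall” path of length √2 + 1 ≈ 2.414.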

Anyway. To give you some idea of where I am, here’s a bunch of recent sketches.

An impression of a table with a seated figure. The blue overhanging thing is a lamp.

A quick bar sketch

Another quick bar sketch. The mess on the forehead is a failed attempt at an eye. On the chin is not a beard, but another false start.

What prolonged drinking will do:

An impression of six people at a table (from my perspective, I was sitting at the head of the table - squint hard, you'll see it eventually)

My first attempt at a transvestite - not sure what I was thinking here

For Finnish readers, a bonus: a selection of rather puzzling children’s stories with illustrations I wrote when I was 14 or so.

These books I recently bought together are curiously interconnected, to the degree that their introductions and various asides seem to be having a conversation (I’ve been popping in and out of each one, finishing none of them). The introduction to Concrete Mathematics sets the stage rather well:

It was a dark and stormy decade when Concrete Mathematics was born. Long-held values were constantly being questioned during those turbulent years; college campuses were hotbeds of controversy. The college curriculum itself was challenged, and mathematics did not escape scrutiny. John Hammersley had just written a thought-provoking article “On the enfeeblement of mathematical skills by ‘Modern Mathematics’ and by similar soft intellectual trash in schools and universities”; other worried mathematicians even asked, “Can mathematics be saved?” One of the present authors had embarked on a series of books called The Art of Computer Programming, and in writing the first volume he had found that there were mathematical tools missing from his repertoire; the mathematics he needed for a thorough, well-grounded understanding of computer programs was quite different from what he’d learned as a mathematics major in college. So he introduced a new course, teaching what he wished somebody had taught him.

The course title “Concrete Mathematics” was originally intended as an antidote to “Abstract Mathematics”, since concrete classical results were rapidly being swept out of the modern mathematical curriculum by a new wave of abstract ideas popularly called the “New Math.” Abstract mathematics is a wonderful subject, and there’s nothing wrong with it: It’s beautiful, general, and useful. But its adherents had become deluded that the rest of mathematics was inferior and no longer worthy of attention. […]

But what exactly is Concrete Mathematics? It is a blend of CONtinuous and disCRETE mathematics. More concretely, it is the controlled manipulation of mathematical formulas, using a collection of techniques for solving problems. Once you, the reader, have learned the material in this book, all you will need is a cool head, a large sheet of paper, and fairly decent handwriting in order to evaluate horrendous-looking sums, to solve complex recurrence relations, and to discover subtle patterns in data. You will be so fluent in algebraic techniques that you will often find it easier to obtain exact results than to settle for approximate answers that are valid only in a limiting sense.

Some deal! By the way, I believe the book more or less delivers on this promise, at the price of a ferocious amount of work and application on the part of the reader. There is an extremely large selection of problems, the solutions to all of which are given in an appendix (except the most difficult ones which were open questions in mathematics at the time of publication). I think anyone who worked through all of them, blood pouring from the forehead, would stand out with their sheer manipulative muscle.

The two works on probability are very much opposites in just such a hotbed of controversy: the correct mathematical formulation of the practical concepts of probability and confidence. Probability Theory is especially pugnacious, weighing in on this and numerous other matters. Its author, E. T. Jaynes, intended it as a grand synthesis of his vision of Bayesian, or inferential, probability, but sadly died before the book was finished. It was edited into publishable form by Larry Bretthorst, according to whom many sections of the manuscript concluded with “MUCH MORE COMING.” Jaynes’ death led not only to the incompleteness of the work, but also to a certain harshness in the various off-topic asides, which a living author might have been persuaded to tone down. On the topic of mathematical courtesy:

Nowadays, if you introduce a variable x without repeating the incantation that it is in some set or ‘space’ X, you are accused of dealing with an undefined problem. If you differentiate a function f(x) without first having stated that it is differentiable, you are accused of lack of rigor. If you note that your function f(x) has some special property natural to the application, you are accused of lack of generality. In other words, every statement you make will receive the discourteous interpretation.

We could convert many 19th century mathematical works to 20th century standards by making a rubber stamp containing this Proclamation, with perhaps another sentence using the terms ‘sigma-algebra, Borel field, Radon-Nikodym derivative’, and stamping it on the first page.

Modern writers could shorten their works substantially, with improved readability and no decrease in content, by including such a Proclamation in the copyright message, and writing thereafter in 19th century style.

Other contrarian topics include “The Hausdorff sphere paradox and mathematical diseases”, “Counting infinite sets?”, “Bogus nondifferentiable functions” and “What is a legitimate mathematical function?” A less reverent editor would definitely have omitted these, but although they don’t really add anything to the subject matter of the book, they are a lot of fun and I don’t mind hearing Jaynes’ opinion on them. I want to quote just one more of these tangents, on the subject of probability in quantum physics:

Those who cling to a belief in the existence of ‘physical probabilities’ may react to the above arguments by pointing to quantum theory, in which physical probabilities appear to express the most fundamental laws of physics. Therefore let us explain why this is another case of circular reasoning. We need to understand that present quantum theory uses entirely different standards of logic than does the rest of science.

In biology or medicine, if we note that an effect E (for example, muscle contraction, phototropism, digestion of protein) does not occur unless a condition C (nerve impulse, light, pepsin) is present, it seems natural to infer that C is a necessary causative agent for E. Most of what is known in all fields of science has resulted from following up this kind of reasoning. But suppose that condition C does not always lead to effect E; what further inferences should a scientist draw? At this point, the reasoning formats of biology and quantum theory diverge sharply.

In the biological sciences, one takes it for granted that in addition to C there must be some other causative factor F, not yet identified. One searches for it, tracking down the assumed cause by a process of elimination of possibilities that is sometimes extremely tedious. But persistence pays off; over and over again, medically important and intellectually impressive success has been achieved, the conjectured unknown causative factor being finally identified as a definite chemical compound. […]

In quantum theory, one does not reason in this way. Consider, for example, the photo-electric effect (we shine light on a metal surface and find that electrons are ejected from it). The experimental fact is that the electrons do not appear unless light is present. So light must be a causative factor. But light does not always produce ejected electrons; even though the light from a unimode laser is present with absolutely steady amplitude, the electrons appear only at particular times that are not determined by any known parameters of the light. Why then do we not draw the obvious inference, that in addition to the light there must be a second causative factor, still unidentified, and the physicist’s job is to search for it?

In short, Probability Theory, in addition to being a strong and demanding exposition of Bayesian probability, is a font of unconventional, direct thinking. Principles of Statistics stands in contrast, being an absolutely straightforward textbook on the time-tested methods of frequentist probability. Despite this, its introduction happens to present an opinion on the philosophy of probability, dismissing efforts such as Jaynes’ work:

[The introduction first introduces frequentist probability and then various approaches to inductive probability, all stemming from the “principle of indifference” and finding problems in each one]

It has been reluctantly concluded by most statisticians that inductive probability cannot in general be measured and, therefore, cannot be used in the mathematical theory of statistics. This conclusion is not, perhaps, very surprising since there seems to be no reason why rational degrees of belief should be measurable any more than, say, degrees of beauty. Some paintings are very beautiful, some are quite beautiful and some are ugly; but it would be absurd to try to construct a numerical scale of beauty on which the Mona Lisa had a beauty-value of 0.96! Similarly some propositions are highly probable, some are quite probable and some are improbable; but it does not seem possible to construct a numerical scale of such (inductive) probabilities.

Here’s Probability Theory on the same subject:

For many years, there has been controversy over ‘frequentist’ versus ‘Bayesian’ methods of inference, in which the writer has been an outspoken partisan on the Bayesian side. […] In these old works there was a strong tendency, on both sides, to argue on the level of philosophy or ideology. We can now hold ourselves somewhat aloof from this, because, thanks to recent work, there is no longer any need to appeal to such arguments. We are now in possession of proven theorems and masses of worked-out numerical examples. As a result, the superiority of Bayesian methods is now a thoroughly demonstrated fact in a hundred different areas.

If books could fight… Of course, as far as I’m aware, statistical/probabilistic methods as taught at universities are, at least on a low level, still purely frequentist.

The combination of these very practical, mathematically demanding, hard-nosed but nevertheless somehow philosophical, personal and opinionated books is very intriguing – I hope I’m able to put enough work into them to extract for my benefit at least some of the immense work that has gone into them.

I had to bang my head against the keyboard for a long time to get through problem #76 at Project Euler. It’s a deceptively “easy” problem: In how many ways can you write 100 as the sum of at least two positive integers, disregarding different ways to order the summands? (Writing integers as a sum of positive integers is known as partitioning.)

100 is such a small number that I figured even a stupid solution would work, but my first naive method of overgenerating sums and throwing them all into a set was too slow (it was OK up to partitioning about 15, but 20 already took over a minute). I was sure early on that some sort of recursive solution would work, but I must have had a stupid day because it took me ages to hit on a suitable idea (recurse by splitting the greatest summand into two parts in every possible way until you get to a sum of ones). The forum for discussing the solution was quite interesting; many people had thought of cleverer dynamic-programming algorithms, finding the solution in a couple of hundredths of a second. But it turns out that there’s a lot of real number theory behind this as well. There’s a good article on MathWorld. I think the “simplest” formula people used to get solutions in something like a millisecond was this recurrence relation:

which apparently comes from a generating function using a q-series, discovered by Euler. I don’t really know what that means.
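For what it’s worth, here is that recurrence as Python – my transcription of Euler’s pentagonal-number recurrence, so treat the details as my reading of the MathWorld article rather than gospel:

```python
def partition_counts(n):
    """p(0)..p(n) via Euler's recurrence
    p(m) = sum_{k>=1} (-1)^(k+1) * (p(m - k(3k-1)/2) + p(m - k(3k+1)/2)),
    where the generalized pentagonal numbers k(3k-1)/2 and k(3k+1)/2
    index the earlier values that contribute."""
    p = [1] + [0] * n
    for m in range(1, n + 1):
        total, k = 0, 1
        while True:
            g1 = k * (3 * k - 1) // 2
            g2 = k * (3 * k + 1) // 2
            if g1 > m:
                break
            sign = 1 if k % 2 else -1
            total += sign * p[m - g1]
            if g2 <= m:
                total += sign * p[m - g2]
            k += 1
        p[m] = total
    return p
```

p(100) = 190569292; the problem asks for sums of at least two summands, so the answer is one less.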

I bought two books on my recent trip to New York: one about how large programming projects always fail (which I suggestively lent to my supervisor at work) and a collection of problems from mathematics olympiads in the Soviet Union. The latter included as commentary to one of the problems something a little surprising. First, the result:

|1/p_1 + 1/p_2 + … + 1/p_n − ln(ln(n))| < 14

where n is a natural number greater than 1 and p_n is the nth prime. In other words, the sum of the reciprocals of the first n primes is approximated by ln(ln(n)) (natural logarithm) absolutely well, so that the difference never exceeds a constant (which is at most 14).

(The original problem was to prove that the sum of the reciprocals of the primes becomes arbitrarily large).
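Out of curiosity I did a quick numerical check (the first few thousand primes prove nothing, of course, but the gap does look comfortably bounded):

```python
import math

def first_primes(count):
    """The first `count` primes by trial division -- fine at this scale."""
    primes = []
    candidate = 2
    while len(primes) < count:
        if all(candidate % p for p in primes if p * p <= candidate):
            primes.append(candidate)
        candidate += 1
    return primes

n = 5000
reciprocal_sum = sum(1.0 / p for p in first_primes(n))
gap = abs(reciprocal_sum - math.log(math.log(n)))
# the claim: this gap stays below a constant (at most 14) for every n > 1
print(round(reciprocal_sum, 4), round(gap, 4))
```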

The distribution of primes among the natural numbers has interested mathematicians pretty much as long as mathematics has existed. One important result along these lines is the prime number theorem, which states that the number of primes below n is approximately n/ln(n) in the sense that the relative error approaches zero (or their ratio approaches 1) as n increases. The absolute error (the difference, which is what the surprising fact above is about), however, is unbounded – it actually grows on the order of n/ln(n)². The theorem is not at all easy to prove; in fact it was only in 1896 that Hadamard and de la Vallée Poussin, using non-elementary methods, succeeded (this after Gauss had taken a swing at it).
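The relative-vs-absolute distinction is easy to watch with a throwaway sieve (nothing deep, just illustrating the statement above):

```python
import math

def prime_count(n):
    """pi(n), the number of primes <= n, via a sieve of Eratosthenes."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, math.isqrt(n) + 1):
        if sieve[i]:
            # strike out every multiple of i from i*i onward
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return sum(sieve)

for n in (10 ** 3, 10 ** 4, 10 ** 5, 10 ** 6):
    ratio = prime_count(n) / (n / math.log(n))
    print(n, prime_count(n), round(ratio, 3))
```

The printed ratio creeps toward 1 while the plain difference keeps growing.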

I once attended a course named after this theorem, and the general feeling I got from it was that the distribution of primes is highly intractable and it’s difficult to prove even quite general things about it. Why should it then be that we can say something so precise about the sum of the reciprocals? The fact that this result was new to me implies that there’s some obvious reason why something like this is easier and less important.

In our paper, Reyniers and I propose a second possible explanation […]
1. Assortative mating of tall men and beautiful women. […]
2. Assortative mating of intelligent men and beautiful women. […]
3. Extrinsic correlation between height and physical attractiveness (produced by Mechanism 1 above) and extrinsic correlation between intelligence and physical attractiveness (produced by Mechanism 2 above) will create a second-order extrinsic correlation between height and intelligence.

We believe that this may be why taller people are more intelligent than shorter people. Another factor contributing to the seeming male advantage in intelligence is that taller parents are more likely to have sons than shorter parents. So, over many generations, more sons will inherit their parents’ genes inclining them to be taller and more intelligent, and more daughters will inherit their parents’ genes inclining them to be shorter and less intelligent. But, once again, the crucial factor is height, not sex.

In our paper, we present evidence for all of the crucial mechanisms: Taller people are on average physically more attractive than shorter people; physically more attractive people are on average more intelligent than physically less attractive people; taller people are on average more intelligent than shorter people; and taller parents are more likely to have sons than shorter parents.

I have no idea whether this particular hypothesis will turn out to be correct, but in general I suppose there must be numerous human selection mechanisms of this kind waiting to be discovered. I expect they will explain some surprising things, confirm some unpopular but well-known truths and raise much ire. As danimal, that steely fist of Internet logic, put it:

Theodosius Dobzhansky famously wrote: “Nothing in Biology Makes Sense Except in the Light of Evolution.” This includes relationships between men and women.

It surprises me that the obvious extensions of this haven’t been researched much, at least as far as I know.

If you’re into science, you’ve probably heard people complaining about scientific information and contemporary results being too proprietary and hard to discover, especially considering they’re mostly funded by the public. The Science Commons is looking to change that; let’s hope it takes off.

For math geeks: a series of Project Euler puzzles revealed something shocking to me:
– once you’ve solved problem 64 and have a good way of handling continued fractions (it’s not feasible to do this with floating-point approximations),
– and gone on to 65 and learned about convergents,
– you run into 66, which seems completely unrelated and is again too slow for naive brute-force methods; so after banging your head for a while you google quadratic Diophantine equations and find out that the way to solve these equations is to find the right convergents for the square roots – and this is guaranteed to find the minimal solution for every equation of this type! Check it out.
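The machinery ends up looking something like this (a sketch with my own function name; x² − D·y² = 1 is the Pell equation, and the standard algorithm walks down the continued-fraction convergents of √D until one of them solves it):

```python
import math

def pell_minimal(D):
    """Minimal positive (x, y) with x^2 - D*y^2 = 1, found by expanding
    sqrt(D) as a continued fraction and testing successive convergents."""
    a0 = math.isqrt(D)
    if a0 * a0 == D:
        raise ValueError("D must not be a perfect square")
    m, d, a = 0, 1, a0          # continued-fraction state for sqrt(D)
    h_prev, h = 1, a0           # convergent numerators
    k_prev, k = 0, 1            # convergent denominators
    while h * h - D * k * k != 1:
        m = d * a - m
        d = (D - m * m) // d
        a = (a0 + m) // d
        h_prev, h = h, a * h + h_prev
        k_prev, k = k, a * k + k_prev
    return h, k
```

pell_minimal(61) is the classic nasty case: x = 1766319049, y = 226153980, from a rather deep convergent – exactly why floating point was hopeless.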

There’s a program called Microsoft Songsmith that’s supposed to allow users to sing over a backing track that follows their singing. It doesn’t really work. Except for hilarity:

A conversation I had yesterday evening revealed something about computability and mathematics that hadn’t occurred to me. First, some preliminaries.

Consider a computer running a program written in a language with a finite number of symbols. The computer proceeds step by step through a string of instructions, some of which may tell it to execute commands from another location in the string of instructions. Some programs like this will run forever (for instance a program that simply enters an infinite loop) and others will do some finite number of things and then stop executing. Call these two groups of programs “non-halting” and “halting”, respectively. Defining these things suitably, it is possible to prove that if a program is halting, its length determines an upper bound on the number of steps it takes. If the length of the program is n, call the maximum number of steps f(n).

Now consider a mathematical statement about the natural numbers, the unproved Goldbach’s conjecture for instance. The conjecture says that every even integer greater than 2 can be written as the sum of two primes. 4 = 2+2, 6 = 3+3, 8 = 3+5, 10 = 5+5, 60 = 7+53 etc. This is a claim about every even number there is, so presumably there isn’t any way to just do calculations to prove the conjecture: we have to prove it for all of the infinitely many even numbers at once.

Now consider a computer program that goes through even numbers, checking whether each can be written as the sum of two primes. This program is either halting or non-halting. If it is halting, Goldbach’s conjecture is false (because the program would only halt if it finds a counterexample). If Goldbach’s conjecture is true, the program is non-halting – but we can’t exactly wait forever to see that it never halts. But because we know that a program of length n will either halt by step f(n) or never, we only need to wait until the machine has taken f(n) steps. This means that it is sufficient to do finitely many calculations to find out whether Goldbach’s conjecture is true.
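The program in question is only a few lines (with an artificial cut-off added here so that this sketch, unlike the idealized program, is guaranteed to stop):

```python
def is_prime(n):
    """Trial division; slow but obviously correct."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

def goldbach_witness(n):
    """Return a prime p with both p and n - p prime, or None if the
    even number n is a counterexample to Goldbach's conjecture."""
    for p in range(2, n // 2 + 1):
        if is_prime(p) and is_prime(n - p):
            return p
    return None

def search_counterexample(limit):
    """The idealized program halts iff a counterexample exists;
    the limit stands in for running forever."""
    for n in range(4, limit + 1, 2):
        if goldbach_witness(n) is None:
            return n
    return None
```

search_counterexample(2000) returns None, as expected.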

Unless I or the guy who explained this to me missed something; I haven’t actually read about this in a book or anything.

And of course this doesn’t make for a feasible way to solve the problem: f(n) is essentially the busy beaver function, which not only grows faster than any computable function but is itself uncomputable, so in practice we have no way of knowing when f(n) steps have passed.

The programming problems on Project Euler are almost always fun to do, but the mathematics behind the (early) ones has mostly been unsurprising. This, though, was something I hadn’t heard about: if you arrange the natural numbers in a counterclockwise square spiral and look at the diagonals like this

you notice that
1) the odd squares are on the lower right diagonal (easily proved and not so shocking)
2) 8 of the 13 numbers (~62%) on the diagonals in this 7×7 square are primes.

If you keep extending the spiral you find that primes end up on the diagonals surprisingly often. The programming task was to find what size the spiral has to be for the proportion of primes on the diagonals to be less than 10%. The surprisingly large answer: 26241×26241, by which point the numbers in question are obviously already quite big.
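Here’s the counting sketch I ended up with (corner values only: each ring of the spiral contributes 4 diagonal entries, spaced by the ring’s side length minus one):

```python
def is_prime(n):
    """Trial division is plenty for a demonstration this size."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

def diagonal_counts(side):
    """(primes, total) among the diagonal entries of a side-by-side
    number spiral; side must be odd. The centre, 1, is not prime."""
    primes, total, n = 0, 1, 1
    for ring in range(1, (side - 1) // 2 + 1):
        step = 2 * ring              # gap between corners on this ring
        for _ in range(4):           # the four corner values
            n += step
            if is_prime(n):
                primes += 1
        total += 4
    return primes, total
```

diagonal_counts(7) gives (8, 13), the ~62% above; push the side upward and the ratio drifts down, crossing 10% only at 26241.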

And of course if one discounted the lower right diagonal there’d be way more primes yet.