How much information is there in this article? It's hard to
tell. Its length clearly isn't an indicator because that depends on my
choice of words. One thing you could do to measure information is to
condense the article into bullet
points. If
you end up with many of them, then that's
because the article is pretty dense. If
there's only one, then the information it conveys is rather basic, or at least not very complex.

Measuring complexity

Andrey Kolmogorov (1903 - 1987).

In the 1960s mathematicians developed a way of measuring the complexity of a piece of information that
reflects this intuition. They were looking at information from a
computer's point of view. To a machine, a piece of information is just a string
of symbols: letters, numbers, or, at the most basic level, 0s and
1s. A computer can't make bullet points, but it can be programmed to
output a given string of symbols. If the information in the string can
be condensed quite a lot, then there should be a short program that does this. If it
can't, then the program will be longer.

To see how this works, imagine the string consists of a hundred zeros. A
program that outputs that string only needs to say something like
"print 0; repeat 100 times". Different programming languages will use different commands to do this, but the general idea will be the same. So that's a short program for a pretty
simple string. If, on the other hand, your string contains 0s and 1s without any pattern at all, then there is no
structure to exploit. Your
program would have to say "print:" and then quote the entire string,
making the program slightly longer than the string itself.
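To make this concrete, here is a rough sketch in Python (our choice of language, not one the idea depends on). The first program exploits the pattern; the second simply quotes the string verbatim, and ends up longer than the string it prints.

```python
# A string with an obvious pattern: a hundred zeros.

# Program 1 exploits the pattern -- the analogue of "print 0; repeat 100 times":
program_short = 'print("0" * 100)'

# Program 2 has no structure to exploit: it quotes the entire string.
program_long = 'print("' + "0" * 100 + '")'

print(len(program_short))  # short: the pattern is exploited
print(len(program_long))   # longer than the 100-character string itself
```

Both programs print the same hundred zeros, but the pattern-based one is a fraction of the length of the verbatim one.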

The length of
the shortest computer program that produces a given string and then
halts (we don't want infinite loops) is called the Kolmogorov
complexity of the string, after Andrey Kolmogorov, one of three people who came up with the idea (the other two were Ray Solomonoff and Gregory Chaitin). If you interpret complexity
as lack of structure, then the name makes sense.

"Kolmogorov complexity measures the maximum compressibility of a
string via any computable process whatsoever," says Scott Aaronson, a computer scientist at the Massachusetts Institute of Technology. "Compressibility is something we're all familiar with. There are
all sorts of data compression programs, which play a crucial role on the
internet, for example for streaming videos." The programs
exploit the fact that lots of the information we deal with has some
sort of structure, like the string of 100 zeros above. The English language
obviously contains structure, and so do images and pieces of music
(which to a computer are also strings of 0s and 1s). Just as with
the string of 0s above, you can exploit this structure to compress a file.

Computing complexity...or not

So, what's the Kolmogorov complexity of your favourite song or
your favourite movie?
Before working it out, you
need to fix which programming language you use, since a program could
be shorter in one language than in another. This is a bit like fixing
units of measurement such as inches or centimetres. It's important to
know which units you're using, but once you've agreed on them, you can
stop referring to them all the time. Which is why we won't specify any
particular language here. The only restriction is that you're not allowed to cheat, for example by saving your favourite string under the name "s" and then writing a program that simply says "print s" — we require that the program contains all the instructions on how to compress and decompress information.

Next, you need to find the shortest program that outputs your song
or movie, and that's a problem. "Kolmogorov
completely left open the question of how you actually find the shortest
program to output a given string," says Aaronson. That's bad news, but
things get worse. "You can actually prove
that this is an uncomputable problem." Informally, this means
that there is no general method for computing the Kolmogorov
complexity of any given string. You might be able to compute it for
some strings, but there's no general recipe that works for all.

Bertrand Russell (1872 - 1970).

There's a very elegant proof of this fact, based on a logical
conundrum called Berry's paradox. First notice that I could
use ordinary English words to specify a number. For example, the
sentence "The number of fingers on a generic person's hand" specifies
the number five. Clearly not every number can be specified using a
short sentence, say of at most 20 words. I, for one, can't think of a brief way to specify the number 2,354,899,123,657. Amongst the numbers that can't be specified by a sentence of at most 20 words
be, there must be a smallest one, which I'll call n.

But hang on — I've
just specified n using the sentence

Sentence A: "n is the smallest whole number that cannot
be specified
using at most 20 words",

which itself has fewer than 20
words. Therefore n can and can't be specified using at most
20 words — it's a paradox. Bertrand Russell was the first
to write about this puzzle, but he had learnt it from a junior librarian in Oxford by the name of G. G. Berry.

The way out of the conundrum lies in the ambiguity of
language. Sentence A doesn't give
you any numerical value, it merely gives a name, n, to a number you
believe exists. If giving a name to a number you believe exists counts as specifying
it, then you can specify any number at all,
so there isn't a smallest
one you can't specify. If you require that specifying a number
involves something more concrete, then sentence A doesn't actually specify
n. In either case, the paradox unravels.

There is no such ambiguity in programming languages though, which is
why an analogue of Berry's paradox proves that Kolmogorov complexity is
uncomputable. "Let's suppose there were a [general method for] computing the Kolmogorov complexity of any given string," explains
Aaronson. "Then it turns out that you could use this method to design a very
small computer program that finds a string of
enormous Kolmogorov complexity, much bigger than the length of the
program itself." That's a contradiction. The Kolmogorov
complexity of the output string is the length of the shortest program that
produces it. Since our program does produce the string, the string's
Kolmogorov complexity can't possibly exceed the program's
length: it's got to be the same, or smaller. We've encountered a
contradiction, and this time there's no way out by appealing to
ambiguity. We have no choice but to conclude that our original
assumption, that Kolmogorov complexity is
computable, was false.
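The shape of that contradiction can be sketched as a short program. In the Python sketch below, the function K stands for the hypothetical general method for computing Kolmogorov complexity; no such method exists, so this only illustrates the structure of the argument. (The test stand-in names are ours.)

```python
from itertools import count, product

def find_complex_string(K, bound):
    """Enumerate binary strings in order and return the first one whose
    complexity, as reported by K, exceeds the given bound.

    If K were a genuine, computable Kolmogorov-complexity function, this
    short program (plus the number bound) would itself be a description of
    the string it returns -- so for a large enough bound, the string's true
    complexity would be both greater than bound and at most the length of
    this program: a contradiction."""
    for n in count(1):                    # string lengths 1, 2, 3, ...
        for bits in product("01", repeat=n):
            s = "".join(bits)
            if K(s) > bound:
                return s
```

Since the real K cannot be supplied, one can only run this with a toy stand-in; for instance, using plain string length as a mock "complexity", `find_complex_string(len, 5)` returns the first string longer than five symbols.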

Uncomputable but useful

What's the use of a measure you can't actually
compute? "Kolmogorov complexity is useful because it tells you the
ideal situation you would like to approach with the imperfect tools
you have in practice," says Aaronson. "In situations where the maths
calls for Kolmogorov complexity, you can instead feed your data into a
compression program, say using gzip or saving your image as a .gif
file, and use that as a crude approximation of
Kolmogorov complexity. Kolmogorov complexity would be the theoretical
ideal compression. What you're getting with standard compression
methods are various non-ideal compressions, which use various
regularities in your data, but not all regularities."
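As a rough illustration of this idea, here is a Python sketch using zlib (our stand-in for gzip; the function name is ours) as a crude approximation of Kolmogorov complexity. The structured string from earlier compresses to almost nothing, while a patternless string barely compresses at all.

```python
import random
import zlib

def approx_complexity(data: bytes) -> int:
    """Crude stand-in for Kolmogorov complexity: the zlib-compressed length."""
    return len(zlib.compress(data, 9))

structured = b"0" * 100                  # the highly regular string from above
random.seed(0)                           # fixed seed, so the example is reproducible
patternless = bytes(random.getrandbits(8) for _ in range(100))

print(approx_complexity(structured))     # small: the regularity is exploited
print(approx_complexity(patternless))    # roughly the length of the string itself
```

This is exactly the "non-ideal compression" Aaronson describes: zlib exploits some regularities, but an ideal compressor could do at least as well on any string.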

How to compare two strings representing DNA?

The theoretical concept has inspired some very practical
applications. Imagine, for example, you want to compare two DNA
strings, or two music or image files, to see if they are correlated in
some way. One thing to do is to take the
strings x and y representing the two files
and to stick one to the end of the other, creating one very long
string xy. If x and y are exactly the same,
so they are maximally correlated, then the Kolmogorov complexity of
xy will be very close to the Kolmogorov complexity of x (and
of y) alone. That's because a computer program that
outputs xy would only need to run the program that outputs
x and print its output twice.

If, on the other hand,
x and y aren't correlated at all, then the shortest
computer program to produce xy would have to consist of the
shortest program that outputs x and the shortest program that
outputs y. So in that case the Kolmogorov complexity of
xy equals the Kolmogorov complexity of x plus
the Kolmogorov complexity of y.

In practice you can't compute the Kolmogorov complexity, but you
can use compression programs to approximate it. If the approximated
complexity of xy is close to the sum of the approximated
individual complexities of x and y, then that's evidence they are not
very correlated. If it's close to the complexity of an individual
string, it's evidence that they are.
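With the same caveat as before -- zlib as a crude, non-ideal stand-in for Kolmogorov complexity -- the concatenation test might look like this in Python (the data and helper name are our own illustration):

```python
import random
import zlib

def c(data: bytes) -> int:
    """Approximate complexity: compressed length under zlib."""
    return len(zlib.compress(data, 9))

random.seed(1)                                          # reproducible example data
x = bytes(random.getrandbits(8) for _ in range(1000))   # one "file"
y = bytes(random.getrandbits(8) for _ in range(1000))   # an unrelated "file"

# Maximally correlated: the second half of x + x is a copy of the first,
# and the compressor spots the repeat.
print(c(x + x))     # close to c(x)

# Uncorrelated: there is no shared structure to exploit,
# so the approximate complexities roughly add up.
print(c(x + y))     # close to c(x) + c(y)
```

The compressed length of x + x stays close to that of x alone, while x + y costs roughly double, mirroring the two extremes described above.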

"There are all sorts of reasons why you might want to know the
general relatedness of two different products, and this is often a
useful technique," says Aaronson. "These
are not direct applications because you cannot compute Kolmogorov
complexity. But the maths gives you the ideal, which then gives you
inspiration for what you can do with the tools you have."

But how does Kolmogorov complexity hold up as a measure of
information content? The complexity of a string is low if
it contains a lot of regularities that can be exploited for
compression. If it doesn't contain regularities, in other words if it is
random, then the Kolmogorov
complexity is high. But can random strings really be thought of as
carrying a lot of information? Aren't they in some sense meaningless?
This is what we'll look at in the next article.