Posted by Hemos on Thursday August 02, 2001 @12:33AM
from the 'merica-mom-and-apple-pi dept.

freedumb writes "From this article in Nature: "Two mathematicians have now taken the first step towards proving that pi contains not a single message but every conceivable message, meaningful or not."" Actually, it's a discussion of the conjecture "that all strings of the same length appear in pi with the same frequency: 87,435 appears as often as 30,752, and 451 as often as 862, a property known as normality."

"The interesting question is, must it also have all infinite strings embedded in it? I suspect that would lead to a contradiction, but this goes beyond my mathematical competence."

Sure would. Take pi, and add 1. Remove the decimal point so that you just have the infinite string 414159265358979... Suppose that's embedded in pi somewhere. The string 414159... must therefore also be embedded in itself (that is, aside from the trivial embedding, where you just look at the entire string). But wait! That means that the string 414159... must repeat, and thus, so must pi. Since pi is irrational, we have a contradiction.

(Sorry for the rather unclear explanation, it's 4 A.M. here and I can barely think straight.)

Soo.. if pi contained every possible message (ie, was truly random), couldn't you in theory find a specific position where pi prints out say, the Max Payne ISO, and distribute that position to friends?

Then said friends start calculating pi from that offset (wasn't there a story on Slashdot about calculating the Nth digit of pi without having to calculate the first N-1 digits?). Voila, kickass compression.

Of course, the small snags here are:

Searching pi until you find that right position that matches your Max Payne ISO, which could be located on the far end of infinity.

Distributing what could be a multi-trillion digit number to your friends.

But once you get over these boring details, pi-based-compression can make for some very neat applications
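For anyone who wants to play with the idea, here is a rough Python sketch of the "find my message in pi" step. It generates digits with Gibbons' unbounded spigot algorithm (my choice, not something from the article), and the function names and the 1000-digit search limit are made up for illustration.

```python
from itertools import islice


def pi_digits():
    """Yield the decimal digits of pi (3, 1, 4, 1, 5, ...) one at a time.

    Gibbons' unbounded spigot algorithm, using exact integer arithmetic.
    """
    q, r, t, k, n, l = 1, 0, 1, 1, 3, 3
    while True:
        if 4 * q + r - t < n * t:
            # The next digit is settled: emit it and rescale the state.
            yield n
            q, r, n = 10 * q, 10 * (r - n * t), (10 * (3 * q + r)) // t - 10 * n
        else:
            # Not enough information yet: absorb one more series term.
            q, r, t, k, n, l = (q * k, (2 * q + r) * l, t * l, k + 1,
                                (q * (7 * k + 2) + r * l) // (t * l), l + 2)


def find_in_pi(message, limit=1000):
    """Return the offset of `message` in the first `limit` digits, or -1.

    Offset 0 is the leading 3, so offset 1 is the first decimal place.
    """
    digits = "".join(str(d) for d in islice(pi_digits(), limit))
    return digits.find(message)


print(find_in_pi("999999"))
```

The six 9s are a famous early run (they begin at decimal place 762), which is a lucky three-digit address for a six-digit message. A typical six-digit string would be expected somewhere around offset 10^6, so the address would be about as long as the message.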

You must have missed the story where they zipped DeCSS code and came out with a long prime number. The algorithm to get DeCSS code from this number is unzip (and it's available everywhere). So if the number itself is not illegal, then neither is the decimal displacement from pi.

> you should know one egregious example of funny strings in Pi at funny positions:

> 42424242 at position 242424.

Incredible! I have just discovered that it also lists out all the digits of pi, starting at offset zero!

Now instead of calculating the digits of pi, we can just look them up in the digits of pi!!!

On a serious note, observe that if pi does indeed have all possible strings embedded in it, then it must have all possible strings embedded in it twice. (And thrice, 4x, 5x, etc. The proof is left as an exercise to the reader.) Thus if it does embed all possible strings, it follows that the first n digits of pi must appear in it somewhere other than at offset zero, for any positive n.

The interesting question is, must it also have all infinite strings embedded in it? I suspect that would lead to a contradiction, but this goes beyond my mathematical competence.

Turing machines with attached "oracles" are used for proofs in theoretical computer science, but it's important to keep in mind that they are purely theoretical - no one will ever build an oracle. You don't think about how an "oracle" works. It's just a concept, like imagining that magic works. So why is this useful?

Well... For instance, you might be able to prove that such-and-such a problem with input size "n" could be solved in polynomial time (i.e. "fast") if you just had a magical oracle to supply you with only log(n) correct, one-bit answers.

The point of the proof would be that you don't need more than log(n) magic bits from the oracle. So what good is that? Well... If you can get the number of magic bits small enough, while still keeping the algorithm fast, it may provide a way to do a randomized algorithm where instead of using the magical oracle (which doesn't exist, remember?) you just use a random number generator, or maybe just try everything. Since a random number generator isn't as accurate as a magical oracle, you run the algorithm a lot of times with different random bits. Maybe you'll be lucky soon, and maybe that's good enough on average.

Wow, you just sparked a real idea. Mathematicians say it's impossible to encode a large, truly random sequence of bits into something such that the output plus the decompressor is smaller than the original bytes. But given enough computing power, couldn't you find the large random bit sequence somewhere in pi, and simply have the decompressor know how to compute pi, plus the start and stop points of the sequence?

Why concentrate on just pi? If they show it's true for all transcendental numbers, they've got it for pi, e, etc.

I'd be happy with just pi for starters...;)

Furthermore, it is not true for all transcendental numbers: for example, numbers of the form 1/n^(1!)+1/n^(2!)+1/n^(3!)+1/n^(4!)+1/n^(5!)+...
are transcendental [swarthmore.edu], but with n=10 that number has only 1's and 0's, so it's not normal.
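To see what that counterexample looks like for n=10, here is a small Python sketch that writes out its decimal digits: a 1 at every position that is a factorial (1, 2, 6, 24, ...) and a 0 everywhere else. The function name is made up for illustration.

```python
from math import factorial


def liouville_digits(count):
    """First `count` digits after the decimal point of sum over k of 10^(-k!)."""
    ones = set()
    k = 1
    while factorial(k) <= count:
        ones.add(factorial(k))  # digit positions 1!, 2!, 3!, ... hold a 1
        k += 1
    return "".join("1" if pos in ones else "0" for pos in range(1, count + 1))


print("0." + liouville_digits(30))
```

Since no digit other than 0 or 1 ever appears, a string like "7" has frequency zero, which is all it takes to rule out normality in base 10.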

Can pi appear in pi anywhere? I guess not, since that would mean that pi repeats. Could e be in pi? I suppose if e was in pi, and pi was in e, then pi would be in pi, which I guessed earlier it couldn't be. But, maybe I'm wrong and there's a loophole since if pi contains itself, there's an infinite recursion going on.

Pi can't appear in pi, because that would make it repeat and make it rational, which it isn't (at least the way I understand "pi in pi"). Is e in pi? Not necessarily. These guys are trying to prove that any finite number series can be found in pi, and e is infinitely long. If it's true, then you can choose any n and you can find n digits of e in pi, but not infinitely many.

Of course, e might be in pi (though I consider that unlikely - this would mean that pi=p/q+e*10^(-n) where p, q and n are integers, which seems quite weird). But what these guys are trying to prove doesn't show that.

It wouldn't work. With a completely random (normal) data set, the address of any particular string of numbers is of equal length to... the particular string of numbers! Thus, the average distance into pi of a four digit number... is a four digit number. I don't really care to do the exact math, but the end result is that the number of bits you wish to find and encode the address of would, on average, require an address with an equal number of bits.
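If you'd rather see that than take it on faith, here is a quick Python simulation. It uses pseudo-random digits as a stand-in for pi (an assumption, since nobody has proven pi behaves this way), and the function name, stream length, and seed are arbitrary choices of mine.

```python
import random


def mean_first_offset(k, stream_len=200_000, seed=0):
    """Average offset of the first occurrence of each k-digit string
    in a stream of pseudo-random decimal digits."""
    rng = random.Random(seed)
    stream = "".join(rng.choices("0123456789", k=stream_len))
    offsets = []
    for value in range(10 ** k):
        pos = stream.find(str(value).zfill(k))
        if pos >= 0:  # skip strings that never show up in the sample
            offsets.append(pos)
    return sum(offsets) / len(offsets)


for k in (1, 2, 3):
    print(k, mean_first_offset(k))
```

The averages should come out on the order of 10^k, i.e. a k-digit string typically needs a k-digit offset, so nothing is saved.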

Actually, all you need to do is FIND it. Not that this is a trivial task, but if you know the position, you can retrieve the digits with ease using a simple, fast algorithm (at least if the digits are binary).

However, like you said, FINDING it would take far longer than just sending a damn copy of the thing.:) If we ever had really REALLY fast computers some day, this could do wonders for data compression. Any value could be represented by a simple position.

Of course, if the position was somewhere after a googolplex digits, sending the position would be an order of magnitude more complex than just sending the data.

If there were an infinite number of monkeys typing on an infinite number of keyboards would they eventually produce all the works of Shakespeare? Not exactly - they would produce them immediately (as quickly as a monkey can type). That's infinity for you. The lazy 8. It goes on and on...

You've proven that if pi is normal, then for any positive integer N the first N digits of e are in pi. But that doesn't prove that e itself is in pi, because the first N digits of e are not equal to e no matter what N is.

this is the latest : microsoft sues pi for containing the complete source code to windoze.
btw, the code starts at position 4200394298 (in the binary expansion of pi), and continues for, well, as long as anyone can read the stuff...

> Finding 88888888 (~8.9*10^7) at or before position 46663520 (~4.7*10^7) is clearly not unlikely. It should be around 37% probability

Indeed. Better pick something like 42424242 [angio.net], which not only occurs way early (at position 242424), but for which not only the search string but also the position is an interesting pattern....

Probability of it occurring so early should be less than 1% (we would expect it below 100000000, not below 1000000), and probability of the position being a permutation of the string is...well...amazingly small.

Small note for nitpickers: I counted the 3. as digits; the search engine does not. Hence the position shown is 242422 rather than 242424.
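Those percentages are easy to sanity-check. A minimal Python sketch, assuming the digits behave like independent uniform random digits and treating each starting offset as an independent trial (both simplifications); the function name is made up:

```python
import math


def prob_seen_by(position, length):
    """Chance that one fixed `length`-digit string starts at one of the
    first `position` offsets of a uniformly random digit stream."""
    p = 10.0 ** -length
    # 1 - (1 - p)^position, computed stably for tiny p
    return -math.expm1(position * math.log1p(-p))


print(prob_seen_by(46_663_520, 8))  # the 88888888 case
print(prob_seen_by(242_424, 8))     # the 42424242 case
```

Under those assumptions the first figure lands near 37% and the second well under 1%, matching the numbers quoted above.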

> Big deal...since Pi is an irrational number, and never ends, at some point there is a string of 5,646,498,765 8's all in a row.

Not necessarily so. If you define a (decimal) number as follows:

it starts with 1.
after the dot, each digit at a prime number position is 1, and 0 otherwise

The number would be 1.01101010001010001...

The number would have no periodicity (because prime numbers become rarer and rarer), so it would not be a rational number.

It never ends.

But still, by construction, it would not contain a single 8, much less a series of 5,646,498,765 eights.

Thus proving that never ending irrational number does not necessarily contain all strings. Btw, irrational numbers are always never ending, or else they would be a fraction whose denominator would be a large power of ten. Think about it.
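For the skeptical, the construction is a few lines of Python. This is only a sketch with made-up function names, using naive trial division since speed doesn't matter here:

```python
def is_prime(n):
    """Trial-division primality test, fine for small n."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def prime_indicator_digits(count):
    """Digits after the point: 1 at prime positions, 0 elsewhere."""
    return "".join("1" if is_prime(pos) else "0" for pos in range(1, count + 1))


print("1." + prime_indicator_digits(17))
```

However many digits you generate, only 0s and 1s come out, so no 8 ever appears.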

> This is a bullshit proof. First, you define your number not to contain any 8s, and then you say "see, it doesn't contain any 8s!"

If you did any math in your youth, you'd know that this is a perfectly valid way to do a proof. It's called "coming up with a counterexample". If somebody claimed that all prime numbers were odd, it would be perfectly valid to point out that 2 is both prime and even. Discarding the proof because "you purposefully picked 2 to show me wrong" is invalid, as this is the whole point of the proof.

Likewise, in this case DNS-and-BIND claimed that all infinitely long irrational numbers necessarily have a long sequence of 8's in them. I refuted his claim by showing him a number which had no 8 at all inside. Now, what exactly is your problem with my refutation of that claim?

> But that doesn't tell us anything about whether or not there is a string of 5,646,498,765 8s in pi or any other irrational number in decimal.

You're right on that, but nobody claimed the contrary. Saying "not all irrational numbers have a long string of 8's in them" is not the same thing as "no irrational numbers have a long string of 8's in them".

It's like in real life: "not all pies are cream pies" (or expressed differently "it is a pie, so it has to be a cream pie"): indeed, there are also apple pies...

With your username, you should know one egregious example of funny strings in Pi at funny positions:

42424242 at position 242424.

Oddly enough, according to the pi search page [angio.net], the same string can be found again at position 1404114, which is also below 100000000. On a normal pi, you'd expect a single occurrence of 42424242 below 100000000, and at a completely random position...

There's a significant difference between random and irrational. Random means you'll get a totally random value every time you look at it. Irrational means there's no repeating pattern to it, while the number is most definitely set and unchanging.

Searching pi until you find that right position that matches your Max Payne ISO, which could be located on the far end of infinity.

Distributing what could be a multi-trillion digit number to your friends.

The second problem is easy, prima facie. Just "compress" it with your compression scheme until you've minimized the energy. Establish a convention: if the position'th digit is a known pattern, then the following minimum-compliant-digits describe the even-more-distant starting place for the actual content (or another copy of the known pattern to loop again yet-further-out). If you can't find the key digits of your position before the position itself in pi, then you've got the optimum key. Of course, the drawback is this: minimizing the energy of a normal-to-base-10^n-for-all-n number is not going to be all that likely. The best key Y that encodes Max Payne's starting point of X may be such that Y > X!

The first problem's not that hard, but the storage for it is a big problem. You have to keep around all of the pi digits prior to the end of your most distant dataset instance. The upside: you can store Max Payne and Linux 6.5.3 ISO and DeCSS all in one archive. The downside: poor retrieval. There are a few helpful indexing methods for searching through all those digits fast, of course. See Knuth.

The real problem is the SEARCHING for the data, which you seemed to blow off in your theorizing-remarks...;)

It's for the SEARCHING that you'd need to keep the digits, and additional indexing information besides. Yes, you can conjure the digits if someone tells you a key, but you have to conjure or search ALL the digits when you want to determine a key previously unknown.

The first application of this should be a search for the decss source code. The resulting start digit will be illegal. I have this really weird feeling that if you looked in the vicinity of the prime number that gunzips to the decss source code, you might find it there. (Possibly gzipped.)

I'm sure there's a lot of cool stuff in the first 2^128 digits. If 128 bits is a long on an itanium system, I'm sure we could have a lot of fun searching the first 2^128 digits of pi for stuff without even breaking out of the long address space. If 2^128 seems small, how about 2^256? 2^1024? 2^65536? That's not a lot of bytes, but it's a hell of a lot of space to search. Probably more than modern computers will be able to handle for years (Even if we do start up a distributed net type search engine to look for things.) Who knows. My next computer might have to include a pi coprocessor...

Why concentrate on just pi? If they show it's true for all transcendental numbers, they've got it for pi, e, etc.

My guess is as good as yours, but probably the reason they focus on pi is that pi is very old and very basic; it's one of those things that the ancient Greeks thought about. e, OTOH, is a little younger and arises from a more difficult problem.

Can pi appear in pi anywhere? I guess not, since that would mean that pi repeats.

Nope, pi can't appear directly; that is, pi can't look like 3.1415...31415...
Think about it like this: if pi contained a copy of pi starting at the n-th digit, then the (n + m)-th digit of pi would be the same as the m-th digit of pi for every m. And then pi would be the same as (first n - 1 digits) + 10^(-n) * pi. This gives pi * (1 - 10^(-n)) = first (n - 1) digits of pi, which in turn gives pi = [first (n - 1) digits] / (1 - 10^(-n)), which is a rational number.
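The same algebra can be checked with exact fractions. Here is a small Python sketch (the function name and the toy prefix "314" are mine) that solves x = A + 10^(-n) * x for a number whose digit string is a short prefix followed by a copy of the whole number:

```python
from fractions import Fraction


def self_embedding_value(prefix_digits, int_digits=1):
    """Exact value of the number whose digits are `prefix_digits`
    followed by a copy of the entire digit string again.

    Solves x = A + 10^(-n) * x, where A is the prefix read as a number
    with `int_digits` digits before the decimal point.
    """
    n = len(prefix_digits)
    a = Fraction(int(prefix_digits), 10 ** (n - int_digits))
    return a / (1 - Fraction(1, 10 ** n))


x = self_embedding_value("314")
print(x, float(x))
```

For the prefix "314" this gives 3140/999 = 3.143143..., a plain repeating decimal. Any number that contains itself this way is forced to be a ratio of integers, which pi is not.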

I suppose pi could appear in pi in a slightly more complicated way though. For instance, it could be interleaved with other stuff, i.e.
pi =...3*1*4*1*5*... where the *'s are other digits.

It will start a whole new branch of numerology dedicated to finding entire new holy books... the Book of the Damned, I Microsoft, II Microsoft, the letter of BOFH to the Great Unwashed, and, of course, the source code to Office (which will take up the space between 2^8 and 2^40906)...

The article states specifically that the researchers are working in binary. The property they are looking for to prove normality is a property of a binary number. The base-10 numbers they gave were probably just examples that "normal" people would understand.

So if anything, they are proving normality to base 2^n, NOT base 10^n. And it may actually be that their proof is general enough to show normality in all bases - the article is not clear on that point.

Ok, if an average MP3 is 5MB (5,242,880 bytes) then the odds of finding a specific MP3 using a sequence of random numbers is 1/256**5,242,880 (since 8-bit bytes have 256 possibilities). This is about the same as 1/10**12,626,113 (since our base 10 numbering system gives each digit only 10 possibilities)

P(MP3)=1/10**12,626,113 (really close to 0)

Thus the odds of not finding a specific 5MB MP3 using a sequence of random digits is 1-P(MP3)

~P(MP3)=1-1/10**12,626,113 (much closer to 1)

Since the longest known expansion of Pi is about 500 Billion digits (500,000,000,000) there are 499,987,373,888 consecutive strings of 12,626,113 digits contained in the known expansion. So the question is, what is the probability that at least one of these is the specific 5MB MP3 we are looking for.

An easier question is to ask what is the probability that we won't find the specific 5MB MP3 in a 500 Billion digit expansion of Pi.

The probability that any specific 12,626,113 digit substring of the 500 Billion digit expansion of Pi is not the 5MB MP3 we are looking for is ~P(MP3). So the probability that every one of the 499,987,373,888 possible 12,626,113 digit expansions is not the 5MB MP3 we are looking for is ~P(MP3)**499,987,373,888.

P(~MP3)=~P(MP3)**499,987,373,888

So now that we know the probability that a specific 5MB MP3 file is not contained in the 500 Billion digit known expansion of Pi, we can calculate the probability that we can find at least one instance of a specific 5MB MP3 as ~P(~MP3)=1-P(~MP3).
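The numbers are far too small for floating point, but the calculation goes through in log space. A rough Python sketch, assuming uniformly random digits and using the approximation 1 - (1 - p)^W ≈ W*p, which is safe when W*p is this tiny; the variable names are mine:

```python
import math

FILE_BYTES = 5 * 1024 * 1024      # the 5MB MP3
KNOWN_DIGITS = 500_000_000_000    # known expansion of pi, per the post

# Decimal digits needed to represent one specific file of that size
digits_needed = round(FILE_BYTES * math.log10(256))

# Number of windows of that length in the known expansion
windows = KNOWN_DIGITS - digits_needed + 1

# log10 of the chance that at least one window matches: log10(W * p)
log10_p_found = math.log10(windows) - digits_needed

print(digits_needed, windows, log10_p_found)
```

So the chance of finding the file is roughly 10 to the power of minus twelve and a half million: still "really close to 0", even with 500 billion tries.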

Let's just choose a nice little string to look for. Hmmmm, how about "deCSS". Now, running that one through my hex editor comes up with the hexadecimal digits for each letter of 64 65 43 53 53. Converting that, in turn, to good old base 10 gives us 100 101 67 83 83. With leading zeros added and the whole lot concatenated you get the number 100,101,067,083,083. Phew!
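The conversion described here (each character's byte value, zero-padded to three decimal digits and concatenated) is easy to script, which also makes it easy to double-check by machine rather than by hand. A minimal Python sketch with a made-up function name, assuming plain ASCII input:

```python
def encode_as_decimal(text):
    """Concatenate each ASCII byte of `text` as a three-digit decimal number."""
    return "".join(f"{byte:03d}" for byte in text.encode("ascii"))


print("deCSS".encode("ascii").hex())  # the hex-editor view
print(encode_as_decimal("deCSS"))     # the decimal string to search for
```

Three digits per character means a five-letter string already becomes a 15-digit search target.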

Now, as previous posters have mentioned, to find a string of n digits you're probably going to have to look thru n digits (I think the other posters made this point clearer than I did). Oof! This means that to find the string deCSS you're probably going to have to go through something like 10^14 digits first. That's 100 trillion (if I counted correctly).

Distributing what could be a multi-trillion digit number to your friends.

Easy! All you have to do then is search pi for the multi-trillion digit number and then send its offset. If that offset is still too long you can just do it again until you end up with, like, a single digit!

Not being a troll, but I still don't see the big deal about one irrational number.

'Irrational' means literally 'cannot be written as a ratio'. This doesn't necessarily mean that the digits are random. You can have numbers like

3.44333444443333334444444...

that are irrational, but whose digits are trivially deterministic. Boring.

Then there are the 'dirty' irrational numbers like pi and e that seem to have random digits. The research mentioned has moved a big step closer to proving that the digits of pi don't just seem random, they truly are random (at least in the sense that all possible combinations occur).

The part that'll really blow your mind is that somebody found an equation that tells you any binary digit of pi you want, without having to calculate any of the other binary digits. (See here [doe.gov].) That is why people are excited by the conjectured normality of pi: if normal, it produces all possible strings of bits from a trivial deterministic equation. This mixture of randomness with order is at the heart of many interesting questions in chaos theory, computational theory, and cryptography.
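For the curious, here is a rough Python sketch of that digit-extraction trick using the Bailey-Borwein-Plouffe series. It works in hexadecimal (four binary digits at a time), relies on ordinary floating point, and so is only trustworthy for modest positions; the function names are mine.

```python
def pi_hex_digit(n):
    """Return the n-th hexadecimal digit of pi after the point (n >= 1)."""

    def partial(j, d):
        # Fractional part of the sum over k of 16^(d-k) / (8k + j)
        total = 0.0
        for k in range(d + 1):
            denom = 8 * k + j
            # Modular exponentiation keeps only the fractional contribution
            total = (total + pow(16, d - k, denom) / denom) % 1.0
        k = d + 1
        while True:
            term = 16.0 ** (d - k) / (8 * k + j)
            if term < 1e-17:
                break
            total += term
            k += 1
        return total % 1.0

    d = n - 1  # number of hex digits to skip
    x = (4 * partial(1, d) - 2 * partial(4, d)
         - partial(5, d) - partial(6, d)) % 1.0
    return int(16 * x)


print("".join("%X" % pi_hex_digit(n) for n in range(1, 11)))
```

Note that the loop for digit n never computes digits 1 through n-1. It only does modular arithmetic whose cost grows roughly linearly with n.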

There is a frequent fallacy among those who almost understand how compression works, that works like this:

"Wait a minute! I bet that every set of digits that someone could be trying to encode can be found somewhere in the digits of pi! Therefore, we can compress any sequence by simply reducing it to the number of digits in the sequence, and the offset in the digits of pi where an identical sequence begins!"

The assumption, of course, is that the number of digits and the offset can be encoded in a form that will be smaller than the original sequence. There is nothing to warrant that assumption. The fact is that the number of possible inputs that a lossless compression method can handle places lower bounds on the average length of its outputs. This means that no lossless compression method can achieve a lower average length for its outputs than would be achieved by simply numbering them all with the non-negative integers.

In fact, 'compressing' a sequence of digits into a (length, offset) pair will do substantially worse, since there are multiple (length, offset) pairs that will correspond to a given digit sequence; for instance, "1" could be encoded as (1,0) or (1,2). This duplication means that (1,2) is essentially wasted, since it could be representing a sequence that currently has a longer representation.

Lossless compression methods need to be used in conjunction with models: some criterion that separates the data we will want to compress from the vast majority of files, about which we do not care. The accuracy of this model affects how many of our inputs we can actually compress, and its precision affects the average compression ratio.
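The counting argument behind that lower bound fits in a few lines. A minimal Python sketch, with made-up function names, that compares the number of inputs of a given length with the number of strictly shorter outputs available:

```python
def count_strings(length, alphabet=2):
    """How many distinct strings of exactly `length` symbols exist."""
    return alphabet ** length


def count_shorter(length, alphabet=2):
    """How many distinct strings are strictly shorter than `length`
    (including the empty string)."""
    return sum(alphabet ** k for k in range(length))


for n in (1, 8, 16):
    print(n, count_strings(n), count_shorter(n))
```

For bit strings there is always exactly one fewer shorter string than there are inputs, so no lossless scheme can shrink every input of length n. Something has to stay the same size or grow, and the model decides which inputs get the short codes.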

On a normal pi, you'd expect a single occurrence of 42424242 below 100000000, and at a completely random position...

No you wouldn't. You'd expect it to occur about once every 100,000,000 trials (in this case looking at 8-digit strings somewhere in pi), which is a different thing. You'd expect some sets of 100,000,000 trials to have more than 1 occurrence, and some to have exactly 1, and some to have 0.

There's certainly nothing fishy about an 8-digit string occurring twice within the first 100,000,000 digits--if we specifically look for that string. We didn't take a string at random, but instead looked at one that was already known to occur at least once.
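To put rough numbers on "some sets have more than 1, some exactly 1, some 0": with about 10^8 trials at probability 10^-8 each, the count is approximately Poisson with mean 1. A small Python sketch under that approximation (the function name is mine):

```python
import math


def poisson_pmf(count, mean=1.0):
    """Probability of exactly `count` occurrences when `mean` are expected."""
    return math.exp(-mean) * mean ** count / math.factorial(count)


p_none = poisson_pmf(0)
p_one = poisson_pmf(1)
p_two_or_more = 1.0 - p_none - p_one
print(p_none, p_one, p_two_or_more)
```

Under this approximation, zero occurrences and exactly one occurrence each come up a bit over a third of the time, and two or more about a quarter of the time, so a repeat is nothing unusual.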

Hmmm... Assume temporarily that all strings of the same length appear with the same frequency in pi.

Hypothesis: e is in Pi

Base case: the first digit of e is in pi. 3.141592... 2 is in pi.

Inductive case: If the first n digits of e are in pi, then the first n+1 digits of e are in pi. We know that all strings of the same length appear with the same frequency in pi. We also know that at least one string of length n+1 digits appears in pi. Therefore at least n+1 digits of e are in pi.

Thus, having shown the inductive case, we have proven the hypothesis, that all digits of e are in pi (consecutively). Therefore e is in pi! QED.

Um, stupid comment. The Halting Problem is a problem because you can't show that every turing machine program halts (or doesn't halt). Are you really interested in the modern research in this field? Go take a look at Exploring Randomness [auckland.ac.nz]. This is a fantastic book that does a great job of, well, exploring randomness.

With a suitable oracle, a Turing machine can indeed solve the Halting problem. Indeed, let K be the set of all (Goedel numbers) of (oracle-less) Turing machines which halt when started with empty input. Then a Turing machine with a K-oracle can trivially solve the Halting Problem: it just examines its input (which will be the Goedel number of a Turing machine) and checks if it lies in K. If so, it answers "yes" and otherwise answers "no".

Do you have a constructive proof of this oracle's existence? Constructive is the key word here. The problem is similar to finding the (Kolmogorov) complexity of a natural number. You can prove that every number does have a complexity (measured as the minimum size representation of that number), but you can't compute the complexities. The function exists, but you can't compute it.

Turing's theorem is much more interesting than you seem to give it credit for. Let B be a set of natural numbers. Define the "B-Halting Problem" as: given an arbitrary Turing machine, determine whether that Turing machine, when equipped with a B-oracle, halts after being started with empty input. Then Turing's proof shows that no Turing machine equipped with only a B-oracle can solve the B-Halting Problem. The usual, oracle-less, version is a special case of this: just take B = empty set.

You're absolutely right. The Turing Theorem is interesting (as long as you don't consider the original paper...snore!). It's unfortunate that everyone just chooses to ignore it, instead of really investigating what the implications of the theorem are (alongside Goedel's Incompleteness Theorem). They're mostly held up as clever examples, then discreetly swept under the table.

Randomness is not an essential part of the study of Turing machines-with-oracle, and the Halting Problem does not involve randomness at all.

Not at all? The fields share many similarities, and it's hard to find a modern discussion of one without the other.

This has happened before, when mathematicians realised they had been basing proofs on a couple assumptions which themselves had never been proven. You can read about it in Simon Singh's "Fermat's Last Theorem", an extremely readable and enjoyable look into both Fermat and mathematics in general.

A teacher, a physicist and a mathematician are having drinks together in a Scottish pub when the teacher looks out the window and sees a white sheep. The teacher says, "There are white sheep in Scotland". The physicist looks out the window and declares, "There are sheep in Scotland; we have already detected and confirmed white ones." The mathematician says, "In Scotland there is at least one sheep, at least one side of which is white."

No, I didn't pull that from Singh (it's there, though). It's an old mathematician's joke but it's true. The most anal Rainman you've ever seen is incredibly chaotic compared to mathematicians (at least when they're working on a proof or theorem).

woof.

"No ma. You don't have to worry about Code Red. Yes, I know CNN told you that you do. Ma, do you run a Web server? No, Netscape is a browser, not a server. Yes, there's a difference. You don't want to know. No, and the Internet didn't die last week, either..." -- my side of a phone call two nights ago.

Give everyone that c program, and then, in lieu of downloading a program, zip file, mpeg file, or whatever, you just tell them to calculate m digits of pi starting at position n, and save the result as the appropriate filename. Internet traffic could be slashed!

Ok. The standard solution for solving e^(i*A*X) was useful for solving second-order differential equations (ordinary form: P(x)y" + Q(x)y' + R(x)y = G(x) ) . My notes are sparse and my brain is tired, so I'd rather not try to remember the whole series of steps required to get to the part where this is actually useful. If you want the long version of the proof, ok, I'll post it tomorrow if you say so.

I'm not even sure if this was derived in class or just one of those few equations that were given to us "just trust me" sorta things. I had the bad habit of rarely writing down proofs, so I'd probably have to hunt this one down online or in one of my Calc books.

But, magically, here in my notes it says:

e^(i * B * x) = cos(Bx) + i*sin(Bx), where B stands for beta.

My apologies for not desiring to hunt down the appropriate symbols.

So, with B = Pi and x = 1, you get:

e^(i*Pi) = cos(Pi) + i*sin(Pi).

The cosine of Pi is -1 and the sine of Pi is 0, so it becomes = -1 + (i)*0 = -1 . Notice that e^(-i*Pi) also gives -1.
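If you don't have a TI-92 handy, the same check takes a few lines of Python with the standard cmath module. This is just a numerical sanity check, not a proof, and the helper name is mine:

```python
import cmath
import math


def euler(theta):
    """Evaluate e^(i*theta) as a complex number."""
    return cmath.exp(1j * theta)


print(euler(math.pi))    # very close to -1 (tiny imaginary rounding error)
print(euler(-math.pi))   # also very close to -1
print(euler(0.7), complex(math.cos(0.7), math.sin(0.7)))  # both sides agree
```

The leftover imaginary part on the first two lines is floating-point noise from pi not being exactly representable.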

So in summary it really wasn't much of a feat for me to reproduce the final parts of the proof, only a matter of remembering what that standard equation was (shortcuts are wonderful things). Let me know if you really want me to derive the top part though.

Thanks. I always knew that not writing down the derivations in class would catch me at some point =).

Well, in one case they already did. In my Calc III class my professor proved a relatively simple theorem and promptly put that theorem on the first page of the test. I wrote the theorem down, but I never really committed it to memory. That was one of the few tests I got a B on...oh well.

It's hard to imagine at first why every n-digit number shows up the exact same number of times. But then, once you think about it, on an infinite scale, it would seem to attest to Pi's true randomness.

On a side note, I had a Calc II professor awhile back that wrote on the board:

e^(i*Pi) = -1 (of course, using the real symbols).

Then, he proved it. I have the proof written down in a notebook and I even managed to work through the final parts of the proof (it uses a standard solution for finding e^(i*A*X)) without consulting it. If anyone is really interested in seeing it, I can post it (in rough ascii math =). For those of you with TI-92s that don't believe me, type it in. That magical machine can do more than I give it credit for sometimes.

Anyway, I just thought it was absolutely incredible that you could mix the two most popular transcendental numbers with the imaginary number (square root of -1) and spit out plain old -1.

Before his untimely death, Alan Turing did a lot of work on a new theoretical machine capable of transcending the limits of conventional Turing machines. The new machine (the so-called O-machine), would be a Turing machine connected to an "oracle", which would store some irrational quantity that would be able to do things like solve the Halting Problem, since it would contain an infinite amount of information (including every possible program that could be created). At least, that's as much as I can remember (no links, sorry).

Did you not understand the randomness in question? That's exactly what will not happen, ever, since that's the essence of randomness.

But if you can't calculate *all* of Pi's digits, how do you know that it isn't a repeating pattern? Until you know all of the digits, it's still logically possible for there to be a sequence. If you knew all the digits, then there *would* be a pattern, but not a repeating one. I'm definitely not a mathematician (as you can tell from my original post) so I would welcome a mathematical explanation of how one can say irrefutably that Pi is non-repeating.

It's like saying that by throwing the dice more and more maybe you'll find a pattern.

This is just restating what you have already said. I could still ask of a series of dice rolls "We might find a pattern if we throw the dice enough times". Not that this is a practical possibility; I just think it is a logical possibility.

The thing with randomness is that these "patterns" cannot compress the sequence. (worth noting is that the sequence of Pi is anything but random in an information science sense, since it's well known and can be compressed to, say, "Pi")

I don't really understand the difference between random in an information science sense and random in the other sense you were talking about. Do you mean that anything I can refer to using a symbol is not random? I suppose if you consider "referring" as a kind of mapping from one well-defined object to another well-defined object, then I suppose that objects that are well-defined cannot also be random. But this doesn't make sense to me either, because Pi is well-defined in the sense of being a ratio of parts of a mathematically defined object. But referring to the ratio is not the same as referring to the decimal expansion of the ratio. This of course means that my (light-hearted) comparison couldn't be taken as much more than a figure of speech. That's really all that it was meant to be.

Circle has hardly anything to do with this "chaotic quality", since most of the real numbers also have this quality.

But if the circle has that quality, the fact that other real numbers have that quality doesn't take that quality away from the circle. The circle may not have anything to do with the definition or essence of randomness, but that doesn't mean the contrast is any less interesting. Who says randomness, or any other mathematical construct, has an essence at all? Maybe mathematics is just a very complicated, hard-wired social behaviour, or a set of made-up rules for moving game pieces around. What do *you* mean by essence?

The expectation value of the position of any certain string of numbers is actually as long as that certain string. It might be easier to wait for some hwrandom() to produce Romeo and Juliet.

I gather that by expectation value you mean something like "the earliest position at which a string with any N digits can be expected to be found". So I agree totally. Where can I expect to find the first occurrence of *any* one digit? Well, at the first digit, I guess. And so on and so forth until you have Romeo and Juliet. That doesn't seem too earth-shattering to me, but I'm sure I don't understand all the consequences. Like I said before, I'm really not arguing at all, I was just attempting some light-hearted word play. But regardless of that, what you said hasn't convinced me that hwrandom() *cannot* produce Romeo and Juliet, given enough monkeys.

I suppose if you consider "referring" as a kind of mapping from one well-defined object to another well-defined object, then I suppose that objects that are well-defined cannot also be random.

Actually, after I read this I realized that it is nonsense. How can a symbol that refers to something be "well-defined" independently of the thing to which it refers? No way. In fact, "referring" is probably part of what *makes* a symbol "well-defined". Being well-defined is not a quality that symbols possess intrinsically.

If PI has the property they are theorizing, to find a number n digits long in PI, you'll likely look through about 10^n digits of PI. So, storing its location in PI should take about as many digits as the message you are trying to compress.
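A rough way to check this estimate, as a sketch rather than anything definitive: the code below computes digits of pi with Machin's arctangent formula (chosen here for convenience, it is not mentioned in the thread) and measures how far into pi each n-digit string first shows up. The function names are made up for illustration, and offsets are counted from 0 with the leading 3 included.

```python
from itertools import product

def arctan_inv(x, unity):
    """Return arctan(1/x) * unity, using the alternating series in integer arithmetic."""
    total = term = unity // x
    x_squared = x * x
    k, sign = 3, -1
    while term:
        term //= x_squared
        total += sign * (term // k)
        k, sign = k + 2, -sign
    return total

def pi_digits(n):
    """Return the first n decimal digits of pi as a string beginning '314...'."""
    unity = 10 ** (n + 10)  # 10 guard digits to absorb truncation error
    pi_scaled = 4 * (4 * arctan_inv(5, unity) - arctan_inv(239, unity))
    return str(pi_scaled)[:n]

def mean_first_offset(digits, length):
    """Average 0-based offset of the first occurrence of each string of the given length.

    Returns (mean offset over the strings that were found, number of strings found).
    """
    offsets = [digits.find("".join(p)) for p in product("0123456789", repeat=length)]
    found = [o for o in offsets if o >= 0]
    return sum(found) / len(found), len(found)
```

Calling `mean_first_offset(pi_digits(20000), n)` for n = 1, 2, 3 gives a feel for how the typical offset grows by roughly a factor of ten per extra digit, so writing the offset down costs about as many digits as the string it points to.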

Can pi appear in pi anywhere? I guess not, since that would mean that pi repeats.

But it does appear. Start at position 0 - you get 3.14.... for as far as you want to calculate it.

Like others, I doubt e is in pi. That would mean pi = 314...*10^-n +e*10^-(n+1), or something like that.

Here's a possibility: Assume some function that gets the xth digit of a number, such as pi(1) = 3, pi(2) = 1, etc. Find some n where pi(n) is the start of a sequence that matches the digits in e, or pi(n+a) = e(a), for at least a=1. It may be fairly easy to prove that there is always some A where pi(n+A) != e(A).

For instance, if there is an N where there is no A, then pi(N+a)=e(a) for all a. But this would mean that pi = B + e*10^(-N), where B is a finite number, and, substituting again, pi = B + pi(N+a)*10^(-N), so pi repeats, which it doesn't. I may be off by a power of 10 or so, but I think it may be possible to state that if e is in pi, then pi repeats, and since pi doesn't repeat, then e is not in pi.

Then again, IANAM (mathematician), I'm just an EE pretending to be a programmer.

Inductive case: If the first n digits of e are in pi, then the first n+1 digits of e are in pi. We know that all strings of the same length appear with the same frequency in pi. We also know that at least one string of length n+1 digits appears in pi. Therefore at least n+1 digits of e are in pi.

Thus, having shown the inductive case, we have proven the hypothesis, that all digits of e are in pi (consecutively). Therefore e is in pi! QED.

OK, so e is in pi, and by a similar argument, I can say pi is in e. (I'll leave that up to the student or awk, whichever can do the regexp substitution first).

So, pi becomes some rational number P plus e times some power of 10, such as:

So what's more likely is that inductive proofs are simply not able to show what I have attempted to show. I kind of suspected this, and the post was more of a troll of sorts (similar to the proofs that 1=0 and such).

I agree, there must be some limit to induction when encountering infinite numbers, but I don't remember them mentioning it in my few math classes. Is this an inductive proof?

Hypothesis: There is no integer P where pi(P+n)=e(n) for all n, where pi(x) is a function that returns the xth digit of pi, and e(x) does likewise.

Assumption: pi has the property of having all sequences of numbers somewhere in there. For instance, for a five-digit sequence S(x), there is a first occurrence in Pi, at S0, so that pi(S0)=S(1), pi(S0+1)=S(2), pi(S0+2)=S(3), pi(S0+3)=S(4), and pi(S0+4)=S(5). In other words, I'm naming the integer where you can find the first number in the sequence.
Proof(?):

There is some P1 where pi(P1)=e(1), but pi(P1+1)!=e(2). This is easy to see - you can quickly find the first occurrence of 2 that is not followed by 7.

There is some P2 where pi(P2)=e(1) and pi(P2+1)=e(2), but pi(P2+2)!=e(3). Again, there is a first occurrence of 27 not followed by 1.

If there is a P(N), then there is a P(N+1), since all arbitrary sequences are in there. In other words, if there is a sequence 2,7,1, Not 8, then there is also a sequence 2,7,1,8, not 2.

Thus, for any N, there is a PN where pi(PN)=e(1), pi(PN+1)=e(2), all the way until pi(PN+N-1)=e(N), but pi(PN+N)!=e(N+1)

So, no matter how many digits you look at, there will always be some N where the two don't line up.

Of course, my inductive arguments are pretty weak, even weaker than my choice of symbols...
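The first couple of steps of this construction can be checked by hand or with a few lines of code. The sketch below hard-codes the first 50 digits of pi and the first 10 digits of e, and looks for the first place where pi matches the first n digits of e and then breaks off. The function name is hypothetical, and offsets here are 0-based, whereas the post above counts digits from 1.

```python
# First 50 decimal digits of pi and first 10 of e, decimal points dropped.
PI_DIGITS = "31415926535897932384626433832795028841971693993751"
E_DIGITS = "2718281828"

def first_partial_match(haystack, needle, n):
    """Return the first 0-based offset where haystack matches needle[:n]
    but the following digit differs from needle[n]; None if there is none."""
    prefix, next_digit = needle[:n], needle[n]
    start = haystack.find(prefix)
    while start != -1:
        follow = start + n
        if follow < len(haystack) and haystack[follow] != next_digit:
            return start
        start = haystack.find(prefix, start + 1)
    return None

print(first_partial_match(PI_DIGITS, E_DIGITS, 1))  # first 2 not followed by 7
print(first_partial_match(PI_DIGITS, E_DIGITS, 2))  # first 27 not followed by 1
print(first_partial_match(PI_DIGITS, E_DIGITS, 3))  # None: 271 is not in these 50 digits
```

Finding such a break-off point for every n only shows that mismatches exist at those offsets. It does not rule out some other offset where the match never breaks, which is the gap in the inductive argument.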

Well, as others have rightly noted, this solution wouldn't work. It takes N digits to represent a number of N digits, quantum mathematics aside, as long as those digits are more or less random. Compression programs like ZIP only work at all because certain strings of numbers are more common than others in computer files (if I understand the technology correctly).

However, this idea could go the way of all complex (yet failed) compression algorithms: encryption! Imagine trying to decode the resulting index, with no idea that Pi was even involved. Not gonna happen.

I can't say the idea didn't intrigue me for a few seconds, though; adding infinity to any equation always makes for the most fascinating possibilities.

I remember watching Northern Exposure when I was about 13 and there was this episode where Chris Stevens dates this mathematician chick and she talks about a string of eight 8's [angio.net]. Years later when I read about a Pi search engine I tried it and was actually surprised to see it worked.

What happens when a rule of mathematics is challenged? For example, some definitions seem so arbitrary to me. 1 is not considered a prime number because it has only itself as a factor...

Every mathematical "definition" is arbitrary. The logic in mathematics lies in the derivations from the definitions, not in the definitions themselves.
You may have useless definitions (which are in contrast with "usual" maths), fruitful definitions (which let people find a lot of new things), or boring definitions, which do not give a lot of interesting results.

In the case of factorization, 1 is called "a unit", that is, something separate from the prime numbers, because the most important result is unique factorization (up to units!!!) in the ring of integers with the usual "multiplication" operation.

Notice that there are also a lot of theorems which have to distinguish the case of 2 from that of any other prime: the reason is not that 2 is the only even prime number, but rather that in the field built from the first 2 numbers (the integers mod 2), "1" and "-1" are the same thing. But those results are not important enough to deserve a modification of the definition of prime numbers to cast out 2.

A last comment: it is not difficult to write down a number which is normal in base 10. It suffices to write 0.1234567891011121314151617181920... (the positive integers concatenated in order). But if someone ever shows that \pi (or e, or a "common" irrational number) is normal in a certain base, it would be a really important result - for mathematicians, that is. The practical importance would be zero.
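To see what the digits of 0.123456789101112... look like in practice, here is a small sketch that builds a finite prefix of that expansion and tallies the digits. The function names are invented for this example, and a finite prefix can only hint at the limiting behaviour: zero lags behind the other digits because no integer starts with 0.

```python
from collections import Counter

def champernowne_digits(limit):
    """Concatenate the decimal representations of 1, 2, ..., limit."""
    return "".join(str(i) for i in range(1, limit + 1))

def digit_frequencies(digits):
    """Relative frequency of each decimal digit in the given string."""
    counts = Counter(digits)
    total = len(digits)
    return {d: counts[d] / total for d in "0123456789"}

if __name__ == "__main__":
    for limit in (999, 99999):
        freqs = digit_frequencies(champernowne_digits(limit))
        print(limit, {d: round(f, 4) for d, f in freqs.items()})
```

With `limit = 999` the prefix has 2889 digits: each nonzero digit appears 300 times and zero appears 189 times. Raising the limit moves all ten frequencies closer to 1/10, though only slowly for zero.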

If a five-hundred digit sequence of numbers can be represented instead by, say, a twenty-five digit offset value, that seems like quite good compression -- until you remember that the only way it would work is if either: the (de)compressor carries around a datafile of "all" (well, a huge chunk of) pi for reference, or it has to recalculate up to the position of that offset on each run. And there's where it all breaks down.

0123456789 : from 17,387,594,880-th of pi
0123456789 : from 26,852,899,245-th of pi
0123456789 : from 30,243,957,439-th of pi
0123456789 : from 34,549,153,953-th of pi
0123456789 : from 41,952,536,161-th of pi
0123456789 : from 43,289,964,000-th of pi

Here's what I found [google.com]. The actual wording of the bill (Section I, House Bill No. 246, 1897):

Be it enacted by the General Assembly of the State of Indiana: It has been found that a circular area is to the square on a line equal to the quadrant of the circumference, as the area of an equilateral rectangle is to the square of one side.

Let's say I want to find the number 100. By your theory, I'd have to look up 10^100 digits of pi. Say my query is found halfway through it: I find a position of 1/2 * 10^100 digits, right, or 5e99. Now let's say I had a number similar to 5e99. Couldn't I "condense" it by transmitting the info as "first place 100 shows up in pi"? Of course, this means the decompressor needs to be pretty quick at calculating digits of pi. Still, it sounds pretty interesting. :)

Of course, to make it more reasonable, you'd just break it down into X "digit" chunks, which could be found closer to the beginning of the pi numerical string, etc.

Are you sure you know what you're looking for/at? Is the number 42424242 one that you would search for first, or a pattern one found after the fact? These are VERY different things. If it was a pattern that was interesting, one must look at the space of all interesting patterns and look at the probability that one might be found. This is the difference between math and numerology, or astrology and astronomy. Coincidence has a role; why, it should even be expected. For Christ's sake, it is even predictable. Perhaps some of the people reading this are aware of the fact that if you get about thirty people or so together, two of them are likely to share a birthday. This does not mean that if you pick any one person, someone else is likely to share their birthday. Rather, there is a good chance that two individuals in the group, whoever they may be, will share a birthday. If you pick 42424242 beforehand, finding it is unlikely; if you don't, it is coincidence.
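The two birthday probabilities being contrasted here are easy to compute. This is a minimal sketch assuming 365 equally likely birthdays and ignoring leap years; the function names are made up for the example.

```python
def any_shared_birthday(people, days=365):
    """Probability that at least two people in the group share a birthday."""
    p_all_distinct = 1.0
    for i in range(people):
        p_all_distinct *= (days - i) / days
    return 1.0 - p_all_distinct

def shares_with_chosen_person(people, days=365):
    """Probability that at least one of the others matches one person picked in advance."""
    return 1.0 - ((days - 1) / days) ** (people - 1)

if __name__ == "__main__":
    print(round(any_shared_birthday(30), 4))        # some pair in a group of 30
    print(round(shares_with_chosen_person(30), 4))  # a match for one chosen person
```

For a group of 30 the first probability is about 0.71 while the second is under 0.08, which is the gap between "some pattern turns up" and "the pattern you named in advance turns up".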

Wouldn't work. As others have pointed out, the offset into pi will generally be as long as (or usually very much longer than) the message you are trying to encode. The Max Payne ISO is a lot of bits. Given a random bit generator (pi or otherwise) imagine how likely it is that you'll find those bits in exactly the right order.

The offset, if ever found, would be huge. Information theory says it would have to be... Think of it backwards. You have thousands of ISOs all about the same size (650 megs or so, let's ignore any compression). These ISOs all have about the same number of bits but they have vastly different data. Now how could a predictable random bit source include the data for all these ISOs without being many times the size of any single ISO (meaning that most offsets will be bigger than any single ISO)? It just doesn't add up.
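The counting behind "it just doesn't add up" can be spelled out. The sketch below treats messages and descriptions (offsets, or anything else) as plain bit strings and compares how many there are of each; the function names are illustrative only.

```python
def count_messages(bits):
    """Number of distinct messages that are exactly `bits` bits long."""
    return 2 ** bits

def count_shorter_descriptions(bits):
    """Number of distinct bit strings strictly shorter than `bits` bits (lengths 0..bits-1)."""
    return sum(2 ** length for length in range(bits))

def max_fraction_shortened(bits, saved):
    """Upper bound on the fraction of `bits`-bit messages that can each be given
    a distinct description at least `saved` bits shorter."""
    descriptions = sum(2 ** length for length in range(bits - saved + 1))
    return descriptions / 2 ** bits

if __name__ == "__main__":
    print(count_messages(8), count_shorter_descriptions(8))  # 256 messages, 255 shorter strings
    print(max_fraction_shortened(16, 8))                     # below 1/128
```

There is always one fewer shorter string than there are messages, so no scheme can shorten every message, and only a tiny fraction can be shortened by much. An offset into pi is just one more kind of description and is bound by the same count.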

If pi has all conceivable messages, pi must contain all of the US military's secrets, DeCSS, kiddie pr0n, violent and explicit sexual films beyond anyone's imagination and much much more. It must therefore be banned. When you get the death penalty for circle possession, don't say I didn't warn you...

Strictly speaking, the property mentioned isn't actually normality, but normality to base 10^n for all n. Normality to base b means that if you write down the base-b expression for the number then every base-b digit occurs with equal frequency. So normality to base 10 means that in the usual decimal expansion, 3 and 7 occur with equal frequency, for instance. Normality to base 100 means that, e.g., in the decimal expansion 34 and 87 occur with equal frequency.

It's known that in a certain precise sense, almost all numbers are normal (i.e. normal to *all* bases). But to this day, not one single familiar constant such as pi or e has been *proved* to be normal!
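To make the base-10 versus base-100 distinction concrete, here is a sketch that reads a decimal digit string as a sequence of base-10^n digits by cutting it into aligned blocks of n decimal digits, then reports how often each block occurs. The function name and the choice to drop a trailing partial block are assumptions made for this example.

```python
from collections import Counter

def block_frequencies(digits, block_len):
    """Cut `digits` into aligned blocks of `block_len` decimal digits
    (that is, base-10**block_len digits) and return their relative frequencies.
    A trailing partial block is ignored."""
    usable = len(digits) - len(digits) % block_len
    blocks = [digits[i:i + block_len] for i in range(0, usable, block_len)]
    counts = Counter(blocks)
    return {block: count / len(blocks) for block, count in counts.items()}

if __name__ == "__main__":
    sample = "3487348734"
    print(block_frequencies(sample, 1))  # base-10 digit frequencies
    print(block_frequencies(sample, 2))  # base-100 digit frequencies
```

In the sample string the four decimal digits 3, 4, 8, 7 are not equally common, and the base-100 digits 34 and 87 occur with frequencies 0.6 and 0.4. For a number with the property described above, every such table would flatten out as more digits are included.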

A specific string k bits long will on average occur once every 2^k bits of a random bitstream. (For example, a run of 10 heads will on average start at about one in every 1024 coin tosses, although the expected wait for the first such run is longer, 2046 tosses, because a run of heads overlaps with itself.)[0] The reasoning I have in mind for this relies on the fact that coin tosses are independent events. The digits of pi are certainly not independent random events, because we can calculate them. However, because normality makes the digits behave macroscopically much like independent events, k precise bits will occur on average once every 2^k bits of pi. I imagine the actual proof is highly nontrivial, but handwavy entropy arguments convince me that this is true.

[0] As an interesting sidenote, many of the "streaks" in sports coincide roughly with the streaks that probability dictates. Baseball hitting streaks aren't necessarily because the hitter is in the zone... they may be because, probabilistically, they are due to hit in 30 games in a row if their average is .320 and they play 1000 games (made-up numbers).

I recall a method for detecting falsified data that relied on the fact that certain functions (those that followed a log pattern, I think) produce results with the leading digit's frequency skewed toward smaller numbers (1, 2, 3,...) and AWAY from larger numbers (9, 8, 7,...).

So my question is, does normality apply to functions (in the way I described), as well as numbers?

I think he said "almost every number..." or something very like. There are, of course, some exceptions (such as the integers, and rational fractions, etc.) but they make up a very small subset of "the numbers," almost all of which are too lengthy to write in this margin.

If pi has all conceivable messages, pi must contain all of the US military's secrets, DeCSS, kiddie pr0n, violent and explicit sexual films beyond anyone's imagination and much much more. It must therefore be banned.

It must also contain all finite length MP3s. Therefore under the DMCA it already is banned.

The sad part is, I'm not joking. The DMCA is so absurdly broad that you could easily raise a cogent case for using it to ban the concept of Pi for this very reason.