"Never compute pi in binary - because it goes on infinitely and is random, it theoretically contains every finite bit string. So, you will then possess all copyrighted material in existence and be liable for some serious fines."

This is obviously meant to be humorous, but it got me thinking. If every finite bit string exists in a binary representation of pi, would it be possible to use this as a method of transmitting data?

For example, let's say I wanted to transmit a bit string that could be interpreted as a JPEG image. Instead of sending the information directly, I would find its location within the digits of pi, and simply send the location of the first bit within the digits of pi, as well as the length of the string.

This seems pretty straightforward to me, but the obvious hurdle here is that the probability of finding this string within even the first several trillion digits is remarkably small. So, it could end up taking an immense amount of time to find.

My thinking is that several machines could be dedicated to searching for large files within pi, and then creating an index of all of their start locations. So, each computation would only need to occur once and then that information could be transmitted extremely quickly from then on.

So, what do you think? Is this at all feasible, or would these computations take far too much time?

Thanks for reading! I apologize if I have overlooked any posting guidelines; this is my first question in this forum.

Cheers!

EDIT:

Thanks for your quick responses, folks! I figured there was an error in my reasoning; nice to know why!


You seem to be missing the important concept of "information entropy". Even if you had infinitely many digits of Pi, the address containing XXX data would be at least as long as XXX itself.
– Mysticial, Jul 6 '12 at 18:45

@trumpetlicks I'm saying that the compression is not possible, because the address will be as large as the data itself.
– Mysticial, Jul 6 '12 at 19:04

@MooingDuck: Yes and no. Across all possible input strings, yes. But the inputs people care about mostly contain some degree of structure and redundancy, not pure entropy -- and in that case, compression is not only possible, but often works pretty well.
– Jerry Coffin, Jul 7 '12 at 1:16
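
The distinction drawn in the comment above (structured inputs compress, pure entropy does not) can be seen with a small experiment. This is only a sketch: the sample phrase, the fixed seed, and the variable names are assumptions chosen for illustration, and `zlib` stands in for "a typical compressor".

```python
import random
import zlib

rng = random.Random(0)  # fixed seed (an assumption) so the run is repeatable
n = 10_000

# Pseudo-random bytes approximate "pure entropy"; the repeated phrase is
# a stand-in for data with structure and redundancy.
random_bytes = bytes(rng.getrandbits(8) for _ in range(n))
structured_bytes = b"the quick brown fox " * (n // 20)

random_packed = zlib.compress(random_bytes, 9)
structured_packed = zlib.compress(structured_bytes, 9)

print("random:    ", len(random_bytes), "->", len(random_packed))
print("structured:", len(structured_bytes), "->", len(structured_packed))
```

The structured input should shrink dramatically, while the random input should come out at roughly its original size or slightly larger.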

However, that doesn't mean I can just hack into everyone's accounts and steal their identities, because I don't know where each person's SSN starts. And for a typical 9-digit SSN, the first offset in Pi at which that SSN appears will itself be on the order of 9 digits long. In other words, the information about the SSN is kept in the address rather than in Pi itself.

For example, if someone has the SSN: 938-93-3556

It starts at offset 597,507,393 in Pi. That number 597,507,393 is about as long as the SSN itself. In other words, we've gained nothing by using Pi.
(I'm not sure if there's an earlier offset where it appears, but the probability decreases exponentially with smaller offsets.)

To generalize this, even if you had infinitely many digits of Pi (which theoretically hold all possible information), the address that holds data XXX will, with overwhelming probability, be at least as large as XXX itself.

In other words, the information is not held in the digits of Pi itself, but rather the address where the information starts.
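
To make the point above concrete, here is a rough sketch that looks up the first offset of every k-digit string in the first 10,000 decimal digits of pi and compares the length of each offset with k. The helper names `pi_digits` and `first_offsets` are made up for this example, and Machin's arctangent formula is just one convenient way to get the digits using nothing but the standard library.

```python
def pi_digits(n):
    """Return the first n decimal digits of pi after the decimal point."""
    guard = 10  # extra digits to absorb rounding in the integer series
    scale = 10 ** (n + guard)

    def arctan_inv(x):
        # arctan(1/x) * scale, summed as an alternating series of integers
        term = scale // x
        total = term
        x2 = x * x
        k = 3
        sign = -1
        while term:
            term //= x2
            total += sign * (term // k)
            sign = -sign
            k += 2
        return total

    # Machin's formula: pi = 16*arctan(1/5) - 4*arctan(1/239)
    pi_scaled = 4 * (4 * arctan_inv(5) - arctan_inv(239))
    return str(pi_scaled)[1:n + 1]  # drop the leading "3"


def first_offsets(digits, k):
    """Map each k-digit string occurring in `digits` to its first offset."""
    seen = {}
    for i in range(len(digits) - k + 1):
        seen.setdefault(digits[i:i + k], i)
    return seen


digits = pi_digits(10_000)
for k in (1, 2, 3):
    offsets = first_offsets(digits, k)
    avg_len = sum(len(str(o)) for o in offsets.values()) / len(offsets)
    print(f"k={k}: found {len(offsets)} of {10 ** k} strings, "
          f"average offset length {avg_len:.2f} digits")
```

The average offset length should come out comparable to k, so sending the offset saves nothing over sending the k digits directly.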

@RafaelBaptista: true, but that's not the point Mysticial is making. What he's saying is that transferring the offset/index will (on average) take more data than transferring the data itself.
– Mooing Duck, Jul 6 '12 at 19:21

@trumpetlicks: In base 10, it takes more space on average. In base 2, it also takes more space, by exactly the same proportion. And the same goes for every other base, by exactly the same proportion. In all bases, it takes more space on average to encode the offset than it does the data itself.
– Mooing Duck, Jul 6 '12 at 21:26

@Rafael: Actually, if we assume certain randomness properties hold for pi (it is believed they do), we can guarantee that, with probability 1, every finite sequence of digits, no matter how long, will eventually appear in the digits of pi.
– BlueRaja - Danny Pflughoeft, Jul 7 '12 at 1:07

@trumpetlicks: Why are you talking about decimal digits? For the numbers in Mysticial's example, you can hold the index in 30 bits, or the value itself in 30 bits. The value, not the decimal representation.
– Mooing Duck, Jul 8 '12 at 16:58
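
The 30-bit figure in the comment above is easy to check. In this sketch the variable names are assumptions, and the SSN from the example is simply read as a single integer.

```python
ssn_value = 938933556      # the example SSN 938-93-3556 read as one integer
offset_value = 597507393   # the offset quoted in the example

# int.bit_length() gives the number of bits needed to store the value
print(ssn_value.bit_length(), offset_value.bit_length())  # 30 30
```

Both numbers lie between 2^29 and 2^30, so the index costs exactly as many bits as the value it points to.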

@trumpetlicks: As I said, nobody stores the digits of PI that way, because no bignum works that way. Every bignum I've worked with uses base 4294967296 on the inside, for speed.
– Mooing Duck, Jul 8 '12 at 21:44

@trumpetlicks: The base is arbitrary. All the reasons it won't work in base 10 are exactly the same reasons it won't work in base 2, except you multiply all the averages by 3.32192809.
– Mooing Duck, Jul 6 '12 at 21:24
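
The constant 3.32192809 in the comment above is log2(10), the number of bits carried by one decimal digit. A small sketch (the helper name `decimal_digits_to_bits` is made up for illustration):

```python
import math

bits_per_decimal_digit = math.log2(10)
print(round(bits_per_decimal_digit, 8))  # 3.32192809


def decimal_digits_to_bits(d):
    """Bits needed to represent any value of up to d decimal digits."""
    return math.ceil(d * bits_per_decimal_digit)


print(decimal_digits_to_bits(9))  # 30
```

Changing the base rescales both the data and the offset by this same factor, so the comparison between them comes out the same.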

No, it's not possible to efficiently find an arbitrary sequence in a random sequence -- that follows from the definition of "random." (If there were a way to predict where the sequence occurred, it wouldn't be random.)

As for indexing all the locations, well, what have you gained? You're essentially saying "Jump to starting point 0..." and then you have to say either "...and then calculate the next JPEG-sized bits in π..." (no win, since you have to use up energy doing the calculation) or "...and then look up the next JPEG-sized chunk of data in the mega-π index." (In which case you could just, y'know, load the JPEG file.)

You can't win and you can't break even (and, for what it's worth, you can't get out of the game).

UPDATE: @Mysticial's answer is better than mine. His point:

For example, if someone has the SSN: 938-93-3556

It starts at offset 597,507,393 in Pi. That number 597,507,393 is about as long as the SSN itself. In other words, we've gained nothing by using Pi.

Funny. This seems to center on the feasibility of finding the offset. The thing is, even if you ignored that, you'd run into the real problem: you'd lose information efficiency by representing the offset (instead of the actual data).
– sehe, Jul 6 '12 at 19:20

@Mysticial Yes, your answer is better than mine, which was just a flip response.
– Larry OBrien, Jul 6 '12 at 19:26

"it's not possible to efficiently find an arbitrary sequence in a random sequence" - while this is true, pi is not a random sequence. It's possible, for instance, to efficiently calculate the nth digit of pi without having to calculate the digits preceding it. It's certainly feasible that some number theorist could come up with an efficient way to locate the first position of an arbitrary bitstring in pi... but (besides being outside the scope of this site) that doesn't give us a way to compress that bitstring, for the entropy reasons mentioned by @Mysticial.
– BlueRaja - Danny Pflughoeft, Jul 7 '12 at 1:01

Sorry, but if we can compute 10^13 digits and, in principle, can compute any digit of Pi (given enough time/memory), these digits cannot be described as "unpredictable". However, if a "random" sequence doesn't contain a certain fixed string, it can't be random (wrt. Turing machine; there are several definitions, but they are largely equivalent).
– jpalecek, Jul 6 '12 at 19:04

They are unpredictable in that you can't predict what the n'th digit of pi is without computing pi out to that digit.
– Rafael Baptista, Jul 7 '12 at 19:15

Actually, there's a spigot algorithm for computing binary digits of pi without computing the preceding digits (the BBP formula is somewhat well known). In addition, I'm not sure how you're concluding that we can't prove pi contains every possible sequence of bits (whether it does is currently unknown).
– Nabb, Jul 7 '12 at 4:25
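
For reference, here is a minimal sketch of digit extraction with the Bailey-Borwein-Plouffe (BBP) formula mentioned in the comment above. It returns one hexadecimal digit (four binary digits) at a given position without computing the earlier ones. The function names are made up for this example, and because it relies on double-precision floats it should only be trusted for fairly small positions.

```python
def bbp_sum(j, n):
    """Fractional part of sum over k >= 0 of 16**(n - k) / (8*k + j)."""
    s = 0.0
    # Terms with k <= n: modular exponentiation keeps the numerators small.
    for k in range(n + 1):
        m = 8 * k + j
        s = (s + pow(16, n - k, m) / m) % 1.0
    # Terms with k > n shrink by a factor of 16 each step; add until negligible.
    k = n + 1
    while True:
        term = 16.0 ** (n - k) / (8 * k + j)
        if term < 1e-17:
            break
        s += term
        k += 1
    return s % 1.0


def pi_hex_digit(n):
    """Hex digit of pi at position n after the point (n = 0 is the first)."""
    x = (4 * bbp_sum(1, n) - 2 * bbp_sum(4, n)
         - bbp_sum(5, n) - bbp_sum(6, n)) % 1.0
    return "0123456789abcdef"[int(x * 16)]


# pi is 3.243f6a88... in hexadecimal
print("".join(pi_hex_digit(i) for i in range(8)))
```

Being able to jump to a position this way does not help with the compression idea: you still have to transmit the position, which is where the information lives.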

I'm not saying that I can prove it doesn't. I'm just saying that infinite non-repeating pseudo random number sequences like pi cannot be proven to have every sequence.
– Rafael Baptista, Jul 7 '12 at 18:27

because it goes on infinitely and is random, it theoretically contains every finite bit string

Pi goes on infinitely, but definitely isn't random, i.e. its digits can be computed by a program of O(log n) size (and therefore, finite prefixes can be generated by programs much smaller than the prefixes), which means the Kolmogorov complexity of prefixes of Pi is asymptotically less than their size. Therefore, it has yet to be proven that it contains every finite string (I don't know of such a proof).