June 8, 2007

Letting the Machine do the Thinking: Act 2 (Disaster)

The good news about the program written to find the answer to the self-referential test is that it found an answer (after approximately 20 hours of computation). Whether or not it was the right answer has been lost to the netherworld of computer memory.

Here is the scenario: I had a release build of the program running via the Start with Debugging option in Visual Studio 2005. I did this so I could check in on it every so often by setting a breakpoint and examining some variables, to make sure it was making progress and to see how far along it was. When the program finds an answer, it prints it to the console and then exits (and here is where my troubles began). The program found the answer and exited, but Visual Studio 2005's behavior is to close the console window when the program exits. So it printed out the answer and then promptly erased it. Oh well, it probably wasn’t going to be exactly the right answer the first time despite my best efforts and testing, but it would have been a starting point. We do, however, have a couple of things to move on from.

The first thing I know, from checking the progress at a breakpoint, is that the search had reached at least the AAAAAAAAAAAAAAAAAACA range. The second thing I know is that I was potentially using only half of the computing power available to me.

To increase performance, most processors (even the one in my laptop) have the ability to run two execution threads on a single processor. The reason for this has to do with memory speed, processor architecture, and compiler (or program) limitations. Modern processor architectures get their power not from executing a single instruction faster, but from being able to execute more instructions in parallel.

Luckily, our problem is ridiculously parallelizable: if we were given 95,367,431,640,625 processing elements, we could easily use them all by asking each one to check a single answer.
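To make that number concrete, here is a rough Python sketch of how each candidate answer can be given its own integer index. It assumes the test has 20 questions with five choices each (A through E), which is my reading of 95,367,431,640,625 being exactly 5 to the 20th power. The names CHOICES, NUM_QUESTIONS, index_to_answer and answer_to_index are made up for this illustration, and the digit order is an arbitrary choice, not necessarily the order the real program walks through.

```python
# Assumption: 20 questions, five choices each (5**20 candidates).
CHOICES = "ABCDE"
NUM_QUESTIONS = 20

TOTAL = len(CHOICES) ** NUM_QUESTIONS  # 95,367,431,640,625


def index_to_answer(index):
    """Turn an integer in [0, TOTAL) into a 20-letter candidate answer.

    The index is treated as a base-5 number; the last letter is the
    least significant digit (an arbitrary choice for this sketch).
    """
    if not 0 <= index < TOTAL:
        raise ValueError("index out of range")
    letters = []
    for _ in range(NUM_QUESTIONS):
        index, digit = divmod(index, len(CHOICES))
        letters.append(CHOICES[digit])
    return "".join(reversed(letters))


def answer_to_index(answer):
    """Inverse of index_to_answer: 20-letter string back to its integer index."""
    if len(answer) != NUM_QUESTIONS:
        raise ValueError("answer must have exactly NUM_QUESTIONS letters")
    index = 0
    for letter in answer:
        index = index * len(CHOICES) + CHOICES.index(letter)
    return index
```

With a mapping like this, "one answer per processing element" just means handing each element a different integer.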

So, this is the model that we are going to add to this program: the ability to specify a start answer and an end answer for the range to search. This lets us coordinate the work of two different processes by giving them non-overlapping ranges to work on, lets us use multiple processors at once, and aids in testing by letting us specify a single answer to check (this is useful for verifying that we fixed a bug if the program was too eager in announcing that it had found the answer).
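As a rough illustration of the range idea, the Python sketch below splits a span of candidate indices into non-overlapping chunks and searches one chunk at a time. It is not the actual program: split_range, search_range and the is_solution callback are hypothetical names, candidates are represented as plain integers, and the ranges are half-open (the end is excluded), which is my own convention rather than anything fixed by the design above.

```python
def split_range(start, end, workers):
    """Split the half-open range [start, end) into `workers` contiguous,
    non-overlapping chunks whose sizes differ by at most one."""
    if workers < 1 or end < start:
        raise ValueError("need workers >= 1 and start <= end")
    size, extra = divmod(end - start, workers)
    chunks = []
    lo = start
    for i in range(workers):
        # The first `extra` chunks each take one leftover candidate.
        hi = lo + size + (1 if i < extra else 0)
        chunks.append((lo, hi))
        lo = hi
    return chunks


def search_range(start, end, is_solution):
    """Check every candidate index in [start, end) in order.

    Returns the first index that is_solution accepts, or None if the
    whole range is exhausted. is_solution is a stand-in for the real
    answer checker, which is not shown here.
    """
    for index in range(start, end):
        if is_solution(index):
            return index
    return None
```

For two processes, split_range(0, 5**20, 2) gives each one roughly half of the candidates, and each process would be started with its own (start, end) pair. Checking a single answer is just the one-element range search_range(i, i + 1, is_solution).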

Note: there were some technical difficulties with my Internet connection, which caused the delay in posting this.