25 January, 2013

Solving a (formula based) Sudoku puzzle using LINQ

Last weekend (19 January 2013) we went to a geocaching event in Heidelberg, here in South Africa. While we were at the event a new cache was published: a Sudoku puzzle with a twist. No numbers were given.

The puzzle looks like this:

About five of us solved the puzzle in about 30 minutes. Once back at work I showed the puzzle to a colleague (let's call him Bob), and we started chatting about whether it would be possible to write a LINQ query to solve it.

But first I wanted to understand how such a LINQ query works, by simplifying the problem.

So I did the following:
I drew a small 3 by 3 grid and entered some numbers.

I entered "random" numbers from 1 to 9 into each cell. At this stage I am simulating one section of the Sudoku puzzle, so that when I write the LINQ query I can understand it better. It is much simpler to work with a 3x3 grid and understand what is going on than to work with a 9x9 grid.

Once I had the numbers, I made up some formulas of my own as follows:

As it happened, I entered three circular references. This helped me understand how the "from" and "join" clauses work later on (see further below).

I started with the brute force method as per the MSDN blog, and the code looked like this:

The reason b3, a3, and a1 are left in the "from" portion is that each of those values is part of a circular reference. For example:

b2 == c2 - a3
and c2 == a2 * b2 (b2 depends on c2, and c2 depends on b2)
and a2 == a3 - b2 (b2 depends on c2, c2 depends on a2, and a2 depends on b2)

So there is no reliable way for the "join" to know what the value is; therefore it must be left in the "from" section of the code.
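To make the circular reference concrete, here is a minimal sketch that uses only the three formulas quoted above. It is not the query from this post: the class name `CycleSketch`, the choice to guess a3 and b2, and the distinctness check are assumptions made for illustration. Once two cells of the cycle are guessed in "from" clauses, the other two can be computed by joins, and the remaining formula becomes a "where" check that closes the loop.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CycleSketch
{
    // Candidate values for every cell of the 3x3 grid.
    private static readonly int[] Digits = Enumerable.Range(1, 9).ToArray();

    // Returns every (a2, b2, c2, a3) that satisfies the three quoted formulas.
    public static List<(int a2, int b2, int c2, int a3)> Solve()
    {
        var query =
            from a3 in Digits                      // guessed
            from b2 in Digits                      // guessed: this breaks the cycle
            join a2 in Digits on a3 - b2 equals a2 // a2 == a3 - b2
            join c2 in Digits on a2 * b2 equals c2 // c2 == a2 * b2
            where b2 == c2 - a3                    // the formula that closes the cycle
            // Assumption: the four cells must hold different digits.
            where new[] { a2, b2, c2, a3 }.Distinct().Count() == 4
            select (a2, b2, c2, a3);
        return query.ToList();
    }

    public static void Main()
    {
        foreach (var s in Solve())
            Console.WriteLine($"a2={s.a2} b2={s.b2} c2={s.c2} a3={s.a3}");
    }
}
```

Which cells end up in the "from" section depends on the full formula sheet, so a different sheet may call for different guesses than the two chosen here.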

The ordering of the joins.
When a join is added, every value used in its formula must already be defined earlier in the query.
Since c3 == b3 - b2, the values b3 and b2 must both be defined before the c3 join.
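The ordering rule can be sketched with the c3 formula alone. This is an illustrative fragment, not the post's query: `JoinOrderSketch` is a made-up name, and b2 and b3 are simply guessed here because their own formulas are not reproduced in this post. The left-hand side of "on ... equals" may only use range variables introduced earlier in the query, so moving the join above `from b3` would not compile.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class JoinOrderSketch
{
    private static readonly int[] Digits = Enumerable.Range(1, 9).ToArray();

    // Returns every (b2, b3, c3) with c3 == b3 - b2 and three different digits.
    public static List<(int b2, int b3, int c3)> Solve()
    {
        var query =
            from b2 in Digits
            from b3 in Digits
            // b3 and b2 are both in scope here, so the formula can be the join key.
            join c3 in Digits on b3 - b2 equals c3
            // Assumption: cells in the same grid hold different digits.
            where b2 != b3 && b2 != c3 && b3 != c3
            select (b2, b3, c3);
        return query.ToList();
    }

    public static void Main()
    {
        foreach (var s in Solve().Take(5))
            Console.WriteLine($"b2={s.b2} b3={s.b3} c3={s.c3}");
    }
}
```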

Getting this ordering right proved to be a four-hour refactoring job once I got to the 9x9 grid.

Once I had this in place and the tests running, I knew how to solve the 9x9 grid.

While I was testing the proof of concept on the 3x3 grid, Bob got going with the 9x9 grid using brute force (the "where clause" method).

Conclusion:
1) LINQ can be used to solve different types of puzzles.
2) Using joins greatly reduces the time needed to find the answer.
3) Formulas in the sheet which refer back to themselves cannot be added to a join, and can only be used in the "from" section. This means the LINQ query must *guess* these values. The fewer guesses there are, the faster the answer will be calculated.
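A rough way to see why fewer guesses help is to count how many candidate combinations each style produces before the final check. The sketch below does this for the four cells of the cycle discussed earlier (a2, b2, c2, a3). It is not the query from this post: `GuessCountSketch` and its method names are hypothetical, and the distinctness test is an assumption.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class GuessCountSketch
{
    private static readonly int[] Digits = Enumerable.Range(1, 9).ToArray();

    // True when all three quoted formulas hold and the four cells differ.
    public static bool IsSolution(int a2, int b2, int c2, int a3) =>
        a2 == a3 - b2 && c2 == a2 * b2 && b2 == c2 - a3 &&
        new[] { a2, b2, c2, a3 }.Distinct().Count() == 4;

    // "where clause" style: all four cells are guessed.
    public static IEnumerable<(int a2, int b2, int c2, int a3)> BruteForceCandidates() =>
        from a2 in Digits
        from b2 in Digits
        from c2 in Digits
        from a3 in Digits
        select (a2, b2, c2, a3);

    // Join style: only a3 and b2 are guessed; a2 and c2 are looked up by key.
    public static IEnumerable<(int a2, int b2, int c2, int a3)> JoinCandidates() =>
        from a3 in Digits
        from b2 in Digits
        join a2 in Digits on a3 - b2 equals a2
        join c2 in Digits on a2 * b2 equals c2
        select (a2, b2, c2, a3);

    public static void Main()
    {
        var brute = BruteForceCandidates().ToList();
        var joined = JoinCandidates().ToList();
        Console.WriteLine($"brute force: {brute.Count} candidates, " +
            $"{brute.Count(c => IsSolution(c.a2, c.b2, c.c2, c.a3))} solution(s)");
        Console.WriteLine($"with joins:  {joined.Count} candidates, " +
            $"{joined.Count(c => IsSolution(c.a2, c.b2, c.c2, c.a3))} solution(s)");
    }
}
```

On this sub-problem the brute-force version has to check 9 x 9 x 9 x 9 = 6561 combinations, while the join version starts from only 81 guessed pairs and discards most of them as soon as a computed value falls outside 1 to 9. Both find the same solution.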

Todo:
1) The LINQ query above only solves this one single Sudoku puzzle. It would be interesting to come up with a generic solution that would work for any Sudoku puzzle of this type.
2) It would be interesting to see if I can get to a general Sudoku LINQ solver for a normal Sudoku puzzle.
3) Better understand *how* the joins speed up the calculation.