Sudoku Research

How many valid Sudoku solutions are there? How many puzzles can be made that are unique?

I wanted to find out. I know I can Google it but I want to understand how to calculate this number so I embarked on a research project on my own.

A ‘valid’ sudoku solution is a 9 row by 9 column board with exactly one of each digit, 1 through 9, in each row, column and 3 x 3 square.

Getting started

I used my knowledge of programming to write a program to help me compute this. However, although the program was fairly simple to write, this was not as straight-forward as I first envisioned it to be.

There are 81 cells in a Sudoku puzzle (9 rows x 9 columns). You can take the 81 cells and “flatten” them by setting them all in the same row, to work with the numbers easier. It is relatively simple to set an 81-digit number to all 1’s and then count in a program from 111….111 to 999…999 and then look at each of those numbers to see if they are valid sudoku solutions or not.

Although the program is pretty easy to write, I soon realized it is a monumental task for a program to count that high. Let me explain. There are 981 or 1.966 x 1077 possible combinations of unique numbers in 81 digits. Counting through 81 digits gives you 9 to 81st power (981) unique numbers, or

Note – the calculator I used to compute the above result only displays the first 32 digits, so all of the zeros past the first 32 digits were added to illustrate just how big this number really is. It isn’t the correct number but it is the correct number of digits.

This is getting tricky…

The problem with so many numbers is that it would probably take years, using a high-speed computer, just to count them, let alone look at each combination of numbers and check their validity. Counting through only nine cells takes about three seconds on my computer. Each time just one more cell is added the number of possibilities multiplies by nine. Therefore each time a cell is added the processing time multiplies by nine. By the time you get to 81 cells it is a HUGE number. And it therefore takes a HUGE amount of time to execute.

Hmmm...

Let’s try something else

I thought more about this and decided maybe there is another way to look at it. I could start with one cell and see how many possibilities and valid values there are. Then go to two cells to do the same thing, increasing the cell count by one each time. Maybe I can find a pattern and then come up with a formula. Below are the results.

With only one cell there is only one possibility and it is valid (‘valid’ means there is only one of each digit, there are no repeating digits).

1

With two cells there are four possibilities and two of the four are valid. The valid values are in bold.

Note how the # of unique numbers AND the # of valid solutions are rapidly getting larger as we add cells.

With four cells we have 256 unique numbers and 24 valid numbers, more than I want to put in a list. You get the idea – don’t want to write all 256 numbers here.

I ran my program and checked cell counts from one to three and verified it gave me the same result as I showed above. Then I ran it with cell counts from four to nine. Let’s look at the results from one to nine.

Number of cells

Valid numbers (n!)

Possibilities (n^n)

1

1

1

2

2

4

3

6

27

4

24

256

5

120

3,125

6

720

46,656

7

5,040

823,543

8

40,320

16,777,216

9

362,880

387,420,489

Remember that the # of ‘valid' numbers is the total count of numbers that contain only one of each digit. Possibilities is the total count of possible numbers for that many cells. For example, if you have nine cells you can have nine digits which gives you 387,420,489 total unique numbers and 362,880 numbers that only contain one of each digit.

Now we see a pattern. The # of valid numbers is n! (n factorial). The number of possibilities (unique numbers) for a given # of cells is nn. However, after nine cells we can’t use n! for # of valid numbers because with 10 or more cells there are still only nine unique digits.

Hmm… still too many possibilities to try

As I thought about this I realized there might be a way to eliminate many of the invalid numbers. We know that 111…111 (all ones), 222…222 (all twos), through 999…999 are invalid. We also know that any repeated digit in any row, column or square – like 1123456789 – invalidates that number.

One way to speed up the process is to make a list of all 362,880 valid numbers with nine cells, repeat that nine times, and go through all valid values. This is a way of eliminating invalid values that we know we don’t need to test.

Doing that would eliminate bunches of invalid numbers, but we still have lots to check. Let’s see, 362,8809 = 1.0911x1050. That’s a lot fewer than all possibilities 1.96627050x1077.

The good news is we just removed 27 digits from that huge number we started with. The bad news is we still have a number with 50 digits, still too big to compute within a practical time period.

Uncle! Uncle!

Maybe this isn’t as easy as I thought… a look at Google and I see that even the mathematicians are struggling with this. But there is an answer. Bertram Felgenhauer and Frazer Jarvis have come up with 6,670,903,752,021,072,936,960 valid sudoku solutions.
And I thought this would be easy to calculate.

I will continue my research - my next step is to list all 362,880 valid numbers and work with them as mentioned above. I'll update this when I have done that work.