How did the computer do that?

Quick answer:

Compile using gcc -o spellophone *.c on unix/linux. A competent C
programmaer can adapt it to run on Windows in about 5 minutes.

Long answer:

Computers are wonderful things. We take for granted that they can do stuff
like the above in microseconds, but to non-computer scientists it seems
like magic. If you've ever wondered what computer scientists do all day,
this will help.

I developed the algorithm for spellophone over Christmas break 2001.
It wasn't easy -- I went through three different drafts until I found
the right one. Would you believe that the first one took over ten minutes
to run on a ten-digit number on my old Pentium 166?

The first algorithm I tried is known as the "brute force" approach. I
had the computer go through every possible combination of letters and
check it for words. I thought I had a clever way of doing that because
I used a trie to store all of the words in the dictionary.
A trie is like a very smart parrot. You can shout letters at it, and it
will squawk if anything you say forms a word. For example, if you say
"D-O-G" at it, it will squawk because DOG is a word. If you then say
"M-A" it will squawk again because DOGMA is also a word.

Despite the clever use of the trie, the program still ran too slowly.
Can you guess why? Think about how many possible letters are in a
telephone number.

If you look at a phone keypad, you'll see that each digit has three letters
on it. Some of them have four on newer phones. So you can only make
three one-letter words with one digit. But if you add a second digit, say
"22", you can spell "AA, AB, AC, BA, BB, BC, CA, CB, CC", or 9 words.
That's quite a bit more than one! It turns out that if you add 10 digits,
there are 1049760 possible combinations of letters. And for each possible
combination that spells actual words, there are lots of different ways
those words can be placed together between dashes. So it turns out that
the computer might have to go through millions of combinations, trying
to pick out the ones that make sense.
Unfortunately, even today's computers can't process all of those words
fast enough. So I had to find a better way.

Dynamic Programming

I looked at what the computer was doing, and it seemed to me that it
was wasting a lot of its time asking the trie about the same words
over and over again. For example, if the last three digits in phone number
didn't actually spell anything, it would still check them thousands of
times in combination with the first part of the phone number.

Then something struck me. Last year, I had learned of something that
was meant to deal with just this type of problem in one of my
computer science courses. It's called "Dynamic Programming", or DP for
short.
Dynamic Programming is a way of programming that solves problems
a little bit at a time. It's best for problems where each piece of the
the problem is built on the previous one, so once you solve all of these
little pieces you can put them together and solve the entire puzzle.
If I could figure out a way to make DP work, then the computer would
only need to check each word once, and the program might work in seconds
instead of minutes!

I racked my brain, trying to remember what Professor Chan had said in his
thick accent. With DP, you have to make a grid of squares, and each
square represents a small part of the solution. After a lot of thinking,
I drew a grid on paper. Across the top, I wrote Starting Position,
and along the side, I wrote Length of word. Each square would
contain all of the words that you could spell if you started on a certain
digit and used a certain number of letters.

My hands trembled as I stepped through the algorithm on paper. I used the
phone number "78225" because I knew it spelled "QUACK." (I used to work
for a company called Quack.com and that was part of their number). Here's
what I came up with. The partial words are in normal printing, and the
finished words in each square are in bold:

Starting digit

Length

7 (PQRS)

8 (TUV)

2 (ABC)

2 (ABC)

5 (JKL)

P, Q, R, S

T, U, V

A, B, C

A, B, C

J, K, L

PU, QU, RU, ST, SU

TA, UB, VA

AA, AB, AC, BA, CA

AL, BL, CL

.

PUB, PUC, QUA, RUB, STA, SUC

TAB, TAC, VAC

ACK, BAL, CAJ, CAK, CAL

.

.

PUBA, QUAC, RUBA, RUBB, STAB, STAC, SUCC

TACK

.

.

.

QUACK, STABL, STACK

.

.

.

.

What is good about this is the table could be calculated very quickly.
Each square builds on the data that was already processed in the square
above.
I had done it in five minutes on paper -- the computer could do it
in the blink of an eye. Also, it is pretty easy to string all of the
words together so that the longest ones appear first.

Dynamic Programming always involves two steps -- first creating the
solution table and then analyzing it to piece the solution together.
The way I chose to piece the solution together is pretty simple:

Begin at the leftmost column that you haven't worked on yet.

Get a word from the bottom-most square.

Now put that word in your phone number, and go to the digits
that are now left-over at the end. Start at step 1.

Once you run out of words, try the next one from the square you
chose in step 2 before you went off to the left-over digits. If there
are no more words left in the square, continue up-wards. Repeat
step 3.

Once you have exhausted the left-most column, get rid of it,
add the number instead of a letter, and start over until there
are no more columns left in the table.

It's harder to explain than to do. Using the above data, you get
the following words in this order:
QUACK, STACK, STAB-5, PUBA-5, PUB-25, 7-TACK, 78-AB-5

Other work

There are quite a few web sites that do this kind of thing. Phonespell.org has a good description of how their algorithm works, in the F.A.Q. section.

dialabc.com is the nicest web site that I have seen to search for words in phone numbers.

i have one doubt .how to check the particular digits of the phone number using java script.for example number starts only 9367xxxxxx otherwise it returns false statement .how did i set the condition plz reply

In 2003 I used to have my phone number on my web page. "Only women are afraid to have their phone number on their web page!" I thought. How naive I was.

Then, out of the blue, I got a phone call frome someone in, I don't know, Alamaba? This guy woke me up at like, 10 in the morning (at this time I was a jobless bum and slept all day). He wanted to know how to adapt it to run on Windows.

I would have been happy to tell him, except that I only had a cell phone, and in Canada I was being charged for long distance incoming calls. So I just gave him some vague answers about makefiles and /usr/dict/words and quickly got this guy off the line.

So the motto is, don't post your phone number on your web page: otherwise, weird people call you and they might wake you up.

Hey! Thanks for the program. Yers ago, we had one hat someone from MIT wrote as a hack that did a similar thing to yours, and also went the other way - you told it things you wanted to spell with your phone nymber, and it told you what the number sequence was (easy enough to do in your head, but tedious) and then based on it's interpretation of what you were trying to express, it would offer several close alternatives, so you had higher hopes of finding a specific number that was actually available. Do you know of any such beast? (Ut was an Athena-unix op sys app asI remember), Thanks much!

I just spent several hours debugging clipboard copy of a DIB image. I could copy from my application, and paste into Paint. I could paste into Word. But if I pasted into WordPad, nothing showed up. If I pasted into GIMP, it crashed.

Back in 2007, I created a rhyming engine based on the public domain Moby pronouncing dictionary. It simply reads the dictionary and looks for rhyming words by comparing the suffix of the words' pronunciations. Since that time, I have made some improvements.