Lisp as an Alternative to Java

In the October 1999 Communications of the ACMLutz Prechelt
had an interesting article entitled Comparing
Java vs. C/C++ Efficiency Issues to Interpersonal Issues which
asked 38 programmers to implement versions of a program in C, C++, or
Java. The conclusions showed that Java was 3 or 4 times slower than C
or C++, but that the variance between programmers was larger than the
variance between languages, suggesting that one might want to spend
more time on training programmers rather than arguing over language
choice. (Or, suggesting that you should hire the good programmers and
avoid the bad ones.) The variance for Java was lower than for C or
C++. (Cynics could say that Java forces you to write uniformly slow
programs.) I applaud this line of work, and hope that more studies of
this kind will be done.

It turns out my hopes were answered. First, Prechelt published another
article that covers Tcl, Python, Perl, and Rexx. Also,
Ron Garret (nee Erann Gat) did
a
follow-up study in which he asked programmers to write Prechelt's
test
program in Lisp. His results show that the resulting Lisp
programs ran faster on average than C, C++ or Java programs (although
the fastest Lisp program was not as fast as the fastest C program),
and that Lisp programs took less development time than the other
languages.

I did not participate in the study, but after I saw it, I wrote my
version in Lisp. It took me about 2 hours (compared to a range of 2
to 8.5 hours for the other Lisp programmers in the study, 3 to 25 for
C/C++ and 4 to 63 for Java) and I ended up with 45 non-comment
non-blank lines (compared with a range of 51 to 182 for Lisp, and 107
to 614 for the other languages). (That means that some Java programmer
was spending 13 lines and 84 minutes to provide the functionality of
each line of my Lisp program.)

Here is my program:

;; Peter Norvig - Programming Challange from Erann Gat:
;; http://www.flownet.com/ron/papers/lisp-java/
;; Given a list of words and a list of phone numbers, find all the ways that
;; each phone number can be expressed as a list of words.
;; Run: (main "word-list-file-name" "phone-number-file-name")
(defvar *dict* nil
"A hash table mapping a phone number (integer) to a list of words from the
input dictionary that produce that number.")
(defun main (&optional (dict "dict") (nums "nums") (dict-size 100))
"Read the input file ĘDICT and load it into *dict*. Then for each line in
NUMS, print all the translations of the number into a sequence of words,
according to the rules of translation."
(setf *dict* (load-dictionary dict dict-size))
(with-open-file (in nums)
(loop for num = (read-line in nil) while num do
(print-translations num (remove-if-not #'digit-char-p num)))))
(defun print-translations (num digits &optional (start 0) (words nil))
"Print each possible translation of NUM into a string of words. DIGITS
must be WORD with non-digits removed. On recursive calls, START is the
position in DIGITS at which to look for the next word, and WORDS is the list
of words found for (subseq DIGITS 0 START). So if START gets to the end of
DIGITS, then we have a solution in WORDS. Otherwise, for every prefix of
DIGITS, look in the dictionary for word(s) that map to the value of the
prefix (computed incrementally as N), and for each such word try to extend
the solution with a recursive call. There are two complications: (1) the
rules say that in addition to dictionary words, you can use a single
digit in the output, but not two digits in a row. Also (and this seems
silly) you can't have a digit in a place where any word could appear.
I handle this with the variable FOUND-WORD; if it is false after the loop,
and the most recent word is not a digit, try a recursive call that pushes a
digit. (2) The other complication is that the obvious way of mapping
strings to integers would map R to 2 and ER to 02, which of course is
the same integer as 2. Therefore we prepend a 1 to every number, and R
becomes 12 and ER becomes 102."
(if (>= start (length digits))
(format t "~a:~{ ~a~}~%" num (reverse words))
(let ((found-word nil)
(n 1)) ; leading zero problem
(loop for i from start below (length digits) do
(setf n (+ (* 10 n) (nth-digit digits i)))
(loop for word in (gethash n *dict*) do
(setf found-word t)
(print-translations num digits (+ 1 i) (cons word words))))
(when (and (not found-word) (not (numberp (first words))))
(print-translations num digits (+ start 1)
(cons (nth-digit digits start) words))))))
(defun load-dictionary (file size)
"Create a hashtable from the file of words (one per line). Takes a hint
for the initial hashtable size. Each key is the phone number for a word;
each value is a list of words with that phone number."
(let ((table (make-hash-table :test #'eql :size size)))
(with-open-file (in file)
(loop for word = (read-line in nil) while word do
(push word (gethash (word->number word) table))))
table))
(defun word->number (word)
"Translate a word (string) into a phone number, according to the rules."
(let ((n 1)) ; leading zero problem
(loop for i from 0 below (length word)
for ch = (char word i) do
(when (alpha-char-p ch) (setf n (+ (* 10 n) (char->digit ch)))))
n))
(defun nth-digit (digits i)
"The i-th element of a character string of digits, as an integer 0 to 9."
(- (char-code (char digits i)) #.(char-code #\0)))
(defun char->digit (ch)
"Convert a character to a digit according to the phone number rules."
(ecase (char-downcase ch)
((#\e) 0)
((#\j #\n #\q) 1)
((#\r #\w #\x) 2)
((#\d #\s #\y) 3)
((#\f #\t) 4)
((#\a #\m) 5)
((#\c #\i #\v) 6)
((#\b #\k #\u) 7)
((#\l #\o #\p) 8)
((#\g #\h #\z) 9)))

Two notes: (1) At first, I thought I had beaten the other Lisp
programmers (and everyone else) by finishing in 1.5 hours, but then I
realized that I had cheated: the directions said to code as you would
professionally, and that means I should comment my work. It took 30 minutes
more to put in the comments and clean up a little. (2) I was
lucky in that when I was almost done, I realized that I had a bug: I
couldn't distinguish between the number 123 and 0123. I was about to
change numbers to strings (that is, 123 to "123" and 0123 to "0123"),
which would have taken 15 minutes or so, when I realized that I could
fix the bug with a one-character change: change the initialization of
n from 0 to 1, so that 123 is represented as 1123 and 0123 as
10123. With that fix, everything seemed to work fine.