On the quality of available Prolog implementations

I am grateful to the editor, Pat Hill, who gave me the opportunity of
reading a preliminary version of the interesting article by Paulo
Moura. Reading it revived my feelings of
dissatisfaction with all the Prolog implementations I have tried. But,
before continuing: to avoid misunderstandings, I feel compelled to
make clear my reasons for contributing this article with the following
disclaimer:

I am a definite supporter of logic programming in general
and of Prolog in particular. It is precisely because of this
that I am extremely distressed by what are, I believe,
serious deficiencies in current Prolog implementations.
By exposing some of these deficiencies, it is hoped that
this note might actually help improve things.

I would like to see our community able to stand up to other
language communities. However, I feel that we would probably succumb
in any serious discussion about the relative advantages of our
programming environments (which include languages, implementations,
and tools) with respect to others, even in those application
fields for which Prolog is an excellent candidate. How sad. We must do
something about it.

One of the discoveries I made thanks to Paulo's article is that there
are at least 22 Prolog compilers around. This is an amazing
number even though some of these systems are test beds for academic
research. To me this sounds like an indication that the
Prolog community suffers from an excessive degree of fragmentation
which, in my opinion, is not a good thing. Given the differences in
the languages actually implemented by those systems, as pointed out by
Paulo, one of the results is that we have many different communities
that cannot share code easily. I suspect also that a significant
number of these communities are really small, if not outright marginal.

What are the reasons for this fragmentation? Lack of standardization
has certainly played an important role. Weakness of standardization
is also an issue: when a standard leaves too many things unspecified or
"implementation-defined", this is an open door for the birth of
several incompatible systems.
Another phenomenon I have observed quite frequently in our field is
the rush with which implementations go commercial. No matter how
immature and buggy a new Prolog implementation might be, chances are
that the developers will try to sell it. Or they will not distribute
the source code. Or both. Usually this results in yet another
incompatible Prolog system with its own bells and whistles, with only
a few users and with a lifetime of one to two years. Whether the
developers do actually make any significant money out of it is
something I ignore, but I doubt. I also doubt that going commercial
makes any positive contribution to quality. Forbidding access to the
sources certainly does not help. Maybe the lack of standardization
and the rush to go commercial constitute a vicious circle: in the absence
of a strong standard I could always come up with my own Prolog system
offering a peculiar feature that nobody else offers (without caring so
much about being compatible with the standard features). To profit from
my commercial implementation I could still use the old, short-sighted,
and wrong strategy of placing obstacles in the way of the emergence of a
standard, since the standard could make me lose part of my little user
base. The technique of tying users to a product by means of
incompatibility is a really old one. Other communities have learned
how to avoid this trap a long time ago.

I don't even know the names of all the 22 Prolog implementations
mentioned by Paulo. I started using Prolog in 1984 with C-Prolog,
and since then I have probably tried a dozen or so implementations. I
must say that I am dismayed at how little things have changed since
1984. While the operating systems and, say, the C development
environments I was using at that time were pathetically limited
compared to what software technology is able to offer today, the
C-Prolog of 1984 could still hold its place among the Prolog systems available in
1999. It came with the usual interactive debugger and a couple of
standard tools, like XREF, that came in handy from time to time. A
very poor development environment indeed, but not that far from the
kind of environment you get with most of today's Prolog systems. Let
us consider a trivial example: command line editing and history. It
is only recently that some Prolog systems (like SWI-Prolog and
B-Prolog) offer command line editing and history. Most systems
(including SICStus Prolog, which is considered by many to be one of the best
systems available) do not offer this facility: if you make a typing
mistake you have to retype everything; if you want to issue a similar
query again you have to retype everything. Isn't that striking in
1999?

In my opinion, the most important kinds of deficiencies in today's
Prolog implementations are:

bad documentation;

bad diagnostics;

very old and inadequate implementation technology.

Let us start with the problem of documentation. I am particularly
sensitive to this problem because I work on program analysis. And in
order to develop a correct analyzer I need a correct and precise
description of the language being analyzed. This turned out to be
very difficult if not impossible to obtain. A few months ago I posted
a request for clarifications about the SICStus module system (reading
the relevant chapter of the manual should suffice for anyone to
understand why I was in need of explanations). I have received
several answers (including, several weeks later, the one from the SICStus
officials). All the answers dealt with the problem of how to use or
take advantage of the module system. None was an answer to my
problem, that is, understanding the exact semantics of the system.
A colleague who is in the field of Prolog programming and Prolog
implementation tried to answer my questions and said: "I see your
problem - I have always avoided trying to know the exact semantics,
because I suspected it to be changing every release. Very unfortunate
of course."

With all the work we have done in the field of semantics for logic
programming (several hundred papers), it is very sad to find ourselves
in such a state. I mean, we are able to give a formal specification
for a module system or for a built-in, aren't we?

What we have seen above is just an example of ambiguous documentation.
However, it is quite often the case that seemingly unambiguous
documentation does not match the system at hand. Take your favorite
Prolog implementation and, after reading the documentation, try the
following queries:

":- [a|b]."

":- [[[a|b],c]|d]."

Use the above lists as the list of operator names in op/3.
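
For concreteness, the second part of the exercise amounts to queries of
the following shape (700 and xfx are arbitrary choices of priority and
type); according to ISO, an argument that is not an atom or a proper
list of atoms should be rejected with a clean type or instantiation
error, and how gracefully your system does this is precisely the point:

```prolog
% The improper lists above, passed where op/3 expects
% an atom or a list of atoms:
?- op(700, xfx, [a|b]).
?- op(700, xfx, [[[a|b],c]|d]).
```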

You may be surprised by the result. Very recently I have encountered
yet another case of non-adherence of the system to its
documentation. I was semi-automatically testing the (partial)
correctness of my descriptions for the SICStus built-ins, when the
rudimentary tool I had built for that purpose told me that the
description I used for portray_clause/1 was wrong.
I quickly rushed to
page 83 of the SICStus Prolog User's Manual (version 3.7.1) and there
I found:

portray_clause(+Clause)
portray_clause(+Stream,+Clause)
Writes the clause Clause onto Stream exactly as listing/(0-1) would
have written it. Same as write_term([Stream,] Term,
[quoted(true),numbervars(true),indented(true)]) followed by a period
and a newline, removing redundant module prefixes and binding variables
to terms of the form '$VAR'(N) yielding friendlier variable names.

Then I went to page 5 of the manual where I verified that +Clause
means that Clause "should be instantiated to a non-variable term."
But this was in perfect agreement with my description of that predicate.
So, as usual, I resorted to manual experimentation.

What the experiments showed is that it is not true that, for
portray_clause/1 to succeed, the argument must be a non-variable; and
it is not true that portray_clause/1 is equivalent to write_term/2
with the indicated list of options.
Moreover, that call of write_term/2 dies badly with an overly obscure
error message. This brings us to the topic of bad or non-existent diagnostics.
What we have just seen is an error in the implementation
of write_term/2, which uses functor/3 without first
checking the obvious applicability conditions. But this case is easy:
we are evaluating a built-in with respect to an empty program.
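
For the record, the applicability conditions in question are the ones
ISO prescribes for functor/3 (a minimal sketch, not the SICStus
source): when the first argument is unbound, the second and third must
be sufficiently instantiated, otherwise an instantiation error is
mandated.

```prolog
% Decomposition: first argument bound.
?- functor(foo(a, b), Name, Arity).   % Name = foo, Arity = 2

% Construction: first argument unbound, the other two bound.
?- functor(T, foo, 2).                % T = foo(_, _)

% Neither: first and second arguments both unbound; ISO mandates
% an instantiation error.  A careful implementation of
% write_term/2 would check this before calling.
?- functor(T, Name, 1).
```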

Now imagine testing a big application: dozens of source
files, hundreds of predicates defined and maybe fifty calls to
functor/3. One of these calls, unfortunately, goes wrong. Well, you
will get a message like

{INSTANTIATION ERROR: functor(_29,_30,1) - arg 2}

and nothing else. You won't get any help in locating which call of
functor/3 went wrong: neither the file or module, nor the predicate,
nor the clause, let alone the position within the clause. In other
words, run-time errors come without any indication about which clause
was executed when the erroneous situation occurred. And this even if
you are executing interpreted code. This is astonishing. Especially
if you consider that people programming in C (or C++, or Fortran, or
Java, or Pascal, or even assembly for that matter) can download for
free a programming environment with which they can know exactly which
line of code in which file caused, say, a segmentation fault. And they
can obtain a complete stack trace which makes it clear who called
whom and with which values.

I must agree with Paulo when he says that often commercial providers
try to sell bugs for features. And they react badly when they are told
that their documentation is full of incongruities and does not
faithfully describe the actual behavior of the system. As a result I
no longer report bugs to them. I remember with pleasure when I
reported a bug in the debugger of a major Prolog system: I produced a
test-case whereby the program behaved differently depending on whether
it was executed with tracing on or off. After two or three messages
where the providers tried to divert attention from the real
problem, the final answer came. It consisted of the sentence

"The debugger does not have the property you are requesting."

Well, I was
simply asking that the debugger not be intrusive, which should be
number 1 in the list of features any debugger must have. Moreover,
that feature was provided in free debuggers for C (where achieving
this is a bit more difficult just because anything can get access to
anything) 10 years ago. Isn't that gross?

The last problem I want to address in this note is the lack of high
quality tools for Prolog programming. Let us focus on the most basic
component of a programming environment: the compiler. I don't know why
but we are basically using compiler technology that is 15 years old.
Not only does it seem that data-flow analysis is never used, but even
classical optimizations, like common subexpression elimination and
inlining, appear to be disregarded. A couple of years ago I made some
tests whose outcome was that I had either to give up first-argument
indexing or to perform unnecessary unifications. The test-case
was something like this.
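
Reconstructed from the description in the postscript below (a sketch
under that assumption, not the author's original code), the choice was
between these two shapes:

```prolog
% Variant 1: first-argument indexing on f/1 vs. g/1, but the
% output unification with s1(X) or s2(X) is attempted during
% head unification, before the clause has committed.
p(f(X), s1(X)).
p(g(X), s2(X)).

% Variant 2: output unifications delayed until after the cut,
% at the price (at the time) of losing first-argument indexing.
p2(A, B) :- A = f(X), !, B = s1(X).
p2(A, B) :- A = g(X), !, B = s2(X).
```

Variant 1 wastes the output unification on clauses that are not the
committed one; variant 2 follows O'Keefe's advice but, with a purely
macro-based compiler, loses the indexing.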

I don't know the situation with this today (*).
What I do know is that for
C it has been ages since it was advantageous to write '++i' (translated into
an INC instruction) instead of 'i = i+1' (Jurassic compilers
translated this into a generally slower ADD instruction). In
contrast, Richard O'Keefe's adage "put output unifications after the
cut" still makes much sense today. So, as far as I can tell, the code
quality of most Prolog compilers is not very good (which is not
surprising since they are mostly macro-based code generators, rather
than optimizing compilers). The other thing I tend to expect from a
compiler is good diagnostics, but the detection of singleton variables
is all that the Prolog compilers I know of have to offer. In other words, I
will know only at run-time whether I have misspelled the name of a
non-dynamic predicate. It is lucky that I no longer develop critical
applications.

I could write a book on this subject, but will stop here. The
intended audience for these notes knows very well what I am talking
about and can guess with good precision what can be added. I am also
worried about the fact that what I have written can hurt somebody. I
am upset by the absurd situation in which we have put ourselves. I
have observed the evolution of Prolog and Prolog environments for 15
years now. In silence. And this is my fault. Now I feel it is
time to look at the situation lucidly and without pity. And the
situation is that advocacy of Prolog programming is no longer
possible, at least for me. Arguments that used to work are no longer
valid. Try saying to a C++ programmer "with Prolog you don't have to
worry about pointers, memory allocation, and garbage collection." He
will answer that he isn't worried about that either, because he can
design his classes to achieve the same results. He has access to a
variety of generic garbage collectors that do not interfere with his
program and still has all the tools he can dream of. If he then
discovers that we don't have the slightest idea about which call of
functor/3 went wrong, he will laugh at you, and rightly so. And what
about the "simple semantics" argument? I wouldn't dare use it
today. I am too scared of the possibility that the other guy will say:

"Yeah, right. And what do you do with that simple semantics?"

In summary, there is more than enough for us all to wear a brown
paper bag for a year or so. However, this is not a very productive
thing to do. It would be much better to finally profit from the
strong points of Prolog without continuing to damage ourselves through
fragmentation, poor software engineering practices, and so forth.
I am sure users, researchers, and providers know exactly what
I am talking about. The "simple semantics" can really be profited
from, and "Programming in Logic" should no longer be an excuse for having
fewer tools at our disposal than those "Programming in Assembly".

In conclusion, I would like to add something to the discussion on whether
the ISO Prolog standard has been taken seriously so far. The following
is taken (without removing any relevant context) from the official pages
of EGCS (the Experimental GNU Compiler System,
http://gcc.gnu.org):

Now that there is an agreed ISO/ANSI standard for C++, the compiler
has a definitive document to adhere to. Earlier versions might have
accepted source code that is no longer C++. This means that code which
might have `worked' in a previous version, is now rejected. You should
update your code to be C++.

Straight, timely, and unambiguous: that is what I call taking a standard
seriously.

A few hours before the final version of this article was handed in, a
message from Mats Carlsson (answering a message from Robert
P. Goldman) clarified that the example I gave for the choice between
first argument indexing and avoiding useless unifications does not
apply to today's SICStus. According to Mats, "If all arguments are
distinct variables A, B, C, ..., and the
first goal is "A = Term", indexing will be done on (the
principal functor of) Term." As far as I can tell, this
means that if at least one argument is not a variable, or if two
arguments are bound to the same variable, or the first two goals are,
e.g., "true" and "A = Term", then indexing will
not occur. In other words, non-optimized macro-based code
generation is what is done here.
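
Assuming Mats' rule as quoted, the boundary between the indexed and
non-indexed cases can be sketched as follows:

```prolog
% Indexed: the head arguments are distinct variables and the
% first goal is "A = Term", so indexing is done on the
% principal functor of Term.
col(A, B) :- A = red,   B = stop.
col(A, B) :- A = green, B = go.

% Not indexed: the first goal is "true", not "A = Term".
col2(A, B) :- true, A = red, B = stop.

% Not indexed: the head arguments are not all distinct variables.
col3(A, A) :- A = red.
```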