Lots of interesting threads to respond to. I was going to write more
tonight, but instead I spent over three hours talking with David
McCusker on the phone. It was well worth it, as he was able to give
me some new ideas about trees in Fitz, and I explained to him my
trick for representing arbitrary trees in a btree-like structure.

Garbage collection

mwh wonders about terminology. I guess there are people who
consider reference counting to be a form of garbage collection, but I
tend to think of them as two different approaches to the same problem.
To me, a garbage collector is something that can collect garbage even
when it has cyclic references. I guess it all depends on who is master
:)

By this criterion, Python 2's "cycle detector" is a real garbage
collector, but there is also reference counting. When a reference
count goes to zero, a structure can be freed immediately, which is not
true of "pure" garbage collected runtimes. In fact, if you're careful
not to create circular references, you might never need to invoke the
cycle detector.

Essentially, the hack in Python is to use reference counts to
determine the roots. In any garbage collected runtime, finding the
roots is a challenging problem. The problem is even harder when code
is in C, because a lot of the time, you'll want references stored on
the stack to be considered roots. Python counts the references into
each object from other tracked objects; if that total falls short of
the object's reference count, the object must also be referenced from
outside, so it's treated as a root.
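
To make that concrete, here's a much-simplified Python sketch of the
root-finding trick as I understand it; the data structures are invented
for illustration (the real collector works on C structs, not dicts).

def find_garbage(objects, refcount, children):
    # objects: the tracked containers; refcount[o]: o's true reference
    # count; children[o]: the tracked objects that o refers to
    internal = {}
    for o in objects:
        internal[o] = 0
    for o in objects:
        for child in children[o]:
            internal[child] += 1
    # anything referenced more times than we can account for internally
    # must be reachable from outside: treat it as a root
    roots = [o for o in objects if refcount[o] > internal[o]]
    reachable, stack = set(), list(roots)
    while stack:
        o = stack.pop()
        if o not in reachable:
            reachable.add(o)
            stack.extend(children[o])
    return [o for o in objects if o not in reachable]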

I haven't done benchmarking, but my intuition tells me that Python's
approach is one of the poorest performing. For one, you have 4 bytes
of overhead per object to store the reference count. For two, as mwh
points out, just forgetting a reference pulls the object into the
cache. For three, bumping reference counts causes memory traffic. In
many cases, you'll get write traffic to main memory even when you're
just doing a read-only traversal. For four, I'm sure that Python's
cycle detector is slower than most mark-sweep collectors, because of
all the extra accounting it has to do.

mwh guesses that reference counting is better performing than stop and
copy. Actually, stop and copy tends to be one of the best performing
approaches. Allocation is very cheap, in the common case just
advancing a free pointer. On collection, only active objects have any
cost. In most systems, the majority of objects have very short
lifetimes. For these, the cost of freeing is essentially zero.
Finally, stop and copy often compacts objects, increasing spatial
locality, so you may need to pull fewer cache lines from main memory
to access a structure. But note that Hans Boehm doesn't agree with
me. See his slides
on GC myths, for example, and this comparison of
asymptotic complexity.
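
To illustrate the cheap-allocation point, here's a toy Python sketch of
the bump-pointer idea; a real collector does this in a handful of
machine instructions, and the copying step is elided entirely.

class SemiSpace:
    def __init__(self, size):
        self.heap = [None] * size   # stand-in for a real block of memory
        self.free = 0               # next free slot in the active semispace

    def alloc(self, nslots):
        # the common case: a bounds check and a pointer bump
        if self.free + nslots > len(self.heap):
            self.collect()
        addr = self.free
        self.free += nslots
        return addr

    def collect(self):
        # a real collector would copy the live objects into the other
        # semispace and reset self.free; dead objects are never touched,
        # which is why short-lived garbage is essentially free
        raise MemoryError("collection elided in this sketch")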

mbp: Java's example is not particularly helpful for what
I'm trying to do. My goal is to have a library that works well when
bound to whatever runtime the client wants. The Java philosophy, by
contrast, is to nail down a runtime and force all components of a
system to conform to it. In many cases, that's actually a good thing.
It minimizes the impedance mismatches between modules. Further, Java's
runtime is actually a pretty good one. A huge amount of effort has
obviously gone into tuning its performance, and I'm impressed that
they've gotten garbage collection in highly multithreaded environments
to work so well. Java's runtime is undoubtedly better than most
hand-rolled stuff.

I'm not convinced by your claim that Java is agile with respect to
garbage collection discipline. I consider the ability to count, say,
only forward references in a doubly linked list to be a minimum
requirement for any decent reference counted discipline. Where is that
in Java? I know that embedded Java folks are having success with
coding patterns that allocate very little memory, but at the cost of a
more constrained use of the language. In particular, if you use a
library written for general Java in such an environment, it won't work
well. Conversely, general Java users probably won't much like
libraries designed for embedded stuff, because they're used to a
dynamic style.
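
For what it's worth, here's roughly what I mean by counting only
forward references, sketched with Python's weakref module; the back
links don't contribute to any reference count.

import weakref

class Node:
    def __init__(self, value):
        self.value = value
        self.next = None          # strong (counted) forward reference
        self._prev = None         # weak back reference, not counted

    def append(self, node):
        node._prev = weakref.ref(self)
        self.next = node
        return node

    def prev(self):
        if self._prev is None:
            return None
        return self._prev()       # may return None if the node is gone

With this discipline, a whole list dies as soon as the last strong
reference to its head goes away, even though every interior node is
pointed at from both sides.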

Other stuff

I have things to say about proofs vs testing, debuggability of
runtimes, and media coverage of Hatfill, but they'll all have to wait
for another day.

A Ghostscript 7.30 release is pre-announced.
This one contains the notorious DeviceN merge, but isn't the most
stable release ever. If you're interested in DeviceN, you'll want
it.

I'll be at Seybold SF over the next three days. I don't feel entirely
prepared for it, but it's always a lot of fun to meet customers and
users, and to see what else is going on in the industry. It'll be a
bit strange being there on the 11th. This anniversary touches a pretty
deep chord for probably all Americans. I was a bit surprised to have a
dream about it a few days ago. May everyone have a peaceful Wednesday.

Testing and proofs

In a recent phone conversation, Peter Deutsch quoted Dijkstra's famous
saying, "testing can only show the presence of bugs, never their
absence." Dijkstra was arguing for a more systematic approach to
programming, in particular proofs of correctness. I agree, but here we
are in 2002 with extremely poor tools for writing and deploying
provably-correct programs. Thus, my response is: "without testing, you
don't even have a way to detect the presence of bugs."

I'm now more convinced than ever that thorough testing is the most
practical route to high quality software. We're doing a lot of
regression testing on Ghostscript now, and increasing it still
further, but these tests are still not what I'd call thorough.

I think that one of the best things from the Extreme Programming
movement is the idea that testing is baked into the development
process. I plan to follow this deeply for Fitz development. Among
other things, I want to test for memory leaks (relatively easy if you
just count mallocs and frees), UMRs (uninitialized memory reads) and
other runtime flaws (which
means running tests under Valgrind), memory usage, speed, and of
course consistent rendering. The latter is, I think, especially
important when doing hardware accelerated rendering, such as with
XRender. Testing that the results are consistent with unaccelerated
rendering is quite straightforward.
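
As a sketch of the consistency check I have in mind (the two renderer
arguments are hypothetical callables, software path and XRender path,
returning raw pixel buffers for a page):

def assert_consistent(render_soft, render_hard, document, page):
    soft = render_soft(document, page)
    hard = render_hard(document, page)
    if soft != hard:
        # a fuzzier, per-pixel-tolerance comparison may be needed if the
        # hardware path rounds differently
        diff = sum(1 for a, b in zip(soft, hard) if a != b)
        raise AssertionError("page %d: %d bytes differ" % (page, diff))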

Ultimately, I would like to prove Fitz correct. In principle,
this shouldn't be all that hard to do. John Harrison's thesis shows how to do serious work with reals in the HOL
framework. It's not hard to imagine continuing this work into
computational geometry concepts such as line segments, polygons, and
Bezier curves.

Even in the context of proving programs correct with respect to a
formal specification, I think intensive testing will usually be
required, if for no other reason than to validate the specification. I
find flate (zlib) compression a useful example. One aspect of the spec
is trivial: decompression composed with compression should be the
identity function over streams. Yet, it's also important to show that
the compression is compatible with existing implementations. I'd have
almost no confidence in this just from eyeballing the formal spec. The
only real way to do it is to test it. And, as I say, I consider flate
compression to be one of the problems most amenable to a formal
approach. It's impossible to imagine doing something like a
Word-to-PDF converter without intensive testing.
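
The round-trip half of the spec really is trivial to test mechanically,
for instance with Python's zlib binding; compatibility with existing
implementations is the part that needs real-world data.

import zlib

def check_roundtrip(data):
    # decompression composed with compression is the identity on streams
    assert zlib.decompress(zlib.compress(data)) == data

check_roundtrip("hello, hello, hello, world" * 100)
check_roundtrip("".join(map(chr, range(256))) * 40)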

Conscious runtime design

Chris Ryland sent me an email pointing out that Python's runtime was
consciously designed to meet certain goals, including runtime
compatibility with C. Indeed, Python is one of my main inspirations.

I think it's possible to argue that Python got one thing wrong, which
is its choice of memory freeing discipline. Python 1 is reference
counted only, and Python 2 adds garbage collection. The result is more
complex, and probably slower, than one discipline or the other alone.
Of course, for something like Python, reference counting only has the
serious disadvantage of leaking memory whenever programs create
circular data structures.

Memory freeing discipline is one of the most difficult problems for
runtime design. There is no one approach which is obviously best. In
particular, reference counting is definitely imperfect. Aside from the
problems with circular data structures, it's often not the fastest
approach. Garbage collection is often faster, largely because you
don't have the overhead of constantly bumping the reference count.

But reference counting has one killer advantage: it's easy to
integrate into other runtimes. In particular, a refcounted runtime can
be slave to a garbage collected runtime, knitted together through
finalization calls. By contrast, knitting together two garbage
collectors is quite hard. And, in doing a graphics rendering
library rather than a full programming language, I can simply avoid
creating circular structures by design. Thus, I've arrived at
reference counting as the best choice.
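
To make "knitted together through finalization calls" concrete, here's
a tiny Python sketch with a hypothetical wrapper class; the drop
callable stands in for whatever release function the library exposes,
and the host runtime's collector decides when the finalizer runs.

class PathWrapper:
    """Hypothetical binding object owning one reference to a library object."""
    def __init__(self, path, drop):
        self._path = path      # the refcounted library object
        self._drop = drop      # e.g. the library's release function

    def __del__(self):
        # the host runtime (refcounted or garbage collected) calls this
        # when the wrapper dies; we hand our reference back to the library
        self._drop(self._path)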

I can think of one alternative, which is appealing but not practical
given today's technology. Imagine a language that is designed
explicitly to make runtime-compatible libraries. The compiler will
then generate different code for the same source, depending on the
target runtime. Thus, the generated code for Python 2 would have both
reference counting and garbage collection tracing methods. The
generated code for Ruby would have just garbage collection. Another
target might assume the Boehm-Demers-Weiser collector, and yet another might
just have reference counting. Maybe an Apache module target does
allocations in pools. Obviously, writing source that can be translated
into high quality code for all these runtimes would take some care,
but I don't see it as impossible.
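
Here's a hand-wavy Python sketch of what I mean; every name in it is
made up for illustration, and a real tool would need far more than
allocation stubs.

CPYTHON = {
    'alloc':   "obj = PyObject_GC_New(fz_path, &fz_path_type);",
    'release': "Py_DECREF(obj);",
}
REFCOUNT_ONLY = {
    'alloc':   "obj = fz_malloc(sizeof(fz_path)); obj->refs = 1;",
    'release': "if (--obj->refs == 0) fz_free(obj);",
}
BOEHM = {
    'alloc':   "obj = GC_MALLOC(sizeof(fz_path));",
    'release': "/* nothing: the collector finds the garbage itself */",
}

def emit_constructor(target):
    # the same "source" description, emitted for a particular runtime
    return ("fz_path *fz_newpath(void)\n"
            "{\n"
            "    fz_path *obj;\n"
            "    %s\n"
            "    return obj;\n"
            "}\n" % target['alloc'])

print(emit_constructor(REFCOUNT_ONLY))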

I think such an approach has lots of advantages, but for obvious
reasons I don't want to put the design and implementation of such a
language on the critical path for Fitz.

Psyco

Earlier, I wrote expressing skepticism that Psyco would achieve its
goals. Happily, it looks like I'm being proved
wrong.

In fact, if it does work (and it's beginning to look like it will), the
Psyco approach has some definite advantages. Primarily, you don't need
to complicate the language to add static annotations. Rather, if
something is always the same type (or the same value, or whatever),
it's detected dynamically.
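
If I remember the interface right, using Psyco is about this
unobtrusive (psyco.bind and psyco.full are the calls I recall; no
changes to the source are needed):

import psyco

def inner_loop(n):
    total = 0
    for i in range(n):
        total += i * i
    return total

psyco.bind(inner_loop)   # specialize just this function
# or: psyco.full()       # specialize everything, at some cost in memory

print(inner_loop(1000000))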

The fact that people are seriously exploring different approaches to
performance in very high level languages is marvelous.

I got a nice email from Michael Norrish, suggesting that maybe HOL is
not as difficult as I suspected. The core of HOL
Light is only about 1000 lines of ML. That's an appealing level of
simplicity. Of course, ML is particularly well
suited for doing HOL-style logic, but that hardly counts as a
strike against it!

Without meaning any disrespect for other projects such as Mizar, QED,
Mathweb, NuPrl, etc., for the stuff I'm interested in, HOL and
Metamath are definitely the two systems of interest. They're both
simple, they're both founded on mature work in mathematical logic, and
there's enough mathematics done in both to make comparisons useful.

Unfortunately, the two systems are different enough to pose major
compatibility problems. At heart, functions are different beasts. In
ZF(C) set theory, they're essentially untyped, and allowed to be
partial. They can be hugely infinite, but can't take themselves as
arguments. Thus, even such a simple and familiar expression as
lambda x.x can't be directly expressed as a ZF set.

HOL functions, on the other hand, are typed, but total within that
type. Thus, the function lambda x. 1/x is problematic. Harrison's
construction of the reals in HOL arbitrarily takes 1/0 = 0. The trick
of polymorphic types lets you apply a function to itself, with just
enough limitations to avoid the paradoxes. So lambda x.x is back in
the club.

I get the feeling that there's been work on connecting the two flavors
of formal system, but it seems technical and difficult to me. This is
disappointing.

In any case, I feel this interest in logic burning itself out for
now. I'm sure I'll come back to it later; it's been an interest of
mine for a long time. I still feel that a Web-centered distributed
proof repository could be a very interesting thing, and that the
tools I'm comfortable with could be applied well. But what I do not
need right now is another big project.

Ghostscript

The big DeviceN merge happened, and it's not too bad. There are
still some regressions, but they're settling down. A big chunk of my
recent work has been putting together a framework to test whether
6-color DeviceN rendering is consistent with plain CMYK. I'm becoming
an even firmer believer in testing, and I think that graphics
rendering systems are particularly suitable for automated testing.

Fitz

I'm craving a chunk of time to sit down and write Fitz design ideas,
but with the Ghostscript 8.0 release process taking all my time, it's
not been easy. Even so, some aspects of the design seem to be coming
into relatively clear focus, even as details on things like font and
text rendering seem elusive. I'm particularly excited about the design
of the runtime discipline. In particular, it should be a lot
easier to understand and integrate with other systems than the current
Ghostscript runtime discipline. I don't see a lot of work on conscious
design for runtimes; more commonly, designs inherit runtimes from
somewhere else (language, library, predecessor). As such, I'm hoping
that the runtime document will make interesting reading even for
people not in the market for a graphics rendering library.

jeremyw is correct, of course. Simply saying "permission to
quote is granted" or something similar is almost as concise, and far
clearer to most people.

Metamath

The second big problem with Metamath is the fact that it nails down
specific constructions. It's important to note, however, that this is
a characteristic of the set.mm database, not the Metamath framework
itself.

The solution to this problem, I think, is to use "theory
interpretation." Instead of saying, for example, "0 is the empty set,
and x + 1 is x U {x}", you say, "for all sets N and functions +, -, *,
/, whatever satisfying the usual axioms for arithmetic." Then, if you
ever need to show the existence of such a thing, for example to
use arithmetic as a temporary while proving something else, you can
use the standard construction, or any construction for that matter.

The question is how much more awkward this will make proofs.
Metamath's choice of a definite construction is good for overall
simplicity. I guess there's really only one way to find out.

It is considered good etiquette not to quote email without permission.
However, these days, emails are often part of a broader discussion
spanning blogs, web fora, and so on. It's increasingly easy to run
afoul of this etiquette rule. Conversely, it seems awkward to say
something like "permission granted to quote" when I respond to, say, a
blog entry in an email.

Thus, I propose the two-character string "+ as a shorthand indicating
that permission to quote, with attribution, is granted. Permission is
also granted to integrate the email into any copylefted documentation
(most copylefts do not require attribution).

Analogously, "- is a request that the email not be quoted, perhaps
because it contains confidential or sensitive information. Lastly,
when someone is known to be familiar with the convention, "? asks
whether quoting is permitted.

Bram pointed me to this thesis
on implementing real numbers within HOL. I heartily recommend this
thesis to people following this thread (if any). It's very interesting
to compare to Metamath's construction of the reals. Unfortunately,
these constructions are not compatible. One significant difference is
that Metamath seems to support partial functions, so 1/0 is not in the
range of the divide function, while HOL wants to have total functions
within the type, so 1/0 must have some value (Harrison chooses 0
to simplify the details). As such, proofs from one probably can't be
easily ported to the other without serious handwork.

I feel I understand Metamath
reasonably well now. It has some issues, but its overwhelming
strength is that it's simple. For example, I believe that a
fully functional proof verifier could be done in about 300 lines of
Python. I wonder how many lines of Python a corresponding verifier for
HOL would be; I'd guess around an order of magnitude larger. That kind
of difference has profound implications. Norm Megill is certainly to
be commended for the "simplicity engineering" he's put into Metamath.
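
To give a flavor of why the verifier can be so small, here is a toy
Python sketch of the stack discipline at its core. It ignores
compressed proofs, disjoint-variable conditions, and scoping, and the
data layout is my own invention rather than Metamath's file format.

def substitute(expr, subst):
    # expr is a tuple of tokens; variables get replaced by token lists
    out = []
    for tok in expr:
        out.extend(subst.get(tok, [tok]))
    return tuple(out)

def check_proof(assertion, proof, statements):
    # statements maps a label to ('hyp', expr) for a hypothesis, or to
    # ('ax', floatings, essentials, conclusion) for an axiom/theorem,
    # where floatings is a list of (typecode, variable) pairs
    stack = []
    for label in proof:
        stmt = statements[label]
        if stmt[0] == 'hyp':
            stack.append(stmt[1])
            continue
        _, floatings, essentials, conclusion = stmt
        n = len(floatings) + len(essentials)
        assert len(stack) >= n
        args, stack = stack[len(stack) - n:], stack[:len(stack) - n]
        subst = {}
        for (typecode, var), actual in zip(floatings, args[:len(floatings)]):
            assert actual[0] == typecode      # e.g. 'set', 'class', 'wff'
            subst[var] = list(actual[1:])
        for hyp, actual in zip(essentials, args[len(floatings):]):
            assert substitute(hyp, subst) == actual
        stack.append(substitute(conclusion, subst))
    assert stack == [assertion]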

For the purpose of doing Web-distributed proofs, Metamath has
a few shortcomings. I think they can be fixed, especially given the
underlying simplicity. I'll talk about these problems and possible
fixes over the next few days.

Definitions in Metamath have two closely related problems. Definitions
are introduced exactly the same way as axioms. As such, it's far from
obvious when a definition is "safe". For example, you could add
definitions for the untyped lambda calculus, which would introduce the
Russell set paradox. The second problem is that there is a single
namespace for newly defined constants. You wouldn't be able to combine
proofs from two different sources if they defined the same constant
two different ways.

Here's my proposal to fix these problems. Choose a highly restricted
subclass of definitions that is clearly safe. For example, you could
say that any definition of the form "newconst x y z = A" or "newconst
x y z <-> phi", with newconst not appearing in A or phi, is
acceptable. I propose to introduce new syntax that clearly identifies
such definitions. You could use existing syntax, so that such
definitions become axioms, but can be checked easily, or you could
have other syntax that sets the new constant apart from its "macro
expansion". That's a style preference.

Now let's talk about namespaces. I have a strong preference for using
hashes as global names, because (assuming the hash function is
strong), you don't get collisions. As such, it should be possible to
mix together arbitrary proofs without danger. Here's an outline
proposal.

Take the definition axiom, and replace the newly defined constant with
some token, say $_. Hash the result. That is the "global name". When
you're developing proofs, you'll probably want a (more or less)
human-readable "pet name", but this is actually irrelevant for
verification. Here's an example in Metamath notation.

$( Designate x as a set variable for use within the null set definition. $)
$v x $.
$f set x $.

$( Define the empty set. $)
dfnul2 $a |- (/) = { x | -. x = x } $.

So here's what gets hashed:

$a class $_ $. $f set x $. $a |- $_ = { x | -. x = x } $.

Take the SHA-1 hash of this string. Then I propose that
#274b1294a7d734a6e3badbf094190f46166159e4 can be used (as both a label
and a constant, as these namespaces are independent) whenever the
empty set is needed. A proof file would of course bind this string to
a shorter name, such as (/). When importing a proof file from another,
the binding would be local to the file. (Currently, Metamath has only
a file include facility similar to C's preprocessor #include, but an
import facility with better namespace management would be quite a
straightforward addition, especially considering that Metamath already
has ${ $} scoping syntax).
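
In code, the hashing step itself is a one-liner, something like this
with the sha module in the Python standard library (I'm not claiming
this exact string reproduces the digest above, since the
canonicalization isn't nailed down):

import sha

definition = "$a class $_ $. $f set x $. $a |- $_ = { x | -. x = x } $."
print(sha.new(definition).hexdigest())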

Obviously, there are some details to be worked out, particularly
nailing down exactly what gets hashed, but I think the idea is
sound.

Schooling

Alan's Mindstorms arrived a couple of days ago. These promise to be
quite fun (and of course educational :). So far, he's settling into
first grade very easily. We begin the half-homeschooling starting on
Monday.

Even so, I get the sense that Max is going to be the one most into
computers. He's learning the OS X interface impressively well. Last
time we woke the computer from sleep, a folder was highlighted, and he said,
"it's clicked." Then, when I ran Stuffit to unpack a game demo, he
noted the icon and said, "it's squishing it." He's also the one that
said, "I love Alan's 'puter".

I have great hopes for word-based (or "Bayesian") scoring. Paul's
arguments for why it will work seem convincing to me. In particular,
keeping per-user word lists should help enormously with two big
problems: the ability for spammers to "optimize" around a common
scoring scheme; and differing opinions about what constitutes spam.
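
Here's a rough Python sketch of the kind of per-user word scoring I
have in mind, loosely following Paul's description; the smoothing and
thresholds are simplified and would surely need tuning.

import re

def tokens(text):
    return re.findall(r"[a-z$'!-]+", text.lower())

def train(spam_msgs, good_msgs):
    spam, good = {}, {}
    for msg in spam_msgs:
        for t in tokens(msg):
            spam[t] = spam.get(t, 0) + 1
    for msg in good_msgs:
        for t in tokens(msg):
            good[t] = good.get(t, 0) + 1
    probs = {}
    for t in set(list(spam) + list(good)):
        s = spam.get(t, 0) / float(max(1, len(spam_msgs)))
        g = good.get(t, 0) / float(max(1, len(good_msgs)))
        probs[t] = min(0.99, max(0.01, s / (s + g)))
    return probs

def score(msg, probs):
    # combine the 15 most "interesting" word probabilities
    ps = [probs[t] for t in set(tokens(msg)) if t in probs]
    ps.sort(key=lambda p: abs(p - 0.5))
    ps = ps[-15:] or [0.5]
    prod, inv = 1.0, 1.0
    for p in ps:
        prod *= p
        inv *= 1.0 - p
    return prod / (prod + inv)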

I still think there may be a role for trust, but it's also possible
that scoring by itself will work so well that adding trust isn't
really necessary. In any case, I'll do my best to write up my ideas
for using trust to beat spam.

Trust

Bram and I had a great discussion about anti-certs. He is
exploring the space of adding anti-certs to the trust computation, but
is finding it complex. Many of the proposals would seem to require
vastly greater computation than the simple eigenvector and network
flow models I've proposed.

The leading alternative to that seems to be a way to use anti-certs as
input to a process which removes positive certs from the graph. If you
disagree with the ratings of user X, it might be interesting to
analyze the influence of X on ratings transmitted to you through the
graph, then remove your local edges which carry the most influence
from X. In general, this boils down to optimizing edge weights.
Lastly, anti-certs don't have to be about individual users (nodes in the
graph). They can be about ratings you disagree with. You don't really
have to know where the bogus ratings come from, as long as you know
how to tune your local edges to minimize them.
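
Here's a toy Python model of that edge-influence idea; it's my own
simplification, not the Advogato algorithm, and a real version would
need to worry about convergence and scale.

def propagate(graph, me, steps=20, damping=0.85):
    # graph maps a node to the list of nodes it certifies; start all
    # mass at "me" and repeatedly spread a damped share along cert edges
    mass = {me: 1.0}
    for _ in range(steps):
        new = {me: 1.0}                 # keep injecting mass at the root
        for node in mass:
            outs = graph.get(node, [])
            for out in outs:
                new[out] = new.get(out, 0.0) + damping * mass[node] / len(outs)
        mass = new
    return mass

def edge_influence(graph, me, x):
    # how much of X's received mass disappears if I drop each local edge
    base = propagate(graph, me).get(x, 0.0)
    influence = {}
    for out in graph.get(me, []):
        pruned = dict(graph)
        pruned[me] = [o for o in graph[me] if o != out]
        influence[out] = base - propagate(pruned, me).get(x, 0.0)
    return influence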

As always, a big part of the challenge is presenting a sensible UI.
I've made the Advogato UI for certs as simple as I know how, and the
user community is supposed to be sophisticated, yet it seems
that many people can't manage to do certs and ratings the way they're
supposed to be done. Bram's anti-cert UI is straightforward to
implement. In addition to "Master, Journeyer, and Apprentice", you'd
just add one or more negative categories.

I have more insight into the lambda definition from yesterday. I believe
that the definition of (lambda x x) is correct - it should be
equivalent to the identity
relation defined in Metamath. However, it is quite definitely a
class and not a set. As a result, trying to use it as a function
argument just yields the empty set.

This is probably a reasonable way out of the logic thicket. The
problem is that the untyped lambda calculus is subject to the Russell set
paradox.

I was talking about this with Bram, and he called ZF set
theory a "hack." I more or less agree, but playing with it in the
context of Metamath has led me to appreciate how powerful a hack it
is. With a small number of relatively simple axioms, it gets you a
rich set of infinities, but avoids the particular ones that bring the
whole formal system crumbling down. You get integers, reals, tuples,
sequences (finite and infinite), transfinites, and functions from set
to set. You don't get untyped lambda calculus. Overall, it's probably
a good tradeoff.

I believe that the formulation of lambda I was wondering about yesterday is:

$d y A $.
df-lambda $a |- ( lambda x A ) = { <. x , y >. | y = A } $.

This lets me prove such things as ( ( lambda x x ) ` y ) = y. I'm
probably not at the full power of the lambda calculus, though, because
here "y" is required to be a set, while a lambda term is a class. So
it's not clear I can apply this lambda term to itself.

School

Tomorrow is Alan's first day of 1st grade. It promises to be exciting.

I've been chewing over Metamath in my head
some more. It's very appealing in lots of ways, especially its
simplicity, but a few things need to be done differently, I think, to
make it work as a distributed, web-based proof repository.

The biggest problem is namespaces. On the web, everybody has to be
able to define their own local namespace, while also being able to
refer to other things in the world. Metamath, by contrast, uses short,
cryptic, and impermanent names for its theorems. On that last point,
set.mm includes this note: "While this file is complete and correct,
it may undergo revisions from time to time (including theorem name
changes, which means any new theorems you add may not always remain
compatible)." That won't do.

Actually, I'm pretty convinced by now that the best name for a theorem
is the theorem itself. If the size of the name is a concern, a hash
can be substituted. However, an advantage of using the whole theorem
is that, if a correct proof cannot be found on the web, someone can
probably prove it by hand.

The namespace of newly defined keywords is also problematic. In fact,
Metamath does not automatically check whether definitions are
ambiguous. Having a definition collision could lead to contradiction,
which would be bad. For definitions that simply expand as macros, I
suppose you could use the expansion (or a hash thereof) as the
permanent name. However, not all definitions in Metamath have this
form.

The second big problem, as I've mentioned earlier, is whether to keep
theorems abstract, or pin them down to a single model. Metamath does
the latter, which helps keep theorems simple, but it also makes them
nonportable to other models. I think it's possible to do
abstract theorems in Metamath (you quantify over all possible models
that satisfy the axioms), but probably notationally very cumbersome.
Even so, it's similar to the way a lot of mathematics is done: you see
a lot of theorems over, say, fields, without them pinning down
which field is meant.

I'm still not sure whether formulae should be represented as strings
of symbols (subject to implicit parsing through wff rules), or as
trees. My gut feeling is the latter, because it sidesteps the issue of
whether the parsing grammar is ambiguous. The former is a slightly
simpler representation, though, and also closer to human-readable
mathematics.
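
For concreteness, the same formula in the two representations might
look something like this (nested tuples standing in for whatever tree
encoding is actually chosen):

as_string = "( ph -> ( ps -> ph ) )"          # parsed via the wff grammar
as_tree = ('->', 'ph', ('->', 'ps', 'ph'))    # structure is explicit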

There are a few stylistic choices which are unusual, but probably
reasonable. For one, the only primitive inference rules are the Modus
Ponens and generalization. Equality and substitution are handled
through theorems just like everything else. For another, instead of
classifying variables as "free" or "bound", Metamath just places
distinctness constraints on variables. A lot of the primitive
operations such as substitution
are cleverly defined to place no more distinctness constraints than
necessary.

By the way, does anyone have a formal spec for lambda? Here's what I
have, but I don't know if it's right: