David A. Wheeler's Blog

Fri, 07 Feb 2014

Here I want to honor the memory of William W. (“Bill”) McCune,
who helped change the world for the better
by releasing software source code.
I hope that many other researchers and government
policy-makers will follow his lead… and below I intend to show why.

But first, I should explain my connection to him.
My PhD dissertation
involved countering the so-called “trusting trust” attack.
In this attack, an attacker subverts the
tools that developers use to create software.
This turns out to be a really nasty attack.
If a software developer’s tools are subverted,
then the attacker actually controls the computer system running the software.
This is no idle concern, either; we know that computers are under
constant attack, and that some of these attacks are very sophisticated.
Such subversions could allow attackers to essentially control all computers
worldwide, including the global financial system, militaries,
electrical systems, dams, you name it.
That kind of power makes this kind of attack potentially worthwhile, but
only if it cannot be detected and countered.
For many years there were no good detection mechanisms or countermeasures.
Then Henry Spencer suggested a potential solution… but
there was no agreement that his idea would really counter attackers.
That matters; how can you be absolutely certain
about some claim?

The “gold standard” for knowing if
something is true is a formal mathematical proof.
Many important questions cannot be proved this way, all proofs depend
on assumptions, and creating a formal proof is often hard.
Still, a formal mathematical proof is the best guarantee we have for
being certain about something.
And there were a lot of questions about whether or not Henry Spencer’s
approach would really counter this attack.
So, I went about trying to prove that Henry Spencer’s idea
really would counter the attack (if certain assumptions held).

After trying several other approaches,
I found that the tools developed by
Bill McCune (in particular prover9, mace4, and ivy)
were perfect for my needs.
These tools made my difficult work far easier, because his tools managed to
mostly-automatically prove claims mathematically once they were described
using mathematical statements.
In the end, I managed to mathematically prove that Henry Spencer’s
approach really did counter the subverted compiler problem.
The tools Bill McCune developed and released made a real difference
in helping to solve this challenging real-world problem.
I didn’t need much help (because his tools were remarkably easy to use
and well documented), but he also responded quickly when I emailed him.

Bill McCune released many tools as open source software
(including prover9, mace4, ivy, and the older tool Otter).
This means that anyone could use the software (for any purpose),
modify it, and distribute it (with or without modification).
These freedoms had far-reaching effects, accelerating research in
automated proving of claims, as well as speeding the use of these techniques.
The preface of a book of essays published in his memory notes several of Bill McCune’s
accomplishments, including the impact he had by releasing the code:

Bill McCune “deeply understood the … research developed elsewhere,
and united it with the best results of [his organization’s] tradition
[to create] a new theorem prover named Otter…
The release of Otter at CADE-9 in 1988 was a turning point in the history of
automated reasoning. Never before had the computer science community seen a
theorem prover of such awesome power…”

“perhaps Otter’s greatest impact was due to Bill’s generous and
far-looking decision to make its source code publicly available.
It is impossible to describe completely a reasoning program in research
papers. There is always some amount of knowledge, often a surprising
amount, that is written only in the code, and therefore remains hidden, if
the code is not public or is too hard to read. Bill’s code was admirably
readable and well organized. Other researchers, including those whose
systems eventually overtook Otter in speed or in variety of inference
rules, also learnt from Bill’s code data structures, algorithms, and
indexing schemes, which are fundamental for implementing theorem provers…
Prover9 and Mace4 inherited
all the great qualities of their predecessors Otter and Mace2, as witnessed by
the fact that they are still very much in use today”.

Mark E. Stickel, another developer of automated reasoning systems,
noted that,
“Bill and I were both system builders
who learned from each other’s systems.
I often consult Otter or Prover9 code to see
how Bill did things, and Bill looked
at my implementation of the DPLL procedure when developing [his] ANL-DP
and my implementation of AC-unification when developing [his] EQP.”

All too often the U.S. government spends a fortune on research, and
then that same research has to be recreated from scratch several times
by other researchers (sometimes unsuccessfully).
This is a tremendous waste of government money, and can delay work by
years (if the work can happen at all), resulting in
far less progress for the money spent.
Bill McCune instead ensured that his results got out to people
who could use and improve upon them.
In this specific area Bill McCune made software
research available to many others, so that those others
could use it, verify it, and build on top of those results.

Of course, he was not alone in recognizing the value of sharing research
when implemented as software.
The paper
“The Evolution from LIMMAT to NANOSAT” by Armin Biere (April 2004)
makes the same point in describing an attempt to reproduce others’ work.
That paper states,
“From the publications alone, without access to the source code, various
details were still unclear… what we did not realize, and which hardly
could be deduced from the literature, was [an optimization] employed
in GRASP and CHAFF [was critically important]… Only [when CHAFF’s
source code became available did] our unfortunate design decision became
clear… The lesson learned is, that important details are often omitted
in publications and can only be extracted from source code. It can be
argued, that making source code … available is as important
to the advancement of the field as publication.”

More generally,
Free the Code.org
argues that if government pays to develop software, then it should be
available to others for reuse and sharing.
That makes sense to me; if “we the people” paid to develop software,
then by default “we the people” should receive it.
I think it especially makes sense in science and research;
without the details of how software works, results are not reproducible.
Currently much of science is not reproducible (and thus not really science),
though open science efforts
are working to change this.

I think Bill McCune made great contributions to many, many others.
I am certainly one of the beneficiaries.
Thank you, Bill McCune, so very much for your life’s work.