Stephen Wolfram: Strong Opinions

Michael Swaine, Dr. Dobb's Journal 18 (February 1993), 105-108.
This month we continue with the conversation Ray Valdés and I had with Stephen
Wolfram. Last month Wolfram talked about Mathematica and programming paradigms.
This month he talks about science, programming, business, and why (some) mathematicians
don't like him. It's a fact that not everyone likes Stephen Wolfram. He can be exceedingly
impatient—he's never had the patience to complete any sort of formal academic degree
program, for example, although he was granted a well-deserved doctorate by Caltech.
And he can come across as more than a little arrogant. Whether that's an accurate
impression, though, is open to question. Perhaps he simply knows how good his mind is
and sees no point in false modesty. But when, in the conversation that follows, he claims
to know "a lot of the leading scientists in a lot of different areas," don't think for
a moment that he's exaggerating.

DDJ: You're a scientist and a mathematician, which makes you a target user of Mathematica, and you say you use it every day. But we've talked to various mathematician friends and they express mistrust for most mathematical programs, including Mathematica.

SW: I spend a part of my time doing science and I know a lot of the leading scientists in a lot of different areas. And in mathematics, for example, I think it's fair to say that the people I know who are the best mathematicians in their various areas use Mathematica, and they use it in many cases extremely enthusiastically. Now there are two sociological phenomena that go on in mathematics. One of them is this idea that computers are anathema to mathematics. This is a mistake. It's a fundamental intellectual mistake. You don't have to take that from me; just look at what's going on in the field, look at the fact that the best mathematicians use them.

DDJ: And the other phenomenon?

SW: There is this very weird thing about mathematics that's different from every other science. The way you make progress in mathematics is that you think of a theorem and generate a proof for it. It's a purely theoretical activity that proceeds in this theorem-proof mode. In every other field of science, experiment is the thing that originally drives what goes on. People don't make models and theories and work out their consequences. But one of the things that Mathematica is making possible in mathematics—and this is one of the things that good mathematicians who use it often say about it—is that you can actually do experiments. In a reasonably short amount of time you can do a reasonably nontrivial experiment and find out what's likely to be true before you go through the traditional mathematical approach.

DDJ: It's a highly respected approach.

SW: Which is, guess what's true and then try to prove it. From the point of view of a sort of mental exercise, guess what's true and try to prove it is way out there; it's one of the hardest things people can imagine doing. On the other hand, that's not the way you're likely to make the most progress, by doing the hardest thing people can imagine doing. You're likely to make more progress by making things easier for yourself. If you look at every other field of science, physics, anything else, there are a lot more experimentalists than theoreticians.
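
As a small illustration of the experiment-first approach described above, here is a minimal Python sketch (our own example, not from the interview, with Python standing in for Mathematica). It tests the guess that Euler's polynomial n^2 + n + 41 always produces a prime. The helper names `is_prime` and `first_counterexample` are hypothetical, chosen only for this sketch.

```python
def is_prime(n: int) -> bool:
    """Trial division; adequate for the small values tested here."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def first_counterexample(limit: int):
    """Return the smallest n in [0, limit) for which n*n + n + 41 is
    composite, or None if the guess survives the whole range."""
    for n in range(limit):
        if not is_prime(n * n + n + 41):
            return n
    return None


if __name__ == "__main__":
    # The guess holds for n = 0..39 and fails at n = 40 (1681 = 41 * 41).
    print(first_counterexample(100))
```

A few seconds of experiment shows the guess is false before anyone spends effort trying to prove it, which is the kind of shortcut the passage above has in mind.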

DDJ: Then mathematicians have to become more like the experimentalists in being aware of the limitations of their instruments for experiment. There was an article in The Mathematica Journal by a mathematician at Berkeley who was using Mathematica to point out flaws in some other paper. The [original paper's] conclusion was right but the process was wrong because of a floating-point error, which Mathematica didn't have. Or maybe it had a different floating-point error....

SW: No.

DDJ: OK, but they need to be aware of their instruments, it seems.

SW: That's true, but most of the proofs that are published are
probably wrong. In detail, I mean. Checking a mathematical proof is at
least as hard as debugging a computer program perfectly. The only difference
is that, with a computer program, you can run it, so you can actually
see what's happening, whereas a proof just sits there in a journal and
generations of graduate students try to [understand] it. One thing is
sure: People have to learn to use these tools well. If they're all writing
Fortran-like programs in Mathematica, that's not going to get them that
far. In terms of understanding the limitations of these tools, yes, they
have to have some concept of what's going on. Being able to predict, if
I do a calculation of such-and-such a size, will I be likely to run into
a memory chip that blows up or a bug in the program: that's hard for
anybody to assess, really. And it's certainly true in doing experiments
(I know it is when I do experiments) that the most likely form of error is
human error. That your program has a bug in it is much more likely than
that Mathematica has a bug in it, which is again more likely than that
the CPU that you're using has a bug in its logic.

DDJ: Which does happen.

SW: In testing out Mathematica, we found things like that. If
you look at the history of computer programming, one of the extremely
unexpected things that happened was that there were bugs in programs.
This was not anticipated. In Turing's original paper on the universal
Turing machine, the program was riddled with bugs. From the very earliest
computer program, programs had bugs. That was not something that people
expected. And it's an important conceptual thing for people to realize,
that human fallibility is at its most obvious in writing computer programs.
The mathematician who comes up to Mathematica and types something in and
gets something they don't expect and says, "That must be the right answer,
I'm going to write it in my paper," is a foolish person indeed. Because
the chances are, if they didn't expect it and can't understand it and
can't explain it, then probably there was a bug in the program they wrote.
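
The floating-point issue raised earlier in the conversation is one common source of such unexpected answers. The following Python sketch is a generic illustration only; it is not the specific error discussed in the Mathematica Journal article, and the function names are hypothetical. It adds one tenth repeatedly, once in binary floating point and once in exact rational arithmetic.

```python
from fractions import Fraction


def sum_tenths_float(count: int) -> float:
    """Add 0.1 to itself `count` times in binary floating point."""
    total = 0.0
    for _ in range(count):
        total += 0.1  # 0.1 has no exact binary representation
    return total


def sum_tenths_exact(count: int) -> Fraction:
    """Add 1/10 to itself `count` times in exact rational arithmetic."""
    total = Fraction(0)
    for _ in range(count):
        total += Fraction(1, 10)
    return total


if __name__ == "__main__":
    print(sum_tenths_float(10) == 1.0)  # False: rounding error accumulates
    print(sum_tenths_exact(10) == 1)    # True: no rounding occurs
```

Neither result is a bug in the language or the hardware. The surprise comes from the user's assumption about the instrument, which is the point made above about where errors most often originate.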

DDJ: But the idea of accepting the existence of bugs in programs (or
proofs) somehow doesn't sound like the kind of idea that would sit well with
a mathematician. How would you characterize the acceptance of Mathematica
by mathematicians?

SW: The mathematics community is a most puristic community.
In a sense, I've been pleasantly surprised with how easily Mathematica
has been accepted in that community. There's another thing, quite honestly,
that that community has a hard time with. They sort of hate one aspect
of what I have done, which is to take intellectual developments and make
a company out of them and sell things to people.

DDJ: Probably not surprising, if mathematicians are the most
puristic of scientists.

SW: My own view of that, which has hardened over the years,
is, my god, that's the right thing to do. If you look at what's happened
with TeX, for example, which went in the other direction...well, Mathematica
could not have been brought to where it is today if it had not been done
as a commercial effort. The amount of money that has to be spent to do
all the details of development, you just can't support that in any other
way than this unique American idea of the entrepreneurial company. If
you ask a mathematician why they don't like Mathematica—it's more why
they don't like me than why they don't like Mathematica—that's it.

DDJ: There's a lot of work involved in bringing a product up
to commercial standards and making it something you can support.

SW: In the research I've been doing, one of the people who's
been working with me has developed a nice program that allows you to lay
out networks on the screen. It's a problem that I've wanted to have solved
for ten years or so, and he's got a fairly nice solution and a nice interactive
program and all that. I've talked to people about it, so people know that
my company did this program. The problem is, we didn't develop this program
in a commercial way. We developed it because I needed this thing for solving
a particular problem. Most of the things that have been developed in the
technical computing community have been developed in that kind of way,
and I realized, knowing the standards that one has for having a commercial
product that is properly supported, what incredible distance there is
between this fairly nice piece of code that does something fairly useful
and something like Mathematica. And in fact, I have a hard time even
thinking of giving it to friends of mine because I know they will expect
that, since it was produced by somebody who works for my company, it should
be a thing like Mathematica. I suppose it's obvious, but I think that
many in the mathematics community don't realize the distance.

DDJ: We know you've given a lot of thought to programming paradigms.
Do you have any opinions on the dominant paradigms of today, and about
which will survive into the next decade?

SW: I think the transformational-rule paradigm is working fairly
well. I think the functional paradigm is largely working well. I think
the procedural paradigm sucks, basically. I think the fundamental problem
with it is there's much too much hidden state in the procedural paradigm.
You have these weird variables that are getting updated and things that
are happening that you can't see. I strongly believe that there is a
way to do procedural programming that does not use hidden states. For
example, here's a thing that I'd love to be able to do: make a kind of
symbolic template of the execution history of a program. The kind of thing
that trace does—

DDJ: Right.

SW: —of taking a program that's executing and giving you back
the symbolic representation. What I'd like to be able to do is program
by saying, here's what I want—it's not quite a flowchart, it's something
beyond a flowchart—my program to be like, now I would actually do the
things that make it do this. That's kind of vague, and the reason it's
vague is because I haven't figured out how to do it. But I think one of
the directions that could be very fruitful is how you take these conceptual
ideas about procedural programming and turn them into something that's
easier to look at once you have a program. I mean, the idea of procedural
programming, of loops and so on, people have no trouble grasping. But
once they've written their programs, they have a lot of trouble grasping
what the programs do. And if one could have a more explicit way of representing
these things, one would be in good shape, I think.
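
The hidden-state complaint can be made concrete with a short Python sketch. This is our own illustration and an assumption about what "hidden state" means in practice; it is not the symbolic execution template Wolfram describes, which he says he has not worked out. The function names are hypothetical.

```python
from itertools import accumulate


def total_hidden_state(values):
    """Procedural version: `total` is overwritten on every pass, so the
    intermediate values cannot be seen once the loop has finished."""
    total = 0
    for v in values:
        total += v
    return total


def total_with_history(values):
    """Same computation, but every intermediate state is returned, so the
    whole execution history can be inspected as data afterward."""
    return list(accumulate(values, initial=0))


if __name__ == "__main__":
    data = [3, 1, 4]
    print(total_hidden_state(data))   # only the final value survives
    print(total_with_history(data))   # the full sequence of states
```

The second form gives up nothing in the final answer (it is the last element of the list) while making each step of the program visible.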

DDJ: Yeah, so one rationale behind procedural programming is
that it's easy to learn. But one rationale for a hidden state is an optimization
of some sort.

SW: I don't think people need optimization any more.

DDJ: Oh, really?

SW: There are very few programs that are written for the first
time where execution speed is an issue. When you're running your word
processor, you don't want the scrolling to be slow, but that's a different
point. If you look at the history of programming-language design, almost
every major screw-up is a consequence of people pandering to some optimization,
starting from Fortran Hollerith-format statements. The trick is figuring
out how to get there, rather than worrying all the time about how you're
going to get there. Another direction that I've thought about is parallel
processing and its relationship to languages. There was a language called
C* that I made the original design for, for the Connection Machine. Unfortunately,
what C* finally became was extremely far from what I had worked on. That's
one of the reasons I don't do consulting any more.

DDJ: Parallel processing and its relation to language? What's
the question?

SW: One of the questions is, are there paradigms that are applicable
to parallel programming that aren't applicable to sequential programming,
and what are they? Functional programming, list-based programming, things
like that are readily applicable to parallel systems. In fact, they work
very nicely and elegantly in parallel systems. That is indeed the main
algorithm that is used in the various [parallel] Fortrans. There is a
question, particularly with respect to SIMD architectures, of [whether]
there are other fundamental kinds of programming-language ideas that we
just don't have yet.

DDJ: Do you have an answer?

SW: Well, I've spent a lot of time thinking about them and I
didn't come up with them. It could be that there aren't any. One of the
ways that you can get a clue about this relates to the other side of my
life, which is trying to do science. The kinds of things I'm interested
in are using fundamental ideas from computation to understand more about
scientific systems. And one of the things one is led to there is what
kind of simple computational systems really capture the essence of what's
going on in a biological system, in a growing plant. Figuring
out the answer to that question has been one of my big projects. And what
I've found is that so far, with one exception that I'm still grappling
with, all of the things that I have found to be useful as fundamental
models—whether they're things like Turing machines, or cellular automata,
or register machines, or graphs—turn out to be very simple to do in
our existing programming paradigms. So one question is, is there something
out there in nature that is working according to a different programming
paradigm that we should be able to learn from? If you look at the construction
of organisms, for instance, there are many segmented organisms: biology discovered
iteration. There are many branching organisms: biology discovered recursion.
There are a few other of these kinds of things that are a little less
familiar but are still one line of Mathematica code, and that are commonly
used in biology. Does that mean we've really discovered all the useful
programming paradigms? And if nature presents us with something we can't
understand along those lines, that's a good clue that there's another
programming paradigm out there to be figured out. And as far as I can
tell, there isn't much else out there.
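
To show how little code one of these fundamental models needs in an existing paradigm, here is a minimal Python sketch of a one-dimensional, two-state cellular automaton. The choice of rule 30, the cyclic boundary, and the function names `step` and `run` are all assumptions made for this illustration; the interview does not name a particular rule.

```python
def step(cells, rule=30):
    """Apply one update of an elementary cellular automaton.

    Each new cell is looked up from the bits of `rule`, indexed by the
    (left, center, right) neighborhood read as a 3-bit number.
    The row wraps around at the edges (cyclic boundary).
    """
    n = len(cells)
    return [
        (rule >> (4 * cells[(i - 1) % n] + 2 * cells[i] + cells[(i + 1) % n])) & 1
        for i in range(n)
    ]


def run(cells, steps, rule=30):
    """Return the initial row followed by `steps` successive updates."""
    history = [cells]
    for _ in range(steps):
        history.append(step(history[-1], rule))
    return history


if __name__ == "__main__":
    start = [0, 0, 0, 1, 0, 0, 0]
    for row in run(start, 2):
        print("".join("#" if c else "." for c in row))
```

The whole model is a table lookup applied by iteration, which fits the claim above that such systems are simple to express with the paradigms we already have.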

DDJ: In the second Artificial Life conference proceedings Doyne
Farmer says he's now of the view that partial differential equations are
the most general model....

SW: He is completely and utterly, unquestionably, unequivocally,
totally wrong. It's interesting you would pick him as an example, because
his responses to things that I have been right about in the past have
been as wrong as they could possibly be. As a matter of fact, I always
find it very amusing—the equations of general relativity are partial
differential equations, as you probably know, and there is this fairly
amusing thing that is said about these equations, which is that there
can be singularities, and the laws of physics as we know them break down.
Well, what does this mean? This means the partial differential equations
show singular behavior which can no longer be described by the partial
differential equations. Well, it turns out that in the same sense physics
breaks down around the space shuttle. Because the equations of fluid dynamics
have in them an approximation that works just fine so long as there aren't
certain kinds of strong shock waves involved. When you get into hypersonic
flow, which is what happens around the shuttle as it enters the atmosphere,
the shock front has a thickness that is less than the mean free path that
molecules go before they collide with other molecules. So in this same
sense, physics as we know it breaks down. Of course it doesn't. What actually
is going on is that partial differential equations are an approximation
that turns out to be not a very good approximation in the case of hypersonic
flow. It's actually an interesting historical thing that I've been studying,
how partial differential equations ended up being thought by people to
be the fundamental equations of physics. It's very bizarre, because it
isn't true, and not only is it not true, even the fact that atoms exist
makes it clear that it's not true. So why is it that people will [say]
that the fundamental equations of physics are partial differential equations?
What happened, I think, is that when these models were first developed,
the only method for figuring out what the consequences were was hand
calculation. Computers are a very recent phenomenon in the history of
science, and the fundamental models that exist in science have not yet
adapted to computation. And that's my next big thing.

[Editor's note: In retrospect, Ray Valdés feels he may
have overstated Farmer's position. Here's what Farmer actually said: "Connectionist
models are a useful tool for solving problems in learning and adaptation....
However, connectionism represents a level of abstraction that is ultimately
limited by such factors as the need to specify connections explicitly,
and the lack of built-in spatial structure. Many problems in adaptive
systems ultimately require models such as partial differential equations
or cellular automata with spatial structure." ("A Rosetta Stone for Connectionism,"
in Emergent Computation, MIT Press, 1991, p. 183.)]