A New Kind of Science and the Future of Mathematics

Well, what I'd like to do this morning is to talk about some things I've done
over the course of many years—that I think have some significance for
mathematics and mathematicians. Mostly I'm just going to sketch things; if you
want details, they're mostly in that big book.

Well, I guess it all started with a little computer experiment that I did at
the beginning of the 1980s. That I'd like to show you.

Well, I'd been working on some basic problems in physics, and a little bit
also in biology. And I'd wanted the simplest possible models. And that led me to
a particular kind of very simple model—that turned out to be a version of what
had been called a cellular automaton.

Here's how a one- dimensional cellular automaton works. It's got a line of
cells, each either black or white. [See A New Kind of Science, page 24.]

And it evolves down the page, in discrete steps. [See
A New Kind of Science, page 24.]

With the color of each cell at each step being determined by a fixed rule—from the color of the cell and its neighbors on the step before. So that if a
sub i is the color of the ith cell at step t:

Well, you can represent that rule by an icon that just shows what happens in
each of the 8 possible cases. [See A New Kind of
Science, page 24.]

So starting off for example from a single black cell, this rule just gives a
simple uniform region of black. [See A New Kind of
Science, page
24.]

OK, let's try changing the rule a bit. Here's another example. [See
A New Kind of Science, page
25.]

This gives a checkerboard pattern.

So far none of this is at all surprising. We've got very simple rules, and
we're getting very simple behavior out.

So is that always how it works? Well, take a look at this. [See
A New Kind of Science, page
25.]

It doesn't seem to repeat. Let's run it longer. [See
A New Kind of Science, page
26.]

Well, it's a very regular nested fractal pattern.

And perhaps there's a general result: that if the rule and then initial
conditions are simple, then the behavior will always have to be correspondingly
regular and simple. Well, at least back in 1981 that's what I assumed was true.

But one day I decided to do a systematic experiment. And just to run every
single one of the possible simplest cellular automata.

There are 256 of them, and you can number their rules like this.
[See
A New Kind of Science, page
53.]

And here's what they all do. There's quite a bit of diversity. But mostly the
behavior looks pretty simple. Uniform, or repetitive, or at least nested.
[See A New Kind of
Science, page
55.]

But if you look carefully, there are some cases where other things seem to be
going on. The first one is rule 30. [See
A New Kind of Science, page
27.]

So let's take a look at that.

It seems quite complicated. It's asymmetric because the rule was. Let's try
running it a bit longer. [See A New Kind of
Science, page
30.]

Well, there are a bunch of periodic stripes over on the left. But otherwise,
this looks really complicated. Actually kind of random.

Well, when I first saw this, I thought it must just be a problem with our
visual system. That somehow there is regularity, but our visual system just
doesn't pick it up.

So I started trying all sorts of statistical and mathematical methods on it.
And there are certainly lots of things one can say about rule 30. It's an onto
map. Small perturbations always grow. It has a certain spectrum of periodic
points.

And in the pattern here a depth d diagonal has period at most 2d. On the
left, the period's much much lower. But on the right, it's close to the bound.

And let's say one looks at the center vertical column of cells. One can
imagine using it to make a (quotes) "random walk". Here are the first million
steps. [See A New Kind
of Science, page
871.]

It certainly doesn't ever seem to repeat. And if one adds in an occasional
neighboring cell, one can prove that. And in fact it seems to pass every
standard statistical test of randomness. Including cryptanalytic ones.

And actually we've been using essentially this sequence to make random
integers in Mathematica for the past 16 years. And it's passed zillions of
tests—even while many other proposed pseudorandom generators have
failed. [See A New
Kind of Science, page
30.]

But still, rule 30 is so simple underneath, one might figure that out there
in the world of math, there has to be some way to analyse it.

Well, as a warmup let's take a look at rule 90. [See
A New Kind of Science, page 26.]

It turns out this rule just adds neighbors mod 2.

So it makes Pascal's triangle mod 2. [See
A New Kind of
Science, page
611.]

And that means you can find the color at position x and step t from a binomial
coefficient. Or actually also just by taking bitwise ORs of the digits in the
integers x and t. [See
A New Kind of
Science, page
608.]

Well, what happens if we don't have a fixed array of cells? Here's a
sequential substitution system, that just iteratively rewrites strings.
[See A New Kind of Science, page
90.]

And it takes only simple rules to get very complex behavior. [See A New Kind of
Science, page 92.]

Here's a rather different type of system: a register machine. [See
A New Kind of Science, page
100.]

It's like a minimal idealization of machine language. And this is the smallest
register machine with complex behavior.

Here are what I call symbolic systems. [See
A New Kind of Science, page
102.]

Kind of minimal idealizations of the underlying structure of Mathematica. Or
generalizations of combinators. And again, it's the same story. [See
A New Kind of Science, page
104.]

One thing about all the systems I've showed so far is that they all
just
evolve sequentially from one step to the next. Well, here's one that
doesn't. [See A New Kind of Science, page
205.]

I call it a multiway system. It has rules for rewriting strings, that get
applied in all possible ways. It's kind of like a semi-semi-group. And again,
even simple rules give all sorts of complex behavior. [See
A New Kind of Science, page
206.]

One doesn't actually have to have any kind of process of evolution at all. Like
consider a tiling problem, where one has the constraint that around every cell,
only certain neighborhoods are allowed. [See
A New Kind of Science, page
213.]

Well, with exactly this setup, it turns out that if one can tile the plane at
all, then one can always do it periodically in one of these 171 ways.
[See A New Kind of Science, page
214.]

But if one also says that the plane can't be all white, then about the 19
millionth constraint gives this. [See
A New Kind of Science, page
219.]

It's the first necessarily nonperiodic tiling. Kind of the minimal square
quasicrystal. Or an interesting two-dimensional subshift of finite type.

Well, going a bit further one sees cases like this. [See
A New Kind of Science, page
221.]

A tiling forced to make a rule 30 pattern. Which would be pretty interesting to
find in a real crystal.

Well, OK, what about numbers? Here are the digit sequences one gets by
successively incrementing an integer. [See
A New Kind of Science, page
118.]

A simple nested pattern.

Here are powers of two in base two. [See
A New Kind of Science, page
119.]

Quite trivial. But here are the powers of 3 in base 2. [See
A New Kind of Science, page
119.]

Kind of like a minimal linear congruential pseudorandom generator. And already
remarkably complicated—very much like a cellular automaton. And actually, in
cases like these, it turns out to be exactly a cellular automaton.
[See A New Kind of Science, page
614.]

Here's another example with digits. [See
A New Kind of Science, page
126.]

Iterating addition and digit reversal.

Well, OK, what if one deals with integers as a whole, not digit sequences.
Let's say one defines a recursive function. Well, again there are remarkably
simple ones that give really complex behavior. [See A New Kind of
Science, page
130.]

Primitive recursive functions are particularly easy to enumerate. And then I
can tell you that this is the very simplest one that gives complex
behavior. [See A New Kind of Science, page
908.]

Last summer we had an NKS
summer school, and did a class project on recursive
functions. Where we found some really neat things. Like here's the very simplest
nestedly recursive function with complex behavior.

OK, what about real numbers? Here's a continuous cellular automaton, with
values from 0 to 1 at each cell. [See
A New Kind of Science, page
157.]

Same story.

But what about PDEs? Well, most of the PDEs people normally study have rather
simple behavior, at least with simple initial conditions. [See
A New Kind of Science, page
163.]

Like here's the diffusion equation. And the wave equation. And the sine-Gordon
equation.

But a few years ago I decided just to do a systematic search of possible
PDEs. Many of them have pathologies. But after a little while I found
these. [See A New Kind of Science, page
165.]

They're just simple nonlinear wave equations. But even starting from a little
Gaussian they give all this stuff. That looks a bit like rule 30.

Of course, it's hard to know exactly what these PDEs do. We've been using
these as test cases for numerical PDE solving in Mathematica for ages. But
without already more or less knowing the answer, it's hard for numerical
analysis to be sure things are real.

But with the likes of rule 30 it's a different story. [See
A New Kind of Science, page
30.]

Because the bits are the bits. And so there's no question one's discovered
something real.

And actually it's so straightforward to get, grade schoolers could do it. And
I sometimes wonder whether, a bit like these... [See
A New Kind of Science, page
43.]

...some Babylonian mosaic of rule 30 might one day be unearthed.

But I doubt it. Because I think if rule 30 had been known in antiquity,
science and math would have developed rather differently. Of course, now that
one's seen the rule 30 phenomenon, one can go back and find hints of it from really
long ago.

Like the primes, for example. The Greeks knew an easy rule for making
them. [See A New Kind of Science, page
122.]

But the pattern of primes that comes out is irregular and in some ways quite
random.

But somehow they were always treated as curiosities. That didn't fit
together, and didn't really seem to add up to much.

Well, here's the critical fact: they're all signs of a very general
phenomenon. That's deep and important. And there's a whole new conceptual
framework that can be built around it—a whole new kind of science.
[See A New Kind of
Science, page
30.]

It changes a lot of things. Like the way we approach modelling.

I mean, for three hundred years there's been this idea that if one wants to
make a really good model of something, it should be based on equations. And that
certainly worked well for Newton and friends in figuring out orbits of comets.
And for lots and lots of things since then. But when we see behavior that really
looks complex, it hasn't worked very well.

Now, if one's going to be able to do theoretical science, the systems one
studies had better follow definite rules. But why should those rules be based on
the constructs we've studied in mathematics: on numbers and calculus and so on?
Can't they be more general?

In the past one wouldn't have had much of a way to think about more general
rules. But now we have computers, with programs that in effect implement
arbitrarily general rules. Of course, the programs we use in practice are long
and complicated, set up for particular purposes.

But here's a basic science question that's at the core of the new kind of
science I've been building. If one just enumerates simple programs, what do they
do?

In computer science we're used to studying programs constructed for
particular purposes. But what if we just go out and explore the computational
universe?

It's exciting. It's kind of like 400 years ago when telescopes were first
invented. One could just point one's telescope somewhere in the sky and see
what's out there. Well, now we get to do that in the computational universe.

And actually, being able to do that is one of the main reasons I built
Mathematica. People of course use Mathematica a lot to do traditional math, and
we're proud of that. But at its core Mathematica is a lot more general. General
enough to really let one explore the computational universe.

And it's
incredible what happens when one does that. First, one sees all this strange and
wonderful flora and fauna. But then—and it took me years—one gradually
builds new principles, and new intuition.

Our normal intuition tends to be that it's complicated to make
something complicated. But that's just not true in the general computational
universe. Look at rule 30. But in doing things like engineering we usually have
the constraint that we have to foresee what the things we build are going to do.
And so we've been forced to use only rules that have simple, foreseeable,
behavior.

But the point is that nature is under no such constraint. So there's nothing
wrong with it using rules like rule 30. And I think that's the secret of how it
makes complexity.

Well, OK, so how can we use what we learn from simple programs? We can make
technology. New kinds of devices. Or efficient algorithms that don't just have
iteration, or recursion, but rule 30 type behavior.

We can also make new kinds of models of things. In the past, if we saw this...

...we'd probably assume its rules were really complicated. But now we might just
search, and find that this is all that's going on.

So, OK, so let's say we have a system in nature. Like a snowflake.
[See A New Kind of Science, page
370.]

How do we make a good model of it? Not just a picture, but a model of how it
works. Well, snowflakes grow by adding pieces of ice to their surface. And we
can make a simple hexagonal cellular automaton that does that. [See
A New Kind of Science, page
369.]

Not much like a snowflake. But let's add the next important effect: growth
inhibition from latent heat. Let's try the simplest cellular automaton rule that
captures that. Here's what we get. [See
A New Kind of Science, page
371.]

Remarkably like real snowflakes. It really seems like we've captured what makes
snowflakes have the shapes they do. And we get predictions, like that big
snowflakes have little holes in them. Which indeed they do.

OK, but there are obviously details of snowflakes that we haven't captured.
But that's always going to happen. Because the whole point of a model is to
capture certain features, and idealize everything else away. And depending on
what one's interested in, one may pick different features. And so the cellular
automaton model is good, for example, if one's interested in finding a snowflake
shape distribution. But it's not so useful if one wants to calculate how fast
each arm grows at a certain temperature.

There are certainly traditional PDEs for snowflakes. But they're complicated
and hard to solve. And the point is that for many questions, a cellular
automaton model picks out much more directly what one really needs to know.

And it also lets one generalize, for example enumerating other shapes that
might grow. ([See A New Kind of Science, page
373.]

OK. Let's take another example. Fluid turbulence. [See
A New Kind of Science, page
377.]

Where does all that randomness in a turbulent flow come from?

Well, one can ask that about randomness in any system. And I think there are
really three ways randomness can arise.

First, it can be continually injected from the environment. Like with a boat
bobbing on an ocean. Where there's random motion because of the randomness of
the ocean.

Usually we'd build a stochastic or probabilistic model for that, with
randomness built right in. [See
A New Kind of Science, page
591.]

But in recent decades another source of randomness that's been talked about
is chaos theory. Where the randomness isn't continually injected, but instead
comes from the details of the initial conditions for the system. Like when one
tosses a coin or spins a wheel. Where the final outcome depends sensitively on
the initial speed. [See A New Kind of
Science, page
305.]

Well, there are stronger version of this, where one's effectively taking
numbers that represent initial conditions, and continually excavating higher and
higher order digits from them. Like in the shift map,
x[t+1] = FractionalPart[2 x[t]]. [See A New Kind of Science, page
308.]

Each step shifts every digit one position
to the left. So that eventually even the most insignificant digit will have a
big effect.

Well, so why does this make randomness? It makes randomness if the initial
digit sequence is random. And with the usual view of real numbers, one images
that a typical number picked as an initial condition would almost always have
that kind of randomness.

But if one thinks in terms of information, this isn't really unsatisfactory.
It's like saying that one gets randomness out because one put randomness in. And
that the randomness is just coming from outside the system we're looking at.

Well, is there any other possibility? Yes. Just look at rule 30. There's no
randomness from the environment—it's a deterministic rule. And there's no
randomness in the initial conditions—they're just a single black cell. But
still, the actual evolution of the system intrinsically generates apparent
randomness. [See A New Kind of Science, page
30.]

It's a bit tricky at first to analyse this. Not least because systems often
have both sensitive dependence on initial conditions, and intrinsic randomness
generation. Rule 30 has sensitive dependence; the slopes of these difference
patterns give its Lyapunov exponents. [See A New Kind of Science, page 251.]

And iterated maps can show intrinsic randomness generation. [See
A New Kind of Science, page
150.]

But that's not what dynamical systems theory picks up. Because it's talking
about all possible trajectories. Not just trajectories that have "simple to
make" initial conditions.

But if you really want to explain randomness, you need to get more precise.

Well, let's go back to fluid turbulence. [See A New Kind of
Science, page
377.]

Where does its randomness come from? The Navier–Stokes equations haven't told
us. It could even be that we need to add Brownian motion terms. We know there
are some instabilities, but we don't really know their large-scale effects. And
in numerical experiments, it's hard to distinguish numerical glitches from
genuine randomness.

Well, a different approach to modelling helps with that. At the lowest level a
fluid just consists of a bunch of molecules. And we know that the details aren't
too important—we always get the same continuum behavior. So we can try just
making a minimal model for the molecules. Put them on a discrete grid, with
discrete velocities, and so on. [See A New Kind of
Science, page
378.]

Well, in the appropriate limit, that gives Navier–Stokes, and one can actually
do nice practical fluid dynamics. [See A New Kind of
Science, page
380.]

But one can also investigate some basic questions. And one finds that one can
indeed get large-scale apparent randomness, without any random input. Which
means there's intrinsic randomness generation going on. [See A New Kind of
Science, page
30.]

One can test that in physical experiments. Because if randomness comes from
the environment, or from initial conditions, it'll always be different every
time one runs an experiment. But with intrinsic randomness generation,
sufficiently carefully controlled experiments will give repeatable randomness,
which one can look for. You know, probabilistic models are very common. But I
think a lot of them really aren't necessary. And that a lot of systems can just
make their own randomness, without getting it from outside.

Like in an Ising model. Where one normally assumes that spins are kicked
around by a heat bath. Well, there's a simple deterministic model that seems to
have the same average behavior—and that's probably much closer to an actual
spin system. [See A New Kind of Science, page
982.]

It seems like there are a lot of probabilistic models where there are
deterministic systems that have the same behavior. Like here's a deterministic
2D cellular automaton on a square grid that almost—though not quite—makes
enough randomness to give a circle like a random walk. [See A New Kind of
Science, page
178.]

By the way, I should say something about what I mean by randomness. In everyday
speech, we usually call things random if we can't find a compressed description
of them. Well, the ultimate is algorithmic randomness, where there just is no
compressed description: there's no program shorter than a sequence that will
reproduce it. It's an interesting definition. But it probably says that nothing
in our universe can be considered random. Certainly not a rule 30 sequence.
Because after all, it's generated by a very small program.

But the point is that we can't find that program. Statistical analysis, data
compression, visual perception, cryptanalysis—none of them can crack
rule 30. [See A New Kind of Science, page
561.]

So that's why its sequences seem random. And one can make that more precise by
talking about analysis with all possible simple programs.

This turns out to be related to the Second Law of Thermodynamics: the law of
entropy increase. It's easy to make a system like a cellular automaton
microscopically reversible—so it's bijective, just like the laws of physics
seem to be. [See A New Kind of Science, page
437.]

But the point is that intrinsic randomness generation in effect encrypts even
simple initial conditions to the point where they seem typical of the
ensemble—at least with respect to any reasonably simple computation. And from
this one seems finally to get at least the outline of a real proof of the Second
Law. As well as seeing potential exceptions. [See A New Kind of
Science, page
440.]

OK, well that's a little about physics. What about biology? Where does all
the complexity in biology really come from? [See A New Kind of
Science, page
385.]

It's never been clear why natural selection should give it. And in fact, I
think it's much more just a consequence of properties of simple programs. That
if one picks a genetic program at random, it'll often show complex behavior.

Let me give you an example. Here's a mollusc shell. [See
A New Kind of Science, page
423.]

Which we might have thought must be the result of some sophisticated
optimization process. But actually its pattern is incredibly similar to what we
get just from rule 30. [See A New Kind of Science, page
30.]

In the actual mollusc there's a line of cells that lay down pigment as the
shell grows. And we can model that as a cellular automaton. [See
A New Kind of Science, page
414.]

Well, let's say we just consider all possible rules of the simplest kind.
Here are all the patterns they make. [See
A New Kind of
Science, page
424.]

And here's a remarkable fact. The classes of patterns we get seem to be just
the same as in actual molluscs. [See A New Kind of
Science, page
423.]

It's as if the mollusc species of the Earth just sample different possible
programs, and then we see the results displayed on their shells.

You know, it usually seems hopeless to make a predictive theory of biology.
Because everything seems to depend on historical accidents.

But if different species just uniformly sample simple programs, we get to
make predictions just by knowing abstract properties of programs. Let me give
you another example. Shapes of leaves. [See A New Kind of
Science, page
403.]

They're pretty diverse. But it turns out that there's a simple program that
seems to capture most of them. It's just based on successive repeated
branchings. [See A New Kind of Science, page
400.]

That fill in leaf material. And get shapes that look just like actual leaves—sometimes smooth, sometimes jagged, and so on. [See A New Kind of Science, page
402.]

Well, what happens with all possible such models? Here's a simple case where
it's easy to lay out the possible shapes. [See A New Kind of Science, page
406.]

And one can summarize these in a parameter space set. [See
A New Kind of Science, page
407.]

From whose properties one can deduce all sorts of things about leaves and their
likely evolution.

It's actually a mathematically rather interesting set. With properties a bit
like the Mandelbrot set. By the way, for people who know about such things, my
set is formed by looking at whether each tree reaches a given point. If you look
instead at connectedness, you get a different set that's much less interesting
set. Well, there are lots of applications in biology. Mixing traditional math—like differential geometry—with simple programs gives lots of results on how
things grow. [See A
New Kind of Science, page
418.]

And in molecular biology there's much to say about using primitives based on
simple programs to describe biological processes.

But let me talk a little more about physics. And in particular fundamental
physics.

Well, now that we've seen the things simple programs can do, we might ask the
ultimate question: underneath all of physics could there just be a simple
program? A little program that if run for long enough would reproduce in precise
detail the whole history of our universe. In a sense the ultimate reduction of
physics to an abstract system.

Well, so what might the program for the universe be like? One thing that's
kind of inevitable is that very few familiar features of the universe will
immediately be visible in it. There just isn't room. I mean, in a small program
there's no way to fit in separate identifiable pieces that represent electrons,
or gravity, or even space and time.

In fact, if the program's going to be really small, it sort of has to have
the very least possible structure built in. And for example I think a cellular
automaton already has far too much structure. It's got a whole rigid array of
cells laid out in space. And it also separates the notion of space from the
notion of states of cells. And I don't think one needs even that.

In physics, space is normally a kind of background—on top of which matter
and everything else exists. But in an ultimate model I think space is the only
thing one needs.

So what might space then be? I suspect it's ultimately just a bunch of points
with a certain set of connections. There are various ways to formulate it. But
one is just to say it's a giant trivalent network—a cubic graph.
[See A New Kind of Science, page
1039.]

In which one just specifies the combinatorics of what point is connected to
what.

So how does space as we know it emerge from that? The idea is that with
enough nodes, it approximates a continuum. Like with molecules in a fluid. But
more subtle.

Here are some networks that look like one-, two- and three-dimensional
space. [See A New Kind of Science, page
477.]

But remember, all we've actually specified is how the nodes are connected. So
how do we ultimately tell what dimension a network corresponds to? Just start
from a node. Then look at how many distinct nodes you reach by going r
connections. [See A New Kind of Science, page
477.]

That's essentially the volume of a ball in the network. And if the network is
going to correspond to d-dimensional space, that volume must grow like rd.

Here I've laid out nodes so that their x position is determined by their
minimum distance from a starting point. [See A New Kind of Science, page
479.]

And you can see the different growth rates for different dimensions.

What about curved space? In a flat hexagonal lattice, the number of nodes at
distance r goes exactly like r2. [See
A New Kind of Science, page
532.]

But with curvature there's a correction. That turns out to be proportional to
the Ricci scalar curvature for the limiting manifold. [See A New Kind of
Science, page
1050.]

Well, it's pretty common to go from manifolds to more discrete things. Like
in triangulating a manifold. [See A New Kind of Science, page
533.]

But there one's still giving embedding information: saying how to lay out the
points in an ambient space. In combinatorial topology one gets rid of the
embedding. But one still has data on which nodes bound a face, and so on.

But in my combinatorial networks one doesn't have even that. So it's sort of
remarkable how much limiting differential geometry one can get. Ricci scalars
from growth rates of balls. Ricci tensors from growth rates of cylinders
generated by geodesics. But, significantly, one can't directly get Riemann
tensors. Well, OK, in our universe there's not only space, but also time. And in
ordinary physics, one thinks of time as just another coordinate. But in programs
it tends to be very different. Like in a cellular automaton. Where one moves in
space by going from one cell to another. But one moves in time by actually
applying the cellular automaton rule.

Well, I suspect that's closer to how things work in physics. But it won't be
exactly like a cellular automaton. Because that would need a global clock to
synchronize everything.

OK, so how might it actually work? Well, here's something that at first
sounds kind of crazy. How about if the universe is like a Turing machine—or
what I call a mobile automaton—where only one cell gets updated at each step?
[See A New Kind of Science, page
487.]

There's no problem with synchronization, because only one thing happens at a
time. But how can it possibly be right? I certainly don't have the impression
that for example first I'm getting updated, then you're getting updated, and so
on. But how would I know? Because until I'm updated I can't tell whether you've
been updated or not. And, if one follows this through one realizes that all that
we can ever actually know about is a kind of a causal network of what event
influences what other event.

Here's an example of how one can go from an underlying sequence of updates to
a causal network. And even though each update only affects one cell at a
time, the final causal network—the partially ordered set of events—corresponds to something kind of uniform in spacetime. [See
A New Kind of Science, page
489.]

OK, so how might time work in the space networks I talked about? One could
imagine that there's an updating rule that just says that whenever there's a
piece of network that has a particular form, it should get replaced. Like in
these rules. [See A New Kind of Science, page
509.]

But there's an immediate problem: what if there are several places a
replacement could be done? Different updating orders in general lead to
different causal networks. So one will get a whole tree of possible histories
for the universe. So that to say what actually happens in our universe, one will
somehow have to say which branch one's on.

But there's another possibility. It turns out that with some underlying rules
it actually doesn't matter what order they're applied in: there's what I call
causal invariance, and the causal network is always the same.

This is related to so-called confluence or Church–Rosser properties in
rewrite systems: it also happens to be related to canonical forms in
Mathematica. But anyway, there are conditions about avoiding overlaps, which
turn out for example to allow rules based on graphs like these. [See A New Kind of
Science, page
515.]

So that whenever one uses just graphs like these, one gets causal invariance,
and that means that in a sense there's always just a single thread of time in
the universe.

But it turns out that this also has another, rather remarkable consequence:
it implies that special relativity must hold. It's a slightly complicated story,
but roughly, different updating orders correspond to different spacelike
hypersurfaces. [See A New Kind of Science, page
516.]

And that's why—subject to various tricky issues of limits and averages and
things—causal invariance implies relativistic invariance.

OK, well that's remarkable. But what about general relativity? I mentioned
getting Ricci tensors from space networks. For general relativity, one needs
spacetime Ricci tensors. But one can get those from growth rates of volumes of
cones in causal networks.

And then one discovers something rather remarkable. If there's microscopic
randomness, and if things stay finite dimensional, then in a suitable limit, and
with various other conditions, one can show that the spacetime Ricci tensor must
vanish. Which is equivalent to saying that the Einstein equations hold.

Well, that's rather remarkable. To have started with so little. But to have
been able to derive something sophisticated like that.

But, OK, what about matter, and particles, like electrons and photons?
Remember that all we have in the universe is space. So what are particles? Well,
here's how things can work in something like a cellular automaton.
[See A New Kind of
Science, page
229.]

This starts off randomly. But quickly organizes into a few persistent localized
structures. That act just like particles. Like here are a couple of collisions
between them. [See A New Kind of Science, page
295.]

Two particles come in, and interact, and then a whole bunch of particles come
out. Almost like an explicit version of a Feynman diagram.

Well, OK, so how does something like this work in a network? Here's a very
simple example. Imagine we have rules that always preserve planarity. Then
imagine that we introduce some lumps of nonplanarity. Well, those end up being
conserved until they collide, much like particles. [See A New Kind of Science, page
527.]

And in this simple example we know a lot about how to classify the possible
particle-like objects. Kuratowski's theorem tells us that in our networks single
lumps of nonplanarity must be like K3,3s. But the actual nonplanarity is sort of
hard to localize. A bit like in quantum mechanics.

Well, in more realistic cases it's much more complicated, and there's lots of
unknown graph theory. But there are signs of all sorts of other features of
quantum field theory, and gauge invariance, and so on.

It all looks extremely promising. It really seems like out there in the
computational universe there may be a simple program that's a precise version of
our universe. And if nothing else, the process of trying to find it is bringing
up some pretty interesting issues linking geometry, computation, graph theory,
recursion theory and so on.

Well, OK. I want to come back now to what really launched everything I've
been talking about: the discovery that simple programs—like rule 30—can make
immense complexity. So why does that happen? What's the fundamental
reason? [See A New Kind of Science, page
30.]

Well, to answer that one needs a somewhat new conceptual framework. And the
basis is to think about all processes as computations. The initial conditions
are the input. And the behavior that's generated is the output.

Well, sometimes the computations are ones that we—kind of—immediately
know the point of. Like here's a cellular automaton that computes the square of
any number. [See A New Kind of Science, page
639.]

You give it a block of n cells, and it generates a block of n2 cells. And
here's a cellular automaton that generates the primes. [See
A New Kind of Science, page
640.]

But actually any cellular automaton can be thought of as doing a computation.
It just isn't necessarily a computation that we know the point of beforehand.

OK, so we have all sorts of systems and they do all sorts of computations.
But what general statements can we make? How do all these computations compare?

Well, we might have thought that to do a different kind of computation we'd
always need a different kind of system. But the remarkable idea that's now about
70 years old is that no, that's not true. Instead, it's possible to have a
universal system that can do any computation if fed the right input. And of
course that's been a pretty important idea. Because it's the idea that makes
software possible, and it's the idea that launched the computer revolution.

That with different initial conditions can be made to act like any other
cellular automaton. So here it's behaving like rule 254, which makes a simple
uniform pattern. [See A New Kind of Science, page
646.]

Here's it behaving like rule 90. [See
A New Kind of Science, page
647.]

And here it's behaving like rule 30. And remember, each of these pictures is of
the same universal cellular automaton, with the same underlying rules. But
what's happening is that by giving different initial conditions, we're
effectively programming the universal cellular automaton to emulate other
cellular automata. And in fact it can emulate absolutely any other cellular
automaton, with rules of any size.

One might have thought that a system would only be able to emulate systems that
were somehow simpler than itself. But universality says that there are fixed
systems that can emulate any system, however complicated. So in a sense, once
one's got universality, one's maxed out from a computational point of view—one can do essentially any computation, however sophisticated.

OK, so what systems are like rule 30? Or systems we see in nature? How
sophisticated are the computations that they're doing? [See
A New Kind of Science, page
30.]

Well, I spent a long time thinking about this and accumulating all sorts of
evidence. And I ended up concluding something that at first seems pretty
surprising. I call it the Principle of Computational Equivalence. It's a very
general principle. And in its roughest form what it says is this: that essentially
any time the behavior of a system looks complex, it will correspond to a
computation of exactly equivalent sophistication. If we see repetitive or nested
behavior then it's pretty obvious that it corresponds to a simple
computation. [See A New
Kind of Science, page
58.]

But what the Principle of Computational Equivalence says is that when we don't
see regularities like those, we're almost always seeing a process that's in a
sense maximally computationally sophisticated.

Now at first that's pretty surprising. Because we might have thought that the
sophistication of the computations that get done would depend on the
sophistication of the rules that got put in. But the Principle of Computational
Equivalence says it doesn't. And that immediately gives us a prediction. It says
that even though their rules are extremely simple systems like rule 30 should be
computationally universal. [See A New Kind of Science, page
30.]

Well, normally we'd imagine that to achieve something as sophisticated as
computation universality we'd somehow need sophisticated underlying rules. And
certainly actual computers have CPU chips with millions of components, and so
on. But the Principle of Computational Equivalence says you don't need all of
that. It says that even cellular automata with very simple rules should be
universal.

It's got a very simple rule. But as you can see, it does some fairly complicated
things. It's got lots of little structures running around doing things. But can
one really assemble them to get something one can see is universal? Well, after
a lot of painstaking work, one gets the result: that indeed rule 110 is
universal!

Well, that's just what the Principle of Computational Equivalence said should
be true. But it's really a remarkable thing. Because it means that this little
rule

can in effect produce behavior that's as complex as any system. One doesn't need
anything like a whole computer CPU to do universal computation. One just needs
this little rule.

And that has some very important consequences, for example in thinking about
nature. Because we wouldn't expect to find whole computer CPUs just lying around
in nature, but we definitely can expect to find things like rule 110. And that
means for example that lots of everyday systems are likely to be universal. By
the way, in the past this was the simplest known universal Turing machine.

But actually this much simpler Turing machine is also universal.
[See A New Kind of Science, page
707.]

And I suspect this is the very simplest possible one that's universal.
[See A New Kind of Science, page
709.]

Well, there's a lot to say about what kind of a thing the Principle of
Computational Equivalence is, and what it means. One thing it does is to make
Church's thesis definite by saying that there really is a hard upper limit on
the computations that can be done in our universe. But the place where it really
starts to get teeth is when it says that not only is there an upper limit—but
that limit is actually reached most of the time. The very simplest rules will
just give simple behavior—say repetitive or nested. But what the Principle of
Computational Equivalence says is that if one looks at just a few more rules,
one will suddenly cross a threshold—and end up with maximal computational
sophistication.

Actually the principle goes even further than that. Normally with universal
computation one imagines being able to set up whatever initial conditions one
wants. But the Principle of Computational Equivalence says that that's not
necessary. It says that even when the initial conditions are simple, there'll
still usually be maximally sophisticated computation going on. OK, so what does
this all mean? Well, first it gives us a way to answer the original question of
how something like rule 30 manages to seem so complex. [See A New Kind of Science, page 30.]

The point is that there's always a competition between an observer and a system
they're observing. And if the observer is somehow computationally more
sophisticated than the system, then they can in a sense decode what the system
is doing—so it'll look simple to them. But what the Principle of Computational
Equivalence says is that the observer will usually be exactly computationally
equivalent to the system they're observing. So inevitably the behavior of the
system will seem to them complex.

Well, a related consequence of the Principle of Computational Equivalence is
a phenomenon I call computational irreducibility.

Let's say you know the rules and the initial conditions for a system. Well,
then you can certainly work out what the system will do just by explicitly
running it. But the question is whether you can somehow shortcut that process.
Can you for example just work out a formula for what will happen?

That kind of computational reducibility is at the core of most traditional
theoretical science. If you want to work out where an Earth in an elliptical
orbit will be a million years from now, you don't have to trace a million
orbits; you just have to plug a number into a formula. But the problem is, what
happens when the behavior is more complex? It's easy to shortcut repetitive
behavior, and fairly easy to shortcut nested behavior.

But what about a case like this? There's certainly no obvious way to shortcut
this, and in fact I think it's computationally irreducible: there's essentially
no way to work out what the system will do by any procedure that takes less
computational effort than just running the system and seeing what happens.
[See A New Kind of
Science, page
740.]

In traditional theoretical science, there's sort of been an idealization made
that the observer is infinitely computationally powerful relative to the system
they're observing. But the point is that when there's complex behavior, the
Principle of Computational Equivalence says that instead the system is just as
computationally sophisticated as the observer. And that's what leads to
computational irreducibility. And that's in a sense why traditional theoretical
science hasn't been able to make more progress when one sees complexity. There
are always pockets of reducibility—like the kinds that can be represented by
all those special functions of mathematical physics on our Functions Site—but
there's always a core of computational irreducibility.

Well, I think computational irreducibility is a pretty important phenomenon.
With some pretty interesting philosophical as well as scientific implications.
It's also in a sense what's inside the phenomenon of undecidability. Look at
this cellular automaton. [See A New Kind of Science, page
754.]

And ask the question: starting from a given initial condition, will the pattern
that's produced eventually die out, or will it just keep going forever? In the
first case, running the cellular automaton tells one that after 36 steps it dies
out. In the next case, though, it takes 1017 steps to find that out. And in the
next two cases, even after 10 million steps it's still not clear what's going to
happen.

And the point is that if there's computational irreducibility, there's no way
to shortcut this evolution—so no finite computation can always figure out what
will happen after an infinite time. And that means one has to say that it's in
general formally undecidable.

Well, undecidability has been known about in mathematics and in computer
science since the 1930s. But what we're seeing is that it's also relevant to
natural science.

If one asks about infinite time or infinite size limits the answers can be
undecidable. Like whether a body will ever escape in a three-body problem.
[See A New Kind of
Science, page
973.]

Or whether there's a way to arrange some complicated molecule into a crystal
below a certain temperature. Or whether some idealized biological cell line will
grow forever or will eventually die out.

Well, there's another big place where I think computational irreducibility is
very important—and that's in the foundations of mathematics. It may sound kind
of obvious, but it's really a deep fact that mathematics is often hard to do.
Yet it's based on pretty simple axioms. In fact, right here are ones for
essentially all of current mathematics—from logic, to algebra, geometry, and
set theory. [See A New
Kind of Science, page
773.]

But even though these axioms are all quite simple, our proofs of things like the
Four-Color Theorem or Fermat's Last Theorem are really long. And one can think
of that as another example of computational irreducibility.

Here's an example of a simple proof. At the top are some axioms—in this
case specifying equivalences in logic. And then here's a proof. It starts from
one expression, then keeps on using the axioms, and eventually proves the
theorem that the expression at the top is equivalent to the one at the
bottom. [See A New Kind of Science, page
775.]

Well, OK, just like we made minimal models of natural systems like
snowflakes, we can also think about doing that for math. So as a kind of minimal
idealization of math, one can imagine that the axioms just define
transformations between strings. So with the axioms at the bottom here, here are
proofs of a few theorems. [See A New Kind of Science, page
776.]

But how long do the proofs need to be? Well, here's a picture for three axiom
systems, showing the network of all possible transformations. And the way this
works, each possible path through the network corresponds to the proof of some
theorem. [See A New Kind of Science, page
778.]

Well, the point is that the shortest path from one particular string to
another may be really long. And that means that the theorem that the strings are
equivalent has only a really long proof.

Well, when people were thinking about formalizing mathematics a century ago,
they kind of assumed that in any reasonable axiom system it'd always eventually
be possible to give a proof of whether a particular statement was true or false.
So it was a big shock in 1931 when Gödel's theorem showed that that wasn't true
for Peano arithmetic.

What Gödel actually did, though, was to look at the statement "this statement
is unprovable". Well, just by what it says, the statement obviously can't
consistently be proved true or false. But as such it doesn't seem like a
statement in arithmetic. And Gödel's real achievement was essentially to show
that arithmetic is universal, so that in particular it can encode his statement.

Well, for the foundations of math, Gödel's theorem was a pretty big deal. But
somehow in all these years it's never seemed too relevant to most of the things
working mathematicians deal with. And if you really needed something as funky as
Gödel's statement to get undecidability that wouldn't be surprising.

But here's the thing: the Principle of Computational Equivalence should also
apply to systems in mathematics. And it then says that computational
irreducibility—and undecidability—should actually not be rare at all.

So where are all these undecidable statements in mathematics? Well, one knows
that there are Diophantine equations about which there are undecidable
statements. Here's an explicit example—a Diophantine equation set up to
emulate rule 110. [See A New Kind of Science, page
786.]

Well, this is obviously pretty complicated, and not something that would show
up everyday. But what about simpler Diophantine equations? Here are a bunch.
Well, linear Diophantine equations were cracked in antiquity. Quadratic ones two
centuries ago. And so far another kind seems to be cracked roughly every fifty
or a hundred years. But I'm guessing that that's not going to go on. And that
actually many of the currently unsolved problems in number theory will turn out
to be undecidable. [See A New Kind of Science, page
790.]

OK, but why has so much math successfully been done without running into
undecidability? I think it's kind of like theoretical physics: it's tended to
stick to places where there's computational reducibility, and where its methods
can make progress.

But at least in recent times mathematics has sometimes prided itself on
somehow being as general as possible. So then why haven't rule 30 and rule 110
and all the phenomena I've found in simple programs showed up?

Well, I think part of the reason is that mathematics isn't really quite as
general as one might think. As one way to see what it could be, one can just
imagine enumerating possible axiom systems. And for example this shows what
theorems are true for a sequence of different axiom systems. It's like an
ultimately desiccated form of mathematics: the axioms go down the left, the
theorems go across the top—and every time there's a theorem that's true for a
particular axiom system there's a black dot. [See A New Kind of Science, page
812.]

So is there something special about the actual axiom systems that get used in
mathematics? Perhaps something that makes undecidability less rampant? Well, if
one looks at axiom systems from textbooks, they're usually pretty complicated.
Here's propositional logic, for instance. [See A New Kind of Science, page
773.]

Well, actually, it's been known for a hundred years that one doesn't have to
have all those operators in there. The single NAND operator—or Sheffer
stroke—is enough. But the obvious axiom system with that is still pretty
complicated. [See A New Kind of Science, page
773.]

Well, from my intuition from the world of simple programs, I suspected that
there should actually be a really simple axiom system for logic—probably with
just one axiom. And so I just did a systematic search, and lo and behold, I
found one. [See A New Kind of Science, page
773.]

And I can tell you that this is the very simplest possible axiom system for
logic. Well, knowing this I can say that if one just enumerates axiom systems,
logic will be about the 50,000th one that one reaches. [See
A New Kind of Science, page
773.]

But what about all the other ones? Well, most of them are perfectly
reasonable axiom systems too. They just don't happen to be known fields of
mathematics.

But why not? Well, actually what I think is that mathematics has been in a
sense tremendously constrained in the way it's developed. And in fact at some
level it's really still just about various direct generalizations of the
arithmetic and geometry that got studied in ancient Babylon. It's explored only
a tiny part of what's out there in the general computational world.

But why really is that? I think a lot of it is the way generalization has
been done in mathematics. Usually one has some object, that satisfies certain
theorems. And then one generalizes by looking for a larger class of objects that
satisfy the theorems.

Well, this is a good strategy if one wants to be sure one can still prove
theorems. And probably it's why undecidability hasn't been more common in math
research. But it means that mathematics hasn't particularly sampled what's out
there in the general world of abstract systems.

In natural science we're used to just doing experiments to find out what's
there. But in mathematics, we haven't been. So we just haven't known about all
this stuff.

But OK, is the stuff actually "interesting"? Well, it seems important for
understanding nature. And for making technology.

But what about for math? I mean, if we're just handed an axiom system, what
would make it "math-like"?

Maybe it's like defining life. It's easy to tell living systems on Earth,
because they all have common historical details. But abstract definitions—from
thermodynamics, self-reproduction, etc.—don't really seem to work.

So what about math? Is there a definition that's more than historical? That's
one of those foundational questions with very practical consequences. Because in
Mathematica
we'd really like to make a sort of intermediate layer between the
decision procedures for definite mathematical theories (yes, that's new Version
5 stuff), and the general symbolic language underneath.

Well, so if we look at existing math axiom systems, can we tell what's special
about them? There doesn't seem to be anything in the distribution of theorems
they make true. Or in the lengths of proofs one finds in them.

One might think that what's special is that the axioms are actually "about
things". That somehow the constraints they define force one to be talking about
some definite kind of object. But actually there are often a lot of
"models" that are consistent with a given axiom system. So there's nothing
special there. [See A New
Kind of Science, page
796.]

In an existing field of mathematics it's interesting to do a sort of
empirical metamathematics. Looking at the dependency network of known theorems.
Like here's the one for Euclid's Elements. The longest proof is 30 steps—for
the theorem that there are five Platonic solids. [See A New Kind of Science, page
1176.]

Well, one can ask many global questions about theorem networks. Like
geometrical ones. And one can also try to find precise graph properties that
capture our general notions of powerful theorems, or surprising theorems—or
interesting theorems.

Here are the first few hundreds theorems of propositional logic, written out
in a straightforward order. [See A New Kind of Science, page
817.]

The ones that are highlighted are the ones considered interesting enough to be
given names in typical textbooks. Well, here's a remarkable fact. You can
predict which ones those will be. It turns out they are exactly those theorems
that can't be deduced from ones earlier in the list. They're the first ones that
in a sense give new pieces of information.

Well, so that's a criterion for interestingness. And one can apply it to any
axiom system at all. And sort of automatically begin to build up traditional
math-like structure. But without the history.

Well, OK, so what's going to happen with all the stuff I've talked about here
today? The big book I spent a decade writing has lots and lots of detailed
results. That'll certainly take quite a while to absorb.

But I think three big things will emerge. First, a new kind of basic
science. Like a physics, or a chemistry. Or even a mathematics. But concerned
with what's out there in the computational universe. Second, a whole bunch of
applications. To science. To technology. To other areas. And third, a new way of
thinking. That's relevant for many things. Including for example education.

You know, it's been exciting in the year and a half since my book came out to
watch some of this start to unfold. It's been tremendous to see how many
talented people want to get involved.

It's a challenge to set up the infrastructure for it all. We've been building
a growing website.

We had a very successful NKS 2003 conference. And we're having an NKS 2004
conference in Boston in April. We had a summer school last summer. And we're
having another one this year.

There's a lot to do. And I've started writing down some of the obvious open
problems and projects. It's averaging more than one per page of the book. So
there'll be more than a thousand in the end.

Really a lot is happening. Lots of papers being published. Particularly in
what one might call Applied NKS.

In Pure NKS, a big initiative is the atlas we're building. A terabyte or so
of data about what's out there in the computational universe. That one can
consult and search to find models, or algorithms, or for example raw material
for math. Like here's a trip into a tiny corner of the latest development
version of the Atlas. We could
look at the live version later. We're looking at
some properties of rule 30. Simple initial conditions.

Single black cell.

State transition graphs.

And a table of the one for all rules.

Can you explain those pictures? In this particular corner, there's some nice
combinatorics, and group theory, involved. And there's really a lot that can be
done in pure NKS with existing mathematical methods. And we're counting on
mathematicians to get involved.

One of the exciting things about NKS is how broad the base of people is who
can get involved. If they understand the methodology of doing good computer
experiments, it's perfectly realistic for a high school student to discover
something new and interesting. But there are also plenty of things that reach
the frontiers of math.

I wrote my book to be very accessible. Though with lots of definite stuff,
particularly in the notes. And I think that worked out well. Because it's let a
tremendous range of people read it. And begin to get involved with it.

And the result right now feels a little like an intellectual analog of a new
industry starting up. One of those rare exciting times when a big new direction
is just getting defined.

To cover what's in the book in a serious way is probably a several-semester
undertaking. So I've only been able to scratch the surface and sketch a few
things here today. But I hope I've been able to give you at least some idea of
what's in that big book of mine. And what at least I've been so excited about
all these years. Thank you.