# 29412 Cust: AddisonWesley Au: Bar-Yam Pg. No. 699
Title: Dynamics Complex Systems
Shor t / Normal / Long
8 Human Civilization I: Defining Complexity
Conceptual Outline
Our ultimate objective is to consider the relationship of a human being to
human civilization, where human civilization is considered as a complex system. We
use this problem to motivate our study of the definition of complexity.
The mathematical definition of the complexity of character strings follows
from information theory. This theory is generalized by algorithmic complexity to allow
all possible algorithms that can compress the strings. The complexity of a string is
defined as the length of the shortest binary input to a universal Turing machine, such
that the output is the string.
The use of mappings from strings onto system states allows us to apply the
concepts of algorithmic complexity to physical systems. However, the complexity of
describing a microstate of the system is not really what we mean by system
complexity. We define and study the complexity profile, which is the complexity of a
system observed with a certain precision in space and time.
We estimate the complexity of various systems, focusing on the complexity
of a human being. Our final estimate is based upon a combination of the length of
descriptions in human language, genetic information in DNA, and component counting.
8.1 Motivation
8.1.1 Human civilization as a complex system
The subject of this and the next chapter is human civilization: the collection of all
human beings on earth. Our long-term objective is to understand whether and how
we can treat human civilization as a complex system and, more particularly, as a
complex organism. In biology, collections of interacting biological organisms acting
together are called superorganisms. At times, we will adopt this convention and refer to
civilization as the human superorganism. Much of what we discuss is in early stages
of development and is designed to promote further research.
Bar-YamChap8.pdf 3/10/02 10:52 AM Page 699
This subject is distinct from the others we have considered. The primary
distinction is that we have only one example of human civilization. This is not true about the
systems we have discussed in earlier chapters, with the exception of evolution
considered globally. The uniqueness of the human superorganism presents us with
questions of fundamental interest in science, related to how much we can know about an
individual system. When there are many instances, we can use information provided
by various examples and the statistics of their properties. When there is only one
system, to understand its properties or predict its behavior we must apply fundamental
principles that are valid for all complex systems. Since the field of complex systems is
dedicated to uncovering such principles, the subject of the human superorganism
should be considered a premier area for application of complex systems research.
Central questions are: How can we characterize this complex system? How can we
determine its properties? What can we tell about its dynamics, both past and future? We
note that as individuals we are elements of the human superorganism; thus our
spatial and temporal experience may very well be more limited than that appropriate for
analyzing the human superorganism.
The study of human civilization is guided by historical records and
contemporary news. In contrast to protein folding, neural networks, evolution and
developmental biology, there are few reproducible laboratory experiments. Because of the
irreproducibility of historical or contemporary events, these sources of information are
properly not considered part of conventional science. While this can be a limitation,
it is also apparent that there is a large amount of information available. Our task is to
develop systematic methods for considering this kind of information that will enable
us to approach questions about the nature of human civilization as a complex system.
Various aspects of these problems have been studied by historians, anthropologists
and sociologists.
Why consider human civilization as a single complex system? The recently
discussed concept of a global economy, and earlier the concept of a global village,
suggest that we should consider the collective economic behavior of human beings and
possibly the global social behavior as a single system. Considering civilization as a
single entity, we are motivated to ask various questions about it. These questions relate to
all of the topics we have covered in the earlier chapters: spatial and temporal
structure, evolution and development. We would also like to understand the interaction of
human civilization with its environment.
In developing an understanding of human civilization, we recognize that a
widespread view of human civilization as a single entity is relatively new and driven
by contemporary developments. At least superficially, the historical epoch described
by the dominance of nation-states appears to be quite different from the present
global economy. While recent events appear to be of particular significance to the
global view, our questions must be addressed in a historical context. Thus we should
include a discussion of the transition to a global economy. We postpone this
historical discussion to the next chapter because of the groundwork that we would like to
build in order to target a particular objective for our analysis: that of complexity
classification.
We are motivated to understand complexity in the context of our effort to
understand the nature of the human superorganism, or the nature of the global
economy. We would like to identify the type of complex system it is, that is, to classify it. The first
distinction that we might make is between a complex material and a complex organism
(see Section 1.3.6). Could part of the global system be modified without affecting the
whole? From historical evidence discussed in the next chapter, the answer appears to
be no. This indicates that human civilization is a complex organism. The next
question we would like to ask is: What kind of complex organism is it? By analogy we could
ask: Is it like a protein, a cell, a plant, an insect, a frog, a human being? What do we
mean by using such analogies? At least in part the problem is to describe the
complexity of an entity’s behavior. Intuitively an insect is a simpler organism than a
human being, and this is of qualitative importance for our understanding of their
differences. The degree of complexity should provide a scale that can distinguish
between the many different complex systems we are familiar with.
Our objective in this chapter is to develop a quantitative definition of
complexity and behavioral complexity. We then apply the definition to various complex
systems. The focus will be on the complexity of an individual human being. Once we
have established our complexity scale we will be in a position to apply it to human
civilization. We will understand formally why a collection of complex systems (human
beings) may be, but need not be, complex. Beyond recognizing human civilization as
a complex system, it is far more significant to identify the degree of its complexity. In
the following brief sections we establish some additional context for the importance
of measuring complexity using both unconventional and conventional examples of
organisms whose complexity should be evaluated.
8.1.2 Scenario: alien encounter
The possibility of encountering alien life has been debated within the scientific
community. In popular literature, such encounters have been portrayed in various forms
ranging from benevolent to catastrophic. The scientific debate has focused thus far on
topics such as the statistics of planet formation and the likelihood that planets
contain life. The presence of organic molecules in meteorites and interstellar gasses has
been interpreted as suggesting that alien life is likely to exist. Efforts have been made
to listen for signs of alien life in radio communications and to transmit information
to aliens using the Voyager spacecraft, which is leaving the solar system marked with
information about human beings. Thus far there has been no scientifically confirmed
evidence for the existence of alien life. Even a single encounter would change the
human perspective on humanity’s place in the universe.
Let us consider one possible scenario for an encounter. An object that flashes
light intermittently is found in orbit around one of the planets of the solar system.
The humans encountering this object are faced with the question of determining
whether the object is: (a) a signal device, specifically a recording, (b) a
communication device, or (c) a living organism. The central problem can be seen to revolve
around determining whether, and in what way, the device is responsive to external
phenomena. Do the flashes of light occur without regard to the external environment
in a predetermined sequence? Are they random? If the flashes are sensitive to the
environment, then what are they sensitive to? We will see that these questions are
equivalent to the question of determining the complexity of the object’s behavior.
The concept of life in biology is often defined, or better yet, characterized, in
terms of consumption, excretion and reproduction. As a definition, these
characteristics are well known to be incomplete, since there are life-forms that do not
reproduce, such as the mule. Furthermore, a particular individual is still considered alive
even if it/he/she does not reproduce. Moreover, there are various physical systems
such as crystals and fire that have all these characteristics in one form or another.
In addition, there does not appear to be a direct connection between these biological
characteristics and other characteristics of life such as sentience and self-awareness.
When considering behavior, the biological perspective emphasizes the survival
instinct as characteristic of life. There are exceptions to this, since there exist life-forms
that are at times suicidal, either individually or collectively. The question of whether
an organism actively seeks life or death does not appear to be a characterization of life
but rather of life-forms that are likely to survive. In our discussions, we may be
developing an additional characterization of life in terms of behavioral complexity.
Definitions of life are often considered in speculating about the rights of and
treatment of real or imagined organisms: injured or unconscious humans, robots, or
aliens. The degree of behavioral complexity is a characterization of life-forms that
may ultimately play a role in informing our ethical decisions with respect to various
biological life-forms, whether terrestrial or (if found) alien, and artificial life-forms
that we create.
8.1.3 Scenario: blood cells
One of the areas briefly touched upon in Chapter 6, which is at the forefront of
complex systems research, is the study of the immune system. Blood cells, unlike other cells
in the body, are mobile on a length scale that is large compared to their size. In this
characteristic they are more similar to independent organisms than to the other cells
of the body. By their migration they might be said to “choose” to associate with other
cells of the body, or with foreign chemicals and cells. It is fair to say that our
understanding of the behavior of immune cells remains primitive. In particular, the variety
of possible chemical interactions between cells has only begun to be mapped out. These
interactions involve a variety of chemical messengers. More direct cell-to-cell
interactions where parts of the membrane or cellular fluid are transferred are also possible.
One of the interesting questions that can be asked is whether, or at what level of
complexity, the interactions become identifiable as a form of language. It is not
difficult to imagine, for example, that a chemical communication originating from one
cell might be transferred through a chain of cell interactions to a number of other
cells. In the context of the discussion in Section 2.4.5, the question of existence of a
language might be formulated as a question about the possibility of messages with a
grammar, that is, a combinatorial composition of parts that are categorized like parts of
speech. Such combinatorial mechanisms are known to exist even at the molecular
level in the DNA coding of antibody receptors that are a composite of different parts
of the genome. It remains to be seen whether intercellular communication is also
generated in this fashion.
In the context of this chapter we can reduce the questions about the immune cells
to a single one: What is the degree of complexity of the behavior of the immune
cells? By its very nature this question can only be answered once a complete
understanding of immune cell behavior is reached. A limited understanding establishes a
lower bound for the complexity of the behavior. It should also be understood that
different types of cells will most likely have quite different levels of behavioral
complexity, just as different animals and man have differing levels of complexity. Our
objective in this chapter is to show that it is possible to quantify the concept of complexity
in a way that is both natural and useful. The practical application of these definitions
is a central challenge for the field of complex systems.
8.1.4 Complexity
Mathematical definitions of the complexity of systems are based upon the theories of
information and computation discussed in Sections 1.8 and 1.9. In Section 8.2 they
will be used to treat complexity in the context of mathematical objects such as
character strings. To develop our understanding of the complexity of physical systems
requires that we relate these concepts to those of thermodynamics (Section 1.3) and
various extensions (e.g., Section 1.4) that enable the treatment of nonequilibrium
systems. In Section 8.3 we discuss relevant concepts and tools that may be used for this
purpose. In Section 8.4 we use several semiquantitative approaches to estimate the
value of the complexity of specific systems.
Our use of the word “complexity” is specified as an answer to the question, How
complex is it? We say, Its complexity is <number><units>. Intuitively, we can make a
connection between complexity and understanding. When we encounter something
new, whether personally or in a scientific context, our objective is to understand it.
The understanding enables us to use, modify, control or appreciate it. We achieve
understanding in a number of ways: through classification, description and ultimately
through the ability to predict behavior. Complexity is a measure of the inherent
difficulty of achieving the desired understanding. Simply stated, the complexity of a system
is the amount of information necessary to describe it.
This is descriptive complexity. For dynamic systems the description includes the
changes in the system over time. We will also discuss the response of a dynamic
system to its environment. The amount of information necessary to describe this
response is a system’s behavioral complexity. To use these definitions of complexity we
will introduce mathematical expressions based upon the theory of information.
The quantitative definition of information (Section 1.8) is relatively abstract.
However, it can be measured in familiar terms such as by the number of characters in a
text. As a preliminary exercise in the discussion of complexity, the reader is invited to
exercise intuition to estimate the complexity of a number of systems. Question 8.1.1
includes a list of systems that are designed to stimulate some thought about
complexity as a quantitative measure of the behavior of a system. The reader should devote
some thought to this question before proceeding with the rest of the text.
Question 8.1.1 Estimate the complexity of some of the systems in the
following list. For this question use an intuitive definition of complexity:
the amount of information that would be required to describe the
system or its behavior. We use units of bits to measure information. However,
to make it easier to visualize, you may use other convenient units such as
words or pages of text. So, we can paraphrase the question as, How much
would you have to write to describe the system behavior? A rough
conversion factor of 1 bit per character can be used to convert these estimates to
bits. It is not necessary to estimate the complexity of all the systems on the
list. Considering even a few of them is sufficient to develop an
understanding of some of the issues that arise. Indeed, for some of these systems a rough
estimate is far from trivial. Answers to this question will be given in the text
in the remainder of this chapter.
Hint You may find that you would use different amounts of
information depending on what aspects of the system you are describing. In such
cases try to give more than one estimate or a range of values.
Physical Systems:
Ideal gas (1 mole at T = 0°K, P = 1 atm)
Water in a glass
Chemical reaction
Brownian particle
Turbulent flow
Protein
Virus
Bacterium
Immune system cell
Fish
Frog
Ant
Rabbit
Cow
Human being
Radio
Car
IBM 360
Personal Computer (PC/Macintosh)
The papers on your desk
A book
A library
Weather
The biosphere
Nature
Mathematical and Model Systems:
A number
Iterative maps (growth, bifurcation to chaos)
1-D random walk
short time
long time
Ising model (ferromagnet)
Turing machine
Fractals
Sierpinski gasket
3-D random walk
Attractor neural network
Feedforward neural network
Subdivided attractor neural network
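As a rough aid for Question 8.1.1, the conversion from "how much would you have to write" to bits can be sketched in a few lines. The factor of 1 bit per character comes from the question itself; the page layout constants (words per page, characters per word) are assumptions chosen only for illustration and are not taken from the text.

```python
# Rough conversion of a written description into bits.
# BITS_PER_CHAR follows the conversion factor suggested in Question 8.1.1.
# WORDS_PER_PAGE and CHARS_PER_WORD are illustrative assumptions.
BITS_PER_CHAR = 1
CHARS_PER_WORD = 6      # assumed: about five letters plus a space
WORDS_PER_PAGE = 500    # assumed page density


def words_to_bits(words, chars_per_word=CHARS_PER_WORD, bits_per_char=BITS_PER_CHAR):
    """Estimated information, in bits, of a description `words` long."""
    return words * chars_per_word * bits_per_char


def pages_to_bits(pages, words_per_page=WORDS_PER_PAGE):
    """Estimated information, in bits, of a description `pages` long."""
    return words_to_bits(pages * words_per_page)


if __name__ == "__main__":
    for label, pages in [("a paragraph", 0.2), ("ten pages", 10), ("a 300-page book", 300)]:
        print(f"{label}: about {pages_to_bits(pages):,.0f} bits")
```

Giving a low and a high page count for the same system is one way to follow the hint and report a range of values rather than a single estimate.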
8.2 Complexity of Mathematical Models
Complexity is a property of the relationship between a system and various
representations of the system. Our objective is to understand the complexity of systems
composed of physical entities such as atoms, molecules or cells. Abstract representations
of such systems are described in terms of characters or numbers. It is helpful to
preface our discussion of physical systems with a discussion of the complexity of the
characters or numbers that we use to represent them.
8.2.1 Information, computation and algorithmic complexity
The discussion of Shannon information theory in Section 1.8 was based on strings of
characters that were generated by a source. The source generates each string, $s$, by
selecting it from an ensemble. The information from a particular string was defined as
$$ I = -\log(P(s)) \tag{8.2.1} $$
where $P(s)$ is the probability of the string in the ensemble. If all strings have equal
probability then this is the logarithm of the number of distinct strings. The source
itself (or the ensemble) was characterized by the average information of a large
number of strings
$$ \langle I \rangle = -\sum_{s} P(s) \log(P(s)) \tag{8.2.2} $$
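The two expressions above can be checked numerically with a minimal sketch. It assumes base-2 logarithms, so that information comes out in bits, and it represents the ensemble simply as a list of probabilities; the function names are illustrative.

```python
import math


def information(p):
    """Information of one string with ensemble probability p, Eq. (8.2.1), in bits."""
    return -math.log2(p)


def average_information(probs):
    """Average information of an ensemble, Eq. (8.2.2), in bits.

    `probs` lists P(s) for every string s; zero-probability strings
    contribute nothing to the sum.
    """
    return sum(-p * math.log2(p) for p in probs if p > 0)


if __name__ == "__main__":
    uniform = [1 / 8] * 8            # eight equally likely strings
    skewed = [0.7, 0.1, 0.1, 0.1]    # assumed example of a biased ensemble
    print(information(1 / 8))        # equals log2 of the number of distinct strings
    print(average_information(uniform))
    print(average_information(skewed))
```

For the uniform ensemble both quantities equal the logarithm of the number of distinct strings, as stated in the text; for the skewed ensemble the average is smaller than the uniform value over the same number of strings.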
It was also possible to consider a more general source that selected characters to form
a Markov chain. The probabilistic coupling between sequential characters reduced the
information content of the string. It was possible to compress the string using a
reversible coding algorithm (computation) that would enable the same information to
be represented in a more compact form. The length of the shortest binary compact
form is equal to the average information in a string.
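The reduction in information caused by coupling between sequential characters can be illustrated with a small sketch. It assumes a two-character Markov source with a hypothetical transition matrix `T` and its stationary distribution `pi`, and compares the average information per character with and without the coupling. The numbers are chosen for illustration only.

```python
import math


def entropy(probs):
    """Average information, in bits, of a probability distribution."""
    return sum(-p * math.log2(p) for p in probs if p > 0)


def markov_information_per_character(transitions, stationary):
    """Average information per character of a Markov chain source.

    transitions[i][j] is the probability that character j follows character i;
    stationary[i] is the long-run probability of character i.
    """
    return sum(stationary[i] * entropy(transitions[i]) for i in range(len(transitions)))


# Assumed example: each character repeats the previous one with probability 0.9.
T = [[0.9, 0.1],
     [0.1, 0.9]]
pi = [0.5, 0.5]   # stationary distribution of this symmetric chain

if __name__ == "__main__":
    print("independent characters:", entropy(pi), "bits per character")
    print("Markov chain source:   ", markov_information_per_character(T, pi), "bits per character")
```

With these assumed numbers the coupled source carries less than half a bit per character, which is the sense in which a reversible coding can represent its strings in a more compact form.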
Information theory suggests that we can define the complexity of a string of
characters by the information content of the string. The information content is the same
as the length of the shortest binary encoding of the string. This is intuitive: since the
original string can be obtained from its shortest representation, the same information
must be present in both. Within standard information theory, the encodings would
be limited to compression using a Markov chain model. However, more generally, we
could use any possible algorithm for encoding (compressing) the string. Questions
about all possible algorithms are precisely the domain of computation theory. The
definition of Kolmogorov (algorithmic) complexity of a string makes use of
computation theory to describe what we mean by “any possible algorithm.” Allowing all
algorithms is the same as allowing more general models for the string than a Markov
chain. Our objective in this section is to develop an understanding of algorithmic
complexity beginning from the theory of computation.
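A practical compressor gives a feel for this idea, with an important caveat: the sketch below uses Python's standard `zlib` module, which is only one particular encoding algorithm. Its output length is therefore at best an upper bound on the length of the shortest encoding, not the algorithmic complexity itself. The example strings are assumptions chosen for illustration.

```python
import random
import zlib


def compressed_bits(data: bytes) -> int:
    """Length in bits of the zlib encoding of `data` (an upper bound only)."""
    return 8 * len(zlib.compress(data, 9))


# A highly regular string: a long run of zeros followed by a long run of ones.
regular = b"0" * 5000 + b"1" * 5000

# A pseudorandom string of the same length (fixed seed for repeatability).
rng = random.Random(0)
random_like = bytes(rng.getrandbits(8) for _ in range(10000))

if __name__ == "__main__":
    for name, data in [("regular", regular), ("random-like", random_like)]:
        print(f"{name}: {8 * len(data)} bits raw, {compressed_bits(data)} bits compressed")
```

The regular string shrinks dramatically while the pseudorandom one does not shrink at all under this compressor. Note that the pseudorandom string is in fact generated by a short program (a seed plus a generator), so its algorithmic complexity is small even though `zlib` cannot find that description. This is one way to see why the definition needs to allow all possible algorithms.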
Computation theory (Section 1.9) describes the operations of logic and
computation on symbols. All the operations are deterministic and are expressible in terms of
a few elementary operations. The concept of universality of computation is based on
the understanding that a particular type of conceptual machine/computer, the
universal Turing machine (UTM), can perform all possible computations if the
instructions are properly encoded as a finite string of characters serving as the UTM
input. Since we have no absolute definition of computation, there is no complete proof.
The existing proof shows that the UTM can perform all computations that can be
done by a much larger class of machines, the Turing machines (TM). Other models
for computation have been shown to be essentially equivalent to these TM. A TM is
defined by a table of elementary operations that act on the input string. The word
“program” can be used either to refer to the TM table or to its input, and so its use is
best avoided in this context.
We would like to define the algorithmic complexity of a string, $s$, as the length of
the shortest possible binary TM input, such that the output is $s$. The relationship of
this to the encoding and decoding of Shannon should be apparent. In order to use this
as a definition, there are several matters that must be cleared up. To summarize: There
are actually two sources of information when we use a TM, the input string and the
table. We need to take both of them into account to define the complexity. There are
many ways to define complexity; however, we can prove that any two definitions of
complexity differ by no more than a constant. We will also show that no matter what
definition we use, most strings cannot be compressed.
In order to motivate the logic of the following discussion, it is helpful to think
about how we might approach compressing various strings of characters. The
shortest compression should then be the complexity of the string. One string might be
formed out of a long substring of zeros followed by a long substring of ones. This is
convenient to write by indicating how many zeros followed by how many ones: $N_0 N_1$.
We would make a binary string notation for $N_0 N_1$ and write a program that would
read this input and then output the original string. Another string might be a
representation of the Fibonacci numbers (1,1,2,3,5,8,…), starting from the $N_0$th number
and ending at the $N_1$th number. We could write this using a similar notation as the
previous one, but the program that we would write to generate the string is quite
different. Both programs would be quite simple. Now imagine that we want to
communicate one of the original strings to someone else. If we want to communicate it in
compressed form, we would have to send the program as well as the input. If there
were many strings, we might be clever and send the programs only once. The
problem is that with only the input string, the recipient would not know which program
to apply to obtain the original string. We need to send an additional piece of
information that indicates which program to apply. The simplest way to do this is to assign
numbers to each of the programs and preface the program input with the program
number. Once we do this, the string that we send uniquely determines the string we
wish to communicate. This is necessary, because if the interpretation of the
transmitted string is not unique, then it would be impossible to guarantee a correct
interpretation. We now develop these thoughts using a more formal notation.
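The scheme just described, a program number followed by the program input, can be sketched as follows. This is a toy illustration rather than a Turing machine construction: the two "programs" are ordinary Python functions, and the field widths (`ID_BITS`, `ARG_BITS`) and the comma-separated output format for the Fibonacci numbers are assumptions introduced only for this example.

```python
def zeros_then_ones(n0, n1):
    """Program 0: a substring of n0 zeros followed by n1 ones."""
    return "0" * n0 + "1" * n1


def fibonacci_range(n0, n1):
    """Program 1: Fibonacci numbers from the n0-th to the n1-th (1-indexed)."""
    a, b = 1, 1
    out = []
    for i in range(1, n1 + 1):
        if i >= n0:
            out.append(str(a))
        a, b = b, a + b
    return ",".join(out)


PROGRAMS = [zeros_then_ones, fibonacci_range]   # the list index is the program number
ID_BITS = 1     # assumed width: enough bits to number the two programs
ARG_BITS = 16   # assumed fixed width for each of N0 and N1


def encode(program_number, n0, n1):
    """Binary message: program number, then N0, then N1."""
    return (format(program_number, f"0{ID_BITS}b")
            + format(n0, f"0{ARG_BITS}b")
            + format(n1, f"0{ARG_BITS}b"))


def decode(message):
    """Recover the original string from the transmitted message."""
    program_number = int(message[:ID_BITS], 2)
    n0 = int(message[ID_BITS:ID_BITS + ARG_BITS], 2)
    n1 = int(message[ID_BITS + ARG_BITS:], 2)
    return PROGRAMS[program_number](n0, n1)


if __name__ == "__main__":
    message = encode(0, 1000, 1000)
    print(len(message), "bits sent for a string of", len(decode(message)), "characters")
    print(decode(encode(1, 1, 6)))
```

Because the program number comes first and every field has a fixed width, each message has exactly one interpretation, which is the uniqueness requirement stated in the text.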
In what follows, the operation of a TM or a UTM will be indicated by functional
notation. The string that results from its application to a tape is indicated by $U(s)$
where $s$ is the nonblank portion of the tape (input string), $U$ is the identifier of the
TM, and the initial position of the TM head is assumed to be at the leftmost nonblank
character.
In order to define the complexity of a string, we identify a particular UTM $U$.
Then the complexity $C_U(s)$ of the string $s$ is defined as the length of the shortest string
$r$ such that $U(r) = s$. We call an input string $r$ to $U$ that generates $s$ a representation of
$s$. Thus the length of the shortest representation is $C_U(s)$. The central theorem of
algorithmic complexity relates the complexity according to one UTM $U$ and another
UTM $U'$. Before we state and prove the theorem, we discuss several incidental
matters.
We first ask whether we need to use a UTM and not just any TM in the
definition. The answer is that the use of a UTM is convenient, and we cannot significantly
improve the ability to compress strings by allowing the larger class of TM to be used
in the definition. Let us say that we have a UTM $U$ and a TM $V$. We define a new
UTM $W$ by:
$$ \begin{aligned} W(0s) &= V(s) \\ W(1s) &= U(s) \end{aligned} \tag{8.2.3} $$
The first character indicates whether to use the TM $V$ or the UTM $U$ on the rest of
the input. Since the complexity according to the UTM $W$ is at most one more than the
complexity according to the TM $V$, $C_W(s) \le C_V(s) + 1$, we see that using the larger class
of TM to define complexities cannot improve our results for any particular string by
more than one bit, which is not significant for long complex strings.
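The one-bit argument can be tried out on toy stand-ins. In the sketch below the "machines" are plain Python functions on bit strings, not Turing machines, and neither is universal: `U` simply copies its input and `V` is a hypothetical special-purpose decoder for runs of a single character. Only the construction of $W$ from a leading selector bit follows Eq. (8.2.3). Complexity is found by brute-force search over short inputs, which is feasible only for these toys.

```python
from itertools import product


def all_bitstrings(max_len):
    """Yield every bit string of length 0..max_len, shortest first."""
    for n in range(max_len + 1):
        for bits in product("01", repeat=n):
            yield "".join(bits)


def complexity(machine, s, max_len=12):
    """Length of the shortest input r with machine(r) == s, or None if not found."""
    for r in all_bitstrings(max_len):
        if machine(r) == s:
            return len(r)
    return None


def U(r):
    """Toy stand-in for the reference machine: outputs its input unchanged."""
    return r


def V(r):
    """Toy special-purpose machine: a 4-bit count followed by one bit to repeat."""
    if len(r) != 5:
        return None
    return r[4] * int(r[:4], 2)


def combine(v, u):
    """Build W with W(0s) = v(s) and W(1s) = u(s), as in Eq. (8.2.3)."""
    def W(r):
        if not r:
            return None
        return v(r[1:]) if r[0] == "0" else u(r[1:])
    return W


W = combine(V, U)

if __name__ == "__main__":
    for s in ["1" * 10, "0110"]:
        print(repr(s), "C_U =", complexity(U, s), "C_V =", complexity(V, s),
              "C_W =", complexity(W, s))
```

For the run of ten ones the special-purpose machine gives a shorter representation, and $W$ pays exactly one extra bit for it. For a string that `V` cannot produce at all, $W$ falls back on `U`, again at a cost of one bit.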
We may be disturbed that the definition of complexity does not indicate that the
complexity of an incompressible string is the same as the string length itself. Indeed
the definition does not require it. However, if we wanted to impose this as an
auxiliary condition, we could define the complexity of a string using a slightly different
construction. Given a UTM $U$, we define a new UTM $V$ such that
$$ \begin{aligned} V(0s') &= s' \\ V(1s') &= U(s') \end{aligned} \tag{8.2.4} $$
The first character indicates whether the string is compressed. We then define the
complexity $C_U(s)$ of any string $s$ as one less than the length of the shortest string $r$ such
that $V(r) = s$. This is not quite a fair definition, because if we wanted to communicate
the string $s$ we would have to indicate all of $r$, including its first bit. This means that
we should define the complexity as the length of $r$, which would be a sacrifice of at
most one bit for incompressible strings. Limiting the complexity of a string to be no
longer than the string itself might seem a natural idea. However, we note that the
Shannon information, Eq. (8.2.1), is related only to the probability of a string, and
may be larger than the original string length for a particular string.
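The closing remark, that the Shannon information of a particular string can exceed its length, is easy to verify with a small numerical sketch. It assumes a hypothetical source that emits independent bits with probability 0.99 of a one; this source is an illustration and does not come from the text.

```python
import math


def string_probability(s, p_one=0.99):
    """Probability of bit string s from a source of independent bits (assumed model)."""
    return math.prod(p_one if c == "1" else 1.0 - p_one for c in s)


def information_bits(p):
    """Shannon information of a string with probability p, Eq. (8.2.1), in bits."""
    return -math.log2(p)


if __name__ == "__main__":
    for s in ["11111111", "00000000"]:
        bits = information_bits(string_probability(s))
        print(f"{s}: length {len(s)} bits, information {bits:.2f} bits")
```

Under this assumed source the typical string of all ones carries far less than 8 bits, while the improbable string of all zeros carries much more than its own length.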
Returning to our basic definition of complexity, we have described the existence
of a shortest possible representation of any string $s$, and a single machine $U$ that can
reconstruct each $s$ from this representation. The key theorem that we need to prove
relates the complexity defined using one UTM $U$ to the complexity defined using
another UTM $U'$. The theorem is: the complexity $C_U$ based on $U$ and the complexity
$C_{U'}$ based on $U'$ satisfy:
$$ C_U(s) \le C_{U'}(s) + C_U(U') \tag{8.2.5} $$
where $C_U(U')$ is independent of the string $s$. The proof of this expression results from
the ability of the UTM $U$ to simulate $U'$. To prove this we must improve slightly our
definition of complexity, or equivalently, we have to limit the UTM that are allowed.
This is discussed in Questions 8.2.1–8.2.3. It is shown there that we can preface binary
strings input to the UTM $U'$ with a prefix that will make them generate the same
output when input to $U$. We might call this prefix $r_{U,U'}$ a translation program; it satisfies
the property that for any string $r$, $U(r_{U,U'} r) = U'(r)$. Let $r_{U'}$ be a minimal
representation for $U'$ of the string $s$. Then $r_{U,U'} r_{U'}$ is a representation for $U$ of the string $s$. The
length of this string must be greater than or equal to the length of the minimum string
$r_U$ necessary to produce the same output:
$$ C_U(s) = |r_U| \le |r_{U,U'} r_{U'}| = |r_{U'}| + |r_{U,U'}| = C_{U'}(s) + C_U(U') \tag{8.2.6} $$
$C_U(U') = |r_{U,U'}|$ is the length of the translation program. We have proven the
inequality in Eq. (8.2.5).
Question 8.2.1 Show that there exists a UTM $U_0$ such that for any TM $U$
that accepts binary input, there is a string $r_U$ so that for all $s$ and $r$
satisfying $s = U(r)$, we have that $s = U_0(r_U r)$.
Hint One way to do this is to use a modified form of the construction
given in Section 1.9. The new construction requires modifying the nature of
the UTM, i.e., a trick.
Solution 8.2.1 We call the UTM described in Section 1.9 $\tilde{U}_0$. We can
simulate the UTM $U$ using $\tilde{U}_0$; however, the form of the input string would not
quite satisfy the conditions of this theorem. $\tilde{U}_0$ has an input that looks like
$r_U r_t(r)$, where the right part is only a function of the input string $r$ and the
left part is only a function of the UTM $U$. However, the tape part of the
representation $r_t(r)$ uses a doubled binary form for characters and markers
between them so that it is not the same as the original tape. We must replace
the tape part of the representation with the original string in order to have
an input string of the form $r_U r$.
Both $\tilde{U}_0$ and $U$ have binary input strings. This means that we might try
to use the tape of $U$ without modification in the tape part of the
representation given in Section 1.9. Then there would be no delimiters between
characters and no doubled binary representation. There is, however, one
difficulty. The UTM $U_0$ must keep track of where the current position of the
UTM $U$ would be during the same calculation. This was accomplished in
Section 1.9 by converting one of the $M_1$ markers to $M_6$ at the current
location of the UTM $U$. There are a number of ways to overcome this problem,
but all require us to introduce something new. We will do this by allowing
the UTM $U_0$ to have a counter that can keep track of the current position of
the UTM $U$. There are two ways to argue this. One is to allow, by
proclamation, a counter that can reach arbitrarily high numbers. The other is to
recognize that the longest string we might conceivably encounter is smaller
than the number of particles in the known universe, or very roughly
$10^{90} \approx 2^{300}$. This means that we can use an internal memory of 300 bits to
represent such a counter. This counter is initialized to 0 and set to the current
location of the UTM $U$ at every step of the calculation. This construction
gives us the desired UTM $U_0$.
Question 8.2.2 Using the result of Question 8.2.1, prove Eq. (8.2.5). See the text for a hint.
Solution 8.2.2 The problem is that Eq. (8.2.5) is not actually correct for all UTM (see Question 8.2.3), so we need to modify our conditions. In a sense, the modification is minor because we only improve the definition slightly. We do this by defining the complexity $C_U(s)$ for an arbitrary UTM as the minimum length of $r$ such that $W(r) = s$, where $W$ is defined by:
$$W(0s) = U_0(s), \qquad W(1s) = U(s) \tag{8.2.7}$$
The first bit specifies whether to use $U$ or the special UTM $U_0$ constructed in Question 8.2.1. $C_U(s)$ defined this way is at most one bit more than our previous definition, for any particular string. It might be significantly smaller. This should not be a problem, because our objective is to find short representations of strings. By using our special UTM $U_0$ in this definition, we guarantee that for any two UTM $U$ and $U'$, whose complexity is defined in terms of $W$ and $W'$ by Eq. (8.2.7), we can write $W(r_{WW'}\, r_{W'}) = W'(r_{W'})$. This is possible because $W$ inherits the properties of $U_0$ when the first character of its input string is 0.
Question 8.2.3 Show that some form of qualification of Eq. (8.2.5) is necessary by demonstrating that there exists a UTM that does not satisfy this inequality. Therefore, Eq. (8.2.5) cannot be extended to all UTM.
Solution 8.2.3 One possibility is to have a UTM that uses only certain characters in its input string. Specifically, define a UTM $U$ that acts the same as a UTM $U'$ but uses only every other character in its input string: $U(r) = U'(r')$ if $r$ is any string whose odd characters are the characters of $r'$. The complexity of a string according to $U$ is twice the complexity according to $U'$, and therefore Eq. (8.2.5) is invalid in this case. With the modified definition of complexity given in Question 8.2.2 this is no longer a problem.
Switching $U$ and $U'$ in Eq. (8.2.5) gives a similar inequality with a constant $C_{U'}(U)$. Defining the larger of the two translation program lengths to be
$$C_{U,U'} = \max\bigl(C_U(U'),\, C_{U'}(U)\bigr) \tag{8.2.8}$$
we have proven that complexities defined by the UTM differ by no more than $C_{U,U'}$:
$$|C_U(s) - C_{U'}(s)| \le C_{U,U'} \tag{8.2.9}$$
Since this constant is independent of the complexity of the string $s$, it becomes insignificant for large enough complexities. Thus, for strings that are complex enough, it doesn't matter which UTM we use to define their complexity. The complexity defined by one UTM is, up to this constant, the same as the complexity defined by another UTM. This consistency (universality) in the complexity of a string is essential in order for it to be well defined. We will use a few examples to illustrate the nature of universality provided by this definition.
The first example illustrates the relationship of algorithmic complexity to string compression. Given a string $s$ we can ask what methods of compression are useful for the string. A useful compression algorithm corresponds to a pattern in the characters of the string. A string might have many repetitive digits, or cyclically repeating digits. Alternatively, it might be a sequence that can be generated using simple mathematical operations such as the Fibonacci series, or the digits of $\pi$. There are many such patterns that are relevant to the compression of strings. We can choose a finite set of $N$ algorithms $\{V_i\}$, where each one is represented by a TM that reconstructs a string $s$ from a shorter string $r$ by taking advantage of properties of the pattern. We now construct a new TM $U$ which is defined by:
$$U(r_i\, r') = V_i(r') \tag{8.2.10}$$
where $r_i$ is a binary representation of the number $i$, having $\log(N)$ bits. This is a UTM if any of the $V_i$ is a UTM or it can be made into a UTM by Eq. (8.2.3). We use $U$ to define the complexity $C_U(s)$ of any string as described above. This complexity includes both the length of $r'$ and the number of bits ($\log(N)$) in $r_i$ that together constitute the length of the input $r$ to $U$. Once it is defined, this complexity is a measure of the complexity of all strings. We do not use different TM to define the complexity of each string; one UTM is used to define the complexity of all strings.
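The construction in Eq. (8.2.10) can be made concrete with a toy sketch. The three decoders below (`v_literal`, `v_repeat_bit`, `v_cycle`) and their input formats are illustrative assumptions, not machines taken from the text; they stand in for the pattern-specific TMs $V_i$. The combined function `U` reads a fixed-width index $r_i$ and hands the rest of the input to the selected decoder.

```python
import math

# Hypothetical pattern decoders V_i (illustrative stand-ins, not from the text).
# Each takes a bit string r' and reconstructs a (usually longer) bit string s.

def v_literal(r):
    """No pattern: the representation is the string itself."""
    return r

def v_repeat_bit(r):
    """r' = one bit followed by a binary count: repeat the bit that many times."""
    return r[0] * int(r[1:], 2)

def v_cycle(r):
    """r' = 4-bit period p, then p pattern bits, then a binary repeat count."""
    p = int(r[:4], 2)
    pattern, count = r[4:4 + p], int(r[4 + p:], 2)
    return pattern * count

DECODERS = [v_literal, v_repeat_bit, v_cycle]
INDEX_BITS = math.ceil(math.log2(len(DECODERS)))  # width of the prefix r_i

def U(r):
    """Combined machine in the spirit of Eq. (8.2.10): U(r_i r') = V_i(r')."""
    i = int(r[:INDEX_BITS], 2)
    return DECODERS[i](r[INDEX_BITS:])

def encode(i, r_prime):
    """Build the input r = r_i r' that selects decoder number i."""
    return format(i, f"0{INDEX_BITS}b") + r_prime

# A string of 1000 ones has a 13-bit representation under this U:
# 2 index bits + 1 bit to repeat + 10 bits for the count.
r = encode(1, "1" + format(1000, "b"))
print(len(r), U(r) == "1" * 1000)
```

The length of `r` counts both the index prefix and the decoder-specific part, which is the point made above: the cost of choosing among the $N$ algorithms is charged to every string.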
Despite the message of the last example, let us assume that we are evaluating the complexity of a particular string $s$. We define a new UTM $U_s$ by:
$$U_s(0s') = s, \qquad U_s(1s') = U(s') \tag{8.2.11}$$
The first character tells $U_s$ if the string is $s$. We can use this new UTM to define the complexity of all strings, and for this definition the complexity of $s$ is one. How does this relate to our theorem about the universality of complexity? The point is that in this case the translation program between $U$ and $U_s$ contains the complete information about $s$ and therefore must be at least as long as $C_U(s)$. What we have done is to take the particular string $s$ and insert it into the table of $U_s$. We see in this example how universality is tied to an assumption that the complexities that are discussed are longer than the TM translation programs or, equivalently, the information in their tables. Conceptually, we would say that universality of complexity is tied to an assumption of lack of specific knowledge on the part of the recipient (represented by the UTM) of the information itself. The choice of a particular UTM might be dictated by an implicit understanding of the set of strings that we would like to represent, even though the complexity of a string is defined without reference to an ensemble of strings. However, this apparent relativism of the complexity is limited by our basic theorem that relates the complexity of distinct UTM, and by additional results about the impossibility of compressing most strings discussed in the following paragraphs.
We have gained an additional result from the construction of a single UTM that generates all strings from their compressed forms. This is that a representation $r$ only represents one string $s$. We can now prove that the probability that a string of length $N$ can be compressed is very small. The proof proceeds from the observation that the number of possible strings decreases very rapidly with decreasing string length. A string $s$ of length $|s| = N$ compressed by $k$ bits is represented by a particular string $r$ of length $|r| = C(s) = N - k$. Since there are only $2^{N-k}$ strings of length $N - k$, at most $2^{N-k}$ of the $2^N$ strings of length $N$ can be compressed by $k$ bits. The fractional compression is $k/N$. For example, among all strings of length $10^6$ bits, at most 1 string in $2^{100} \approx 10^{30}$ can be compressed by 100 bits, or 0.01% of the string length. This is not a very significant compression. Even so, this estimate of the average number of strings that can be compressed is much too large, because strings that are not of length $N$, e.g., strings of length $N - 1$, $N - 2$, …, $N - k$, would also be represented by strings of length $N - k$. Thus most strings are incompressible. Moreover, selecting a string at random will almost certainly yield an incompressible string.
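The counting argument is easy to check numerically. The sketch below is illustrative only; the function names are assumptions introduced here. The first function gives the bound on the fraction of $N$-bit strings compressible by $k$ bits. The second pushes the same counting to its extreme by assigning the $2^N$ strings of length $N$ the shortest representations available (every string of length $0$ to $N-1$, plus one of length $N$) and reporting the resulting average length.

```python
from fractions import Fraction

def max_fraction_compressible(k):
    """Upper bound on the fraction of N-bit strings with a representation
    of length N - k: at most 2**(N-k) representations for 2**N strings."""
    return Fraction(1, 2 ** k)

def best_case_average_length(n):
    """Average representation length if the 2**n strings of length n used
    up every string of length 0..n-1 and then one string of length n."""
    total = sum(l * 2 ** l for l in range(n)) + n
    return Fraction(total, 2 ** n)

# One string in 2**100 (about 1.27e30) at most can be shortened by 100 bits.
print(float(1 / max_fraction_compressible(100)))

# Even in the best case the average stays above n - 2 bits.
for n in (4, 8, 16, 32):
    print(n, float(best_case_average_length(n)))
```

Exact rational arithmetic is used so that the comparison with $N - 2$ is not blurred by floating-point rounding for larger $N$.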
Question 8.2.4 Calculate a strict lower bound for the average complexity of strings of length $N$.
Solution 8.2.4 We assume that strings of length $N$ are compressed so that they are represented by all of the shortest strings. One string is represented by the null string (length 0), two strings are represented by a single bit (length 1), and so on. The relationship:
$$2^N = \sum_{l=0}^{N-1} 2^l + 1 \tag{8.2.12}$$
means that we will fill all of the possible strings up to length $N - 1$ and then have one string left of length $N$. The average representation length for any complexity measure must then satisfy:
(8.2.13)
The sum can be evaluated using a table of sums or:
(8.2.14)
giving:
(8.2.15)
Thus the average complexity of strings of length $N$ cannot be reduced by more than two bits. This strict lower bound applies to all measures of complexity.
We can also interpret this discussion to mean that the best UTMs to use to define complexity are those that are invertible: they have a one-to-one mapping of strings to representations. In this case we have a mapping $r(s)$ which gives the unique representation of a string. The reason that such UTM are better is that there are only a limited number of representations shorter than $N$; if we use up more than one of them for a particular string, then we will have fewer representations to use for others. Such UTM are closely analogous to our understanding of encoding and decoding as described in information theory. The UTM is the decoder and the mapping of the string onto its representation is the encoding.
Because most strings are incompressible, we can also prove that if we have an ensemble of strings defined by the probability $P(s)$, then the average algorithmic complexity of these strings is essentially the same as the Shannon information. In particular, the ensemble of all of the strings of length $N$ has a Shannon information of $N$ bits and an average algorithmic complexity which is the same. The catch is recognizing that to specify $P(s)$ itself requires an algorithm whose complexity must enter into
the discussion. The proof follows from the discussion in Section 1.8. An ensemble defined by a probability $P(s)$ can be encoded in such a way that the average string length is given by the Shannon information. We now realize that to define the string complexity we must include the description of the decoding operation:
$$\sum_s P(s)\, C(s) = \sum_s P(s)\, I_s + C(P) \tag{8.2.16}$$
where the expression $C(P)$ represents the complexity of the decoding operation for the universal computer $U$ for the ensemble given by $P(s)$. $C(P)$ depends in part on the algorithm used to specify the ensemble probability $P(s)$. For the average ensemble complexity to be essentially equal to the average Shannon information, the specification of the ensemble must itself be simple.
For Markov chains a similar result applies: the Shannon information of a string representing a Markov chain is the same as the algorithmic complexity of the same string, as long as the algorithm specifying the Markov chain is simple.
A general consequence of the definition of algorithmic complexity is a limitation on what TM can do. No TM can generate a string more complex than the input string that it is provided with, plus the information in its table; otherwise we would have redefined the complexity of the output string to take this into consideration. This is a key limitation of TM: TM (and computers that are realizations of this model) cannot generate new information. They can only process information they are given. As discussed briefly in Section 1.9.7, this limitation can be overcome by a TM that is given a string of random bits as input. The infinitely complex input means the limitation does not apply. It remains to be demonstrated what tasks such a TM can perform that are not possible for a conventional TM. If such tasks are identified, there will be important implications for computer design. In this context, it may also be suggested that some forms of creativity might be linked to the availability of randomness (see Section 1.9.7). We will return to this issue at the end of the chapter.
While the definition of complexity using UTM is appealing, there is a profound difficulty with this proof. It is nonconstructive. No method is given to determine the complexity of a particular string. Indeed, it can be proven that this is a fundamentally difficult task: the time necessary for a TM to determine $C(s)$ grows exponentially with the length of $s$. At least this is true when there is a bound on the complexity, e.g., by Eq. (8.2.4). Otherwise the complexity is noncomputable. We find the complexity of a string by trying all input strings in the UTM to see which one gives the necessary output. If the complexity is not bounded, then the halting problem implies that we cannot tell if the UTM will halt on a particular input, thus it is noncomputable. If the complexity of the string is bounded, then we only try strings up to this bound, and it is possible to determine if the UTM will halt for members of this bounded set of strings. Nevertheless, trying each string requires a time that grows exponentially with the bound, and therefore is not practical except for a few very simple strings. The process of finding the complexity of a string is akin to a process of trying models for the string. A model is a TM that might, when given the proper input, generate the string. It is possible to try many models. However, to determine the actual compressed string may not be practical in any reasonable time. With any particular set of models, we can, however, find an upper bound on the complexity of a string. One of the possible models is that of a Markov chain as used by Shannon information theory. Algorithmic complexity allows more general TM models. However, by our discussion it is improbable that a randomly chosen string will be compressible by any algorithm.
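The idea of bounding complexity from above with a fixed set of models can be sketched with off-the-shelf compressors. This is only an illustration under stated assumptions: the standard-library compressors `zlib`, `bz2`, and `lzma` are swapped in as stand-ins for TM models, a literal (uncompressed) model is added, and a two-bit selector plays the role of the index $r_i$ in Eq. (8.2.10). The number returned is an upper bound for this model set, not the algorithmic complexity itself, and it ignores the size of the decoders.

```python
import bz2
import lzma
import math
import os
import zlib

# Model set (an assumption for illustration): a literal copy plus three
# standard-library compressors, each acting as one candidate "model".
MODELS = [lambda data: data, zlib.compress, bz2.compress, lzma.compress]
SELECTOR_BITS = math.ceil(math.log2(len(MODELS)))  # bits to name the model

def complexity_upper_bound_bits(data: bytes) -> int:
    """Shortest representation found over the model set, in bits,
    including the selector that says which model was used."""
    return SELECTOR_BITS + min(8 * len(model(data)) for model in MODELS)

patterned = b"01" * 5000          # strongly patterned, 10,000 bytes
random_data = os.urandom(10_000)  # operating-system randomness, 10,000 bytes

print("raw bits:      ", 8 * len(patterned))
print("patterned bound:", complexity_upper_bound_bits(patterned))
print("random bound:   ", complexity_upper_bound_bits(random_data))
```

The patterned input shrinks to a small fraction of its raw length, while the random input is, with overwhelming probability, best represented by the literal model, in line with the statement that a randomly chosen string is unlikely to be compressible.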
In summary, the universality of complexity is a statement that the use of different UTMs in the definition of complexity affects the result by no more than a constant. This constant is the length of the program that translates the input of one UTM to the other. Significantly, the more complex the string is, the more universal is the value of its complexity. This follows because the length of translation programs becomes less and less relevant for longer and longer descriptions/representations. Since we are interested in properties of complex systems whose descriptions are long, we can, with caution, rely on the universality of their complexity. This is not the case with simple systems whose descriptions and therefore complexities are "subjective": they depend on the conventions for description. These conventions, in our mathematical definition, are represented by the choice of UTM used to define complexity. We also showed that most strings are not compressible and that the Shannon information measure is the same as the average algorithmic complexity for all concisely describable ensembles. In what follows, unless otherwise mentioned, we assume a particular definition of complexity $C(s)$ using the UTM $U$.
8.2.2 Mathematical systems: numbers and functions
One of the difficulties in discussing complexity is that many elementary mathematical constructs have unusual properties when considered from the point of view of complexity. Philosophers have been troubled by these points, and they have been extensively debated over the centuries. Most of the problems revolve around various forms of infinity. Unlimited numbers and infinite precision often simplify symbolic mathematical discussions; however, they are not well behaved from the point of view of complexity measures. There appears to be a paradox here that will be clarified when we distinguish between the complexity of a set of numbers and the complexity of an element of the set.
Let us consider the complexity of specifying a single integer. The difficulty with integers is that there are infinitely many of them. Using an information theory point of view, assigning equal probability to all integers would imply that any particular integer would have no probability of occurring. If I ask you to give me a positive integer, from 1 to infinity with equal probability, there is no chance that you will give me an integer below any particular cutoff value, say $N$. This means that you will need arbitrarily many digits to specify the integer, and there is no limit to the information required. Thus the complexity of specifying a single integer is infinite. However, if we allow only integers between 1 and a large positive number, say $N = 10^{90}$, roughly the number of elementary particles in the known universe, the complexity of specifying one of the integers is only $\log(N)$, about 300 bits. The drastic difference between the complexity of specifying an arbitrary integer (infinite) and the complexity of an enormously large number of integers (300 bits) suggests that systems that are easy to define may be highly complex. The whole field of number theory has shown that integers are not as simple as they first appear. The measure of complexity of specifying a single integer may appear to be far from more abstract discussions like those of the halting problem or Gödel's theorem (Section 1.9.5); however, they are related. This is apparent since these theorems do not apply to finite sets.
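The 300-bit figure can be checked directly. The helper below is a small sketch (the function name is an assumption introduced here): it returns the number of bits needed to single out one of `n_values` equally likely alternatives, using exact integer arithmetic rather than a floating-point logarithm.

```python
def bits_to_specify(n_values: int) -> int:
    """Bits needed to select one of n_values equally likely alternatives,
    i.e. ceil(log2(n_values)), computed exactly for arbitrarily large ints."""
    return (n_values - 1).bit_length()

print(bits_to_specify(10 ** 90))    # integers from 1 to 10**90
print(bits_to_specify(10 ** 900))   # the requirement grows without bound
```

For $N = 10^{90}$ this gives 299 bits, consistent with "about 300 bits" above, and the count keeps growing as the cutoff is raised, which is the sense in which an unrestricted integer needs unlimited information.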
In what sense are integers simple? We can consider the length of a UTM input string that can generate all the positive integers. As discussed in the last section, this is similar to the definition of their Kolmogorov or algorithmic complexity. The program would, starting from zero and keeping a list, progressively add one to the preceding integer. The problem is that such a program never halts, and the task is not complete. We can generalize our definition of a Turing machine to allow for this case by saying that, by definition, this simple program is generating all integers. Then the algorithmic complexity of the integers is quite small. Another way to do this is to consider the complexity of recognizing an integer: the recognition complexity. Recognizing an integer is trivial if we are considering only binary strings, because all of them represent integers. The point, however, is that we can expand the space of possible characters to include various symbols: letters, punctuation, mathematical operations, etc. The mathematical operations might act upon integers. We then ask how long is a TM program that can recognize any integer that appears as a combination of such characters. The length of such a program is also small.
We see that we must distinguish between the complexity of elements of a set and the set itself. A program that recognizes integers is concerned with the attributes of the integers required to define them as a set, rather than the specification of a particular integer. The algorithmic complexity of the set of all integers is small even though the information contained in a single integer can be arbitrarily large. This distinction between the information contained in an element of a set and the information necessary to define the set will also be important when we consider the complexity of physical systems.
The complexity of a single real number is also infinite. Specifying an arbitrary real number requires infinitely many digits. However, if we confine ourselves to any reasonable precision, the complexity becomes very manageable. For example, the most accurately known fundamental constant in science is the electron magnetic moment in Bohr magnetons
$$\mu_e / \mu_B = 1.001159652193(10) \tag{8.2.17}$$
where the parenthesis indicates the error estimate, corresponding to 11 accurate decimal digits or 37 binary digits. If we consider $1 - \mu_e/\mu_B$ we immediately lose 3 decimal digits. Thus, similar to integers, the practical complexity of a real number is not very large.
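The conversion between decimal and binary digits quoted above is a one-line calculation, written out here as a worked check; the second line simply applies the same conversion to the 8 decimal digits that remain after the subtraction.

```latex
\begin{align*}
\text{11 decimal digits:}\quad & 11 \log_2 10 \approx 11 \times 3.32 \approx 36.5
  \;\Rightarrow\; 37 \text{ bits} \\
\text{8 decimal digits:}\quad & 8 \log_2 10 \approx 8 \times 3.32 \approx 26.6
  \;\Rightarrow\; 27 \text{ bits}
\end{align*}
```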
The discussion of integers and reals suggests that under practical circumstances a single number is not a highly complex object. Generally, the complexity of a system arises because of the presence of a large number of parameters that must be specified. However, there is only reason to consider them collectively as a system if they are coupled to each other.
The next category of mathematical objects that we consider are functions. To specify a function $f(s)$ we must either describe its operation by a formula or specify its action on each possible argument. We consider Boolean functions (functions with binary output, see Section 1.9.2), $f(s) = \pm 1$, of a binary string, $s = (s_1 s_2 \ldots s_{N_e})$. The number of arguments of the function (input bits) is $N_e$. There are $2^{N_e}$ possible values of the input string. For each of these there are two possible outcomes (output values). All Boolean functions may be specified by listing the binary output for each possible input state. Each possible output is independent. The number of different Boolean functions is the number of possible sets of outputs, which is $2^{2^{N_e}}$. Assuming that all of the possible Boolean functions are equally likely, the complexity of a Boolean function (the amount of information necessary to specify it) is the logarithm of this number, or $C(f) = 2^{N_e}$. The representation of a Boolean function in terms of $C(f)$ binary variables can also be made explicit as a string representing the presence or absence of terms in the disjunctive normal form described in Section 1.9.2.
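The count of Boolean functions can be verified by brute force for small $N_e$. In this sketch (names are assumptions introduced here, and outputs are written as 0/1 rather than $\pm 1$ purely for convenience) each function is represented by its truth table, one output bit per input string.

```python
from itertools import product

def boolean_functions(n_inputs: int):
    """Yield every Boolean function of n_inputs bits as a truth table:
    a tuple holding one output bit for each of the 2**n_inputs inputs."""
    n_rows = 2 ** n_inputs
    yield from product((0, 1), repeat=n_rows)

def complexity_bits(n_inputs: int) -> int:
    """C(f) = log2(number of Boolean functions) = 2**n_inputs."""
    return 2 ** n_inputs

for n in (1, 2, 3):
    count = sum(1 for _ in boolean_functions(n))
    print(n, count, complexity_bits(n))
```

The truth table is itself a string of $2^{N_e}$ bits, so listing it is exactly the explicit representation in $C(f)$ binary variables described above. Enumeration is only feasible for very small $N_e$, since the number of functions grows as $2^{2^{N_e}}$.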
A binary function with $N_a$ outputs is the same as $N_a$ independent Boolean functions. If we assume that all possible combinations of Boolean functions are equally likely, then the total complexity is the sum of the complexity of each, or
$$C(f) = N_a\, 2^{N_e} \tag{8.2.18}$$
The asymmetry between input and output is a fundamental one. It arises because we need to specify for each possible input which of the possible outputs is output. Specifying "which" is a logarithmic operation in the number of possibilities, and therefore the influence of the output space on the complexity is logarithmic compared to the influence of the input. This discussion will be generalized later to consider a physical system that acts in response to its environment. The environment will be specified by a number of binary variables (environmental complexity) $N_e$, and its actions will be specified by a number of binary variables (action complexity) $N_a$.
8.3 Complexity of Physical Systems
In order to apply our understanding of the complexity of mathematical constructs to physical systems, we must develop a fundamental understanding of representations. The complexity of a physical system is to be defined as the length of the shortest string $s$ that can represent its properties: the results of possible measurements/observations. In Section 8.3.1 we discuss the relationship between thermodynamics and information theory. This will enable us to define the complexity of ergodic and nonergodic systems. The resulting information measure is essentially that of Shannon information theory. When we consider algorithmic complexity, we can ask whether this is the smallest amount of information that might be used. This is discussed in Section 8.3.2. Section 8.3.3 introduces the complexity profile, which measures the complexity as a function of the scale of observation. Implications of the time scale of observation, for chaotic dynamics, are discussed in Section 8.3.4. Section 8.3.5 discusses examples and properties of the complexity profile. Sections 8.3.1 through 8.3.5 are based upon descriptive complexity. To better account for the behavior of a system in response to its environment we consider behavioral complexity in Section 8.3.6. This turns out to be closely related to descriptive complexity. Other issues related to the role of the observer are discussed in Section 8.3.7.
8.3.1 Entropy and the complexity of physical systems
The definition of complexity of a system requires us to develop an understanding of the relationship of information to the physical properties of a system. The most direct relationship is the relationship of entropy and information. At the outset, it should be understood that these are very different concepts. Entropy is a specific physical property of systems that are in equilibrium, or are in well-defined ensembles. Information is not a unique physical property. Instead it is related to representations of digits. Information can be a property of a time sequence or any other set of degrees of freedom. For example, the information content of a set of characters written on a piece of paper can be given. The entropy, however, would be largely a property of the paper or the ink. The entropy of paper is difficult to determine precisely, but simpler substances have entropies that have been determined and are tabulated at specific temperatures and pressures. We also know that entropy is conserved in reversible adiabatic processes and increases in irreversible ones.
Despite the significant conceptual difference between information and entropy, the formal definition of information discussed in Section 1.8 appears very similar to the definition of entropy discussed in Section 1.3. Thus, it makes sense that the two are related when we develop an understanding of complexity. It is helpful to review the definitions. The entropy was defined first for the microcanonical ensemble, which specifies the macroscopic energy $U$, number of particles $N$, and volume $V$ of the system. We assume that all states (microstates) of the system with this energy, number of particles and volume are equally likely in the ensemble. The entropy was written as
$$S = k \ln \Omega(U, N, V) \tag{8.3.1}$$
where $\Omega(U, N, V)$ is the number of such states. The coefficient $k$ is defined so that the units of entropy are consistent with units of energy and temperature for the thermodynamic relationship $T = dU/dS$.
Information was defined for a string of characters. Given the probability of the string of characters, the information is defined by Eq. (8.2.1). The logarithm is taken to be base 2 so that the information is measured in units of bits. We see that the information content is related to selecting a single state out of an ensemble of possibilities.
We can relate the two definitions in a mathematically direct but conceptually significant way. If we want to specify a particular microstate of a thermodynamic system, we must select this microstate from the whole ensemble. The probability of this particular state is given in the microcanonical ensemble by $P = 1/\Omega$. If we think about the state of the system as a message containing information, we can use Eq. (8.2.1) to give the amount of information as:
$$I(\{x,p\} \mid (U, N, V)) = S(U, N, V)/(k \ln 2) \tag{8.3.2}$$
This expression should be understood as the amount of information contained in a microstate $\{x,p\}$ when the system is in the macrostate specified by $U$, $N$, $V$; it is also the information necessary to describe precisely the microstate. This is the fundamental relationship we are looking for. We review its meaning in terms of the description of a particular idealized physical system.
If we want to describe the microstate of a system, like a gas of particles in a box, classically we must specify all of the positions and momenta of the particles $\{x_i, p_i\}$. If $N$ is the number of particles, then there are $6N$ coordinates, 3 position and 3 momentum coordinates for each particle. To specify exactly the position of each particle appears to require arbitrary precision in these coordinates. If we had to specify even a single position exactly, it would take an infinite number of binary digits. However, quantum mechanics is inherently granular, thus there is a smallest distance $\Delta x$ within which we do not need to specify one position coordinate of a particle. The particle location is uniquely given once it is within a region $\Delta x$. More correctly, the particle must be located within a region of position and momentum of $\Delta x\, \Delta p = h$, where $h$ is Planck's constant. The granularity defines the precision necessary to specify the positions and momenta, and thus also the amount of information (number of bits) needed in order to describe completely the microstate. The definition of the entropy takes this into account; otherwise the counting of possible microstates of the system would be infinite. The complete calculation of the entropy (which also takes into account the indistinguishability of the particles) is given in Question 1.3.2. We now recognize that the calculation of the entropy is precisely a calculation of the information necessary to describe the microstate.
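Equation (8.3.2) is a unit conversion between entropy and bits, which the following sketch makes explicit. The function names are assumptions introduced here; the Boltzmann constant is the exact SI value, and the entropy of 1 J/K in the example is an arbitrary illustrative input rather than a figure from the text.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K (exact SI value)

def microstate_information_bits(entropy_j_per_k: float) -> float:
    """Eq. (8.3.2): I = S / (k ln 2), the bits needed to name one microstate."""
    return entropy_j_per_k / (K_B * math.log(2))

def entropy_from_state_count(log2_omega: float) -> float:
    """Eq. (8.3.1) with the state count supplied as log2(Omega):
    S = k ln(Omega) = k ln(2) * log2(Omega)."""
    return K_B * math.log(2) * log2_omega

# An entropy of 1 J/K corresponds to roughly 1e23 bits.
print(f"{microstate_information_bits(1.0):.3e}")

# Round trip: a system with 2**1000 equally likely microstates needs 1000 bits.
print(microstate_information_bits(entropy_from_state_count(1000.0)))
```

Passing the state count as a base-2 logarithm avoids overflowing floating-point numbers, since $\Omega$ for any macroscopic system is far too large to represent directly.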
There is another way to think about the relationship of entropy and information. It follows from the recognition that the number of states of a string of $I(\{x,p\} \mid (U, N, V))$ bits is the same as the number of states of the system. If we consider a mapping of system states onto strings, the strings enumerate or label the system states. If there are $I(\{x,p\} \mid (U, N, V))$ bits in each string, then there is a one-to-one mapping of system states onto the strings, and a string uniquely identifies a system state. We say that a string represents a system microstate.
We thus identify the entropy of a physical system as the amount of information necessary to identify a single microstate from a specified macroscopic ensemble. For an ergodic macroscopic system, this definition is a robust one. It does not matter if we consider a typical or an average amount of information. What happens if the system is nonergodic? There are two kinds of nonergodic systems we will discuss: a magnet with a well-defined magnetization below its ordering phase transition (see Section 1.6), and a glass where there are many frozen coordinates describing the local arrangements of atoms (see Section 1.4). Many of these coordinates do not change during the time of a typical experiment. Should we include the information necessary to specify the frozen variables as part of the entropy? We would like to separate the discussion of the frozen variables from the fast ones that are in equilibrium. We use the entropy $S$ to refer to the fast ensemble: the enumeration of the kinetically accessible states of the system. The same function of the frozen variables we will call $C$.
For the magnet, the amount of information contained in frozen variables is small. For the Ising model of a magnet (Section 1.6), below the magnetization transition only a single binary variable is necessary to specify whether the system magnetization is UP or DOWN. We treat the magnet by giving the information about the magnetization explicitly as part of the ensemble description. The amount of information is insignificant compared to the information in the microstate of a system, and therefore is generally ignored.
In contrast, for a glass, the amount of information that is included in the frozen variables is large. How does this information relate to the thermodynamic treatment of the system? The conventional thermodynamic theory of phase transitions does not consider the existence of frozen information. It is designed for systems like the magnet, where this information is insignificant, and thus it does not apply to the glass transition. A different theory is necessary which includes the change from an ergodic to a nonergodic system, or a change from information in fast variables to information in frozen variables. Is there any relationship between the frozen information and the entropy? If they are related at all, there are two intuitive possibilities. One is that we must specify the frozen variables as part of the ensemble, and the amount of information necessary to describe the fast variables is just as large as if there were no frozen variables. The other is that the frozen variables balance against the fast variables so that when there is more frozen information there is less information in the fast variables. In order to determine which is correct, we will need to consider an experiment that measures both. As long as an experiment is being performed in which the frozen variables never change, the amount of information in the frozen variables is fixed. Thermodynamic experiments only depend on entropy differences. We will need to consider an experiment that changes the frozen variables, for example, heating up a glass until it becomes a liquid or cooling it from a liquid to a glass. In such an experiment the frozen information must be accounted for. The difficulty with a glass is that we do not have an independent way to determine the amount of frozen information. Fortunately, there is another system where we do.
There is an intermediate example between a magnet and a glass that is of considerable interest. The structure of ice has a glasslike frozen disorder of its hydrogen atoms below approximately 100 K. The simplest way to think about this disorder is that it arises from a choice of orientations of the water molecule around the position of the oxygen atom. This means that there is a macroscopic amount of information necessary to specify the static structure of ice. The amount of information associated with this disorder can be calculated directly using a model for the structure of ice that takes into account the correlations between molecular orientations that are needed to form a self-consistent hydrogen structure within the oxygen lattice. A first estimate is based on an average of 3/2 orientations per molecule, or C = Nk ln(3/2) = 0.806 cal/(mole K). A review of better calculations is given in a book by Fletcher. The best is C = 0.8145 ± 0.0002 cal/(mole K). The other calculation we need is the amount of entropy in steam. This can be obtained using a slight modification of the ideal gas calculation that takes into account the rotational and internal vibrational motion of the water molecule.
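The first estimate quoted above is simple enough to check by hand; the following sketch does the arithmetic. The gas constant in calories is a standard value supplied here as an assumption (it is not quoted in the text), and the function name is illustrative.

```python
import math

R_CAL = 1.98720  # gas constant N k in cal/(mole K); standard value, assumed here

def frozen_entropy_per_mole(orientations_per_molecule):
    """C = N k ln(W) for one mole, in cal/(mole K)."""
    return R_CAL * math.log(orientations_per_molecule)

c_first_estimate = frozen_entropy_per_mole(1.5)
bits_per_molecule = math.log2(1.5)  # C / (N k ln 2)

print(f"C = {c_first_estimate:.3f} cal/(mole K)")
print(f"frozen information = {bits_per_molecule:.3f} bits per molecule")
```

Expressed as information, the frozen hydrogen disorder amounts to a little over half a bit per water molecule.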
The key experiment is to measure the change in the entropy of the system as a function of temperature as it is heated from ice all the way to steam. We find the entropy using the standard thermodynamic relationship (Section 1.3)
$q = T\,dS$   (8.3.3)
where q is the heat added to the system. Close to a temperature of zero kelvin (T = 0 K) the entropy is zero because all motion stops, and there is only one possible state of the system. Thus we would expect
$S(T) = \int_0^T q/T$   (8.3.4)
that is, the total amount of entropy added to the system as it is heated up should be the same as the entropy of the gas. However, experimentally there is a difference of 0.82 ± 0.05 cal/(mole K) between the two. This is the amount of entropy in the gas that was not added to the system as it was heated. The coincidence of two numbers, the amount of entropy missing and the calculated information in the frozen structure of the hydrogen atoms, suggests that the missing entropy was present in the original state of the ice:
$S(T) = C(T=0) + \int_0^T q/T$   (8.3.5)
This in turn implies that the information in the frozen degrees of freedom was transferred (but conserved) to the fast degrees of freedom. Eq. (8.3.5) is not consistent with the standard thermodynamic relationship in Eq. (8.3.3). Instead it should be modified to read:
$q = T\,d(S + C)$   (8.3.6)
This should be understood as implying that adding heat to a system increases the information either of the fast or of the frozen variables. Adding heat (e.g., to ice) increases the temperature of the system, so that fewer variables are frozen. In this case C decreases and S increases more than would be given by the conventional relationship of Eq. (8.3.3). When heat is not added to a system, we see that there can be processes that change the number of fast degrees of freedom and the number of static degrees of freedom while leaving their sum the same. We will consider this further in later sections.
Eq. (8.3.6) is important enough to present again from a different perspective. The discussion will help demonstrate its validity by using a theoretical argument (Fig. 8.3.1). Rather than considering it from the point of view of heating ice until it becomes steam, we consider what happens either to ice or to a glass when we cool it down through the transition where degrees of freedom become frozen. In a theoretical description we start, above the freezing-in transition, with an ensemble of systems. As we cool the system we remove heat, and this is reflected in a decrease in the number of possible states of the system. We think of this as a shrinking of the number of elements of the ensemble. However, as we go through the freezing-in transition, the ensemble breaks up into disjoint pieces that cannot make transitions to each other. Any particular material must be in one of the disjoint pieces. Thus for a particular material we must track only part of the original ensemble. For an incremental decrease in temperature due to an incremental removal of heat, the information needed to identify (describe) a particular microstate is the sum of the information necessary to describe which of the disjoint parts of the ensemble the system is in, plus the information needed to specify which of the microstates the system is in once its ensemble fragment has been specified. This is the meaning of Eq. (8.3.6). The information to specify the ensemble fragment was transferred from the entropy S to the ensemble information C. The reduction of the entropy, S, is not reflected in the amount of heat that is removed.
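The bookkeeping behind this argument can be illustrated with a toy count. The sketch below assumes, purely for simplicity, that an ensemble of equally likely microstates breaks into equal-sized fragments; the function name and the numbers are hypothetical.

```python
import math

def description_bits(total_states, n_fragments):
    """Split log2(total_states) into 'which fragment' and 'which state inside it'.

    Assumes, for simplicity, that the ensemble breaks into equal-sized fragments.
    """
    states_per_fragment = total_states // n_fragments
    bits_fragment = math.log2(n_fragments)        # plays the role of C / (k ln 2)
    bits_within = math.log2(states_per_fragment)  # plays the role of S / (k ln 2)
    return bits_fragment, bits_within

total = 2**20
c_bits, s_bits = description_bits(total, n_fragments=2**6)
print(c_bits, s_bits, c_bits + s_bits)
```

Breaking the ensemble into more fragments moves bits from the second term to the first while their sum stays fixed, which is the sense in which information is transferred from S to C without any heat being involved.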
We are now in a position to give a first definition of complexity. In order to describe a system and its behavior over time, we must describe the ensemble it is in. This information is given by C/(k ln 2). If we insist on describing the microstate of the system, we must add the information contained in the fast degrees of freedom, S/(k ln 2). The question is whether we should insist on describing the microstate. Typically, the whole point of describing an ensemble is that we don't need to specify the particular microstate. We will return to address this question in greater detail later. However, for now it is reasonable to take describing the system to mean specifying just the ensemble. This implies that the information in the frozen variables, C/(k ln 2), is the complexity. For a thermodynamic system in the microcanonical ensemble, the complexity would be given by the (small) number of bits in the specification of the three variables (U,N,V) and the number of bits necessary to specify the type of element (atom, molecule) that is present. The actual amount of information seems not to be precisely defined. For example, we have not identified the number of bits to be used in specifying (U,N,V). As we have seen in the discussion of algorithmic complexity, this is to be expected, since the conventions of how the information is specified are crucial when there is only a small amount.
Figure 8.3.1 Schematic illustration of the effect on motion in phase space of cooling through a glass transition. Above the glass transition (T1, T2 and T3) the system is ergodic: it explores the entire phase space. Cooling the system causes the phase space to shrink smoothly. The entropy, the logarithm of the volume of phase space, decreases. Below the glass transition, T4, the system is no longer ergodic and the phase space breaks up into pieces. A particular system explores only one of the pieces. The total amount of information necessary to specify a particular microstate (e.g., indicated by the *) is the sum of C/(k ln 2), the information necessary to specify which piece, and S/(k ln 2), the information necessary to specify the particular state within the piece.
We have learned from this discussion that for a nonergodic system, the complexity (the frozen ensemble information) is bounded by the sum of the information in the fast and static degrees of freedom (C + S > C). For material systems, we know in principle how to measure this. As in the case of ice, we heat up the system to the vapor phase where the entropy can be calculated, then subtract the entropy added during the heating process. This gives us the value of C + S at the temperature from which the heating began. If we know that C >> S, then the result is the complexity itself. In order for this technique to work at all, the complexity must be large enough that experimental accuracy can enable its measurement. Estimates we will give later imply that complexities of biological organisms are too small to be measured in this way.
The concept of frozen degrees of freedom immediately raises the question of the time scale on which the experiment is performed. Degrees of freedom that are frozen on one time scale are not frozen on sufficiently longer ones. If our time scale of observation were arbitrarily long, we would always describe systems in equilibrium. The entropy would then be large and the complexity would be negligible. On the other hand, if our time scale of observation were extremely short, so that microscopic motions were detected, then our complexity would be large and the entropy would be negligible. This motivates the introduction of the complexity profile in Section 8.3.3.
Question 8.3.1 Calculate the information necessary to specify the microstate of a mole of an ideal gas at T = 0 °C and P = 1 atm. Use the mass of a helium or neon atom for the mass of the ideal gas particle. This requires a careful investigation of units. A table of fundamental physical constants is given in Table 8.3.1.
Solution 8.3.1 The entropy of an ideal gas is found in Section 1.3 to be:
$S = kN[\ln(V/N\lambda(T)^3) + 5/2]$   (8.3.7)
$\lambda(T) = (h^2/2\pi mkT)^{1/2}$   (8.3.8)
The information content of a microstate is given by Eq. (8.3.2).
Each of the quantities must be evaluated numerically from appropriate tables. A mole of particles is
$N_0 = 6.0221367 \times 10^{23}\ /\mathrm{mole}$   (8.3.9)
At the temperature
$T_0 = 0\ ^\circ\mathrm{C} = 273.15\ \mathrm{K}$   (8.3.10)
$kT_0 = 0.0235384\ \mathrm{eV}$   (8.3.11)
and pressure
$P_0 = 1\ \mathrm{atm} = 1.01325 \times 10^{5}\ \mathrm{Pascal} = 1.01325 \times 10^{5}\ \mathrm{Newton/m^2}$   (8.3.12)
the volume (of a mole of particles) of an ideal gas is:
$V = N_0 kT_0/P_0 = 22.41410 \times 10^{-3}\ \mathrm{m^3/mole}$   (8.3.13)
the volume per particle is:
$V/N = 37219.5\ \text{Å}^3$   (8.3.14)
At the same temperature we have:
$\lambda(T) = (2\pi mkT/h^2)^{-1/2} = m[\mathrm{AMU}]^{-1/2} \times 1.05633\ \text{Å}$   (8.3.15)
This gives the total information for a mole of helium gas at these conditions of
$I = N_0\,(18.5533 + \tfrac{3}{2}\log_2(m[\mathrm{AMU}])) = 1.30 \times 10^{25}\ \mathrm{bits}$   (8.3.16)
Note that the amount of information per particle is only of order 10 bits.
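The unit handling in this solution is easy to get wrong, so a short numerical sketch may help. It evaluates Eqs. (8.3.7) and (8.3.8) in SI units with the constants of Table 8.3.1 and counts information as S/(k ln 2); the function name and default arguments are illustrative choices, not part of the text.

```python
import math

# Constants as listed in Table 8.3.1 (SI units).
K_B = 1.380658e-23   # J/K
H = 6.6260755e-34    # J s
N_A = 6.0221367e23   # 1/mole
AMU = 1.6605402e-27  # kg

def bits_per_mole(mass_amu, temperature=273.15, pressure=1.01325e5):
    """Information (bits) to specify the microstate of one mole of ideal gas."""
    m = mass_amu * AMU
    volume_per_particle = K_B * temperature / pressure                # V/N
    wavelength = H / math.sqrt(2 * math.pi * m * K_B * temperature)   # lambda(T)
    entropy_per_particle = math.log(volume_per_particle / wavelength**3) + 2.5  # S/(kN)
    return N_A * entropy_per_particle / math.log(2)

helium_bits = bits_per_mole(4.0026)
neon_bits = bits_per_mole(20.179)
print(f"helium: {helium_bits:.3e} bits, neon: {neon_bits:.3e} bits")
```

Dividing either result by the number of particles gives a few tens of bits per atom, so the large total comes almost entirely from the number of particles.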
Table 8.3.1 Fundamental constants
hc              = 12398.4 eV Å
k               = 1.380658 × 10^-23 Joule/K
R = kN0         = 8.3145 Joule/K/mole
c               = 2.99792458 × 10^8 meter/second
h               = 6.6260755 × 10^-34 Joule second
e               = 1.60217733 × 10^-19 Coulomb
Proton mass     = 1.6726231 × 10^-27 kilogram
1 AMU           = 1.6605402 × 10^-27 kilogram = 9.31494 × 10^8 eV
M[Helium]       = 4.0026 AMU
M[Neon]         = 20.179 AMU
M[Helium] c^2   = 3.7284 × 10^9 eV
M[Neon] c^2     = 1.87966 × 10^10 eV
8.3.2 Algorithmic complexity of physical systems
The complexity of a system is designed to measure the amount of information necessary to describe it, or its behavior. In this section we address the key word "necessary." This word suggests that we are after the minimum amount of information. The minimum amount of information depends on our capabilities of inference from a smaller amount of information. As discussed in Section 8.2.2, logical inference and computation lead to the definition of algorithmic complexity. However, for an ensemble that can be described simply, the algorithmic complexity is no different from the Shannon information.
Since we have established a connection between the complexity of physical systems and representations in terms of character strings, we can apply these results directly to physical systems. A physical system in equilibrium is represented by an ensemble. At any particular time, it is in a single microstate. The specification of this microstate can be compressed by encoding in certain rare cases. However, on average the compression cannot lead to an amount of information significantly different from the entropy (divided by k ln 2) of the system. This conclusion follows because the microcanonical (or canonical) ensemble can be concisely described. For a nonergodic system like a glass, the microstate description has been separated into two parts. It is no longer true that the ensemble of dynamically accessible states of a particular system is concisely describable. The information in the frozen degrees of freedom is precisely the information necessary to specify the ensemble of dynamically accessible states. The total information, (C + S)/(k ln 2), represents the selection of a microstate from a simple ensemble (microcanonical or canonical). Since the total information cannot be compressed, neither can either of the two parts of the information: the frozen degrees of freedom that we have identified with the complexity, or the additional information necessary to specify a particular microstate. Thus the algorithmic complexity is the same as the information for either part.
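The claim that typical microstates cannot be compressed while rare ordered ones can be illustrated with an ordinary compressor. The sketch below uses Python's zlib as a stand-in; a practical compressor only gives an upper bound on algorithmic complexity, and treating pseudo-random bytes as a "typical microstate" is an assumption made for illustration.

```python
import random
import zlib

def compressed_size(data: bytes) -> int:
    """Length in bytes after zlib compression at the highest level."""
    return len(zlib.compress(data, 9))

n_bytes = 100_000
rng = random.Random(0)  # fixed seed so the run is repeatable

# Stand-in for a typical microstate string drawn from a simple ensemble.
typical = bytes(rng.getrandbits(8) for _ in range(n_bytes))
# Stand-in for a rare, highly ordered microstate (all zeros).
rare = bytes(n_bytes)

typical_ratio = compressed_size(typical) / n_bytes
rare_ratio = compressed_size(rare) / n_bytes
print(f"typical: {typical_ratio:.3f}, ordered: {rare_ratio:.5f}")
```

The ordered string shrinks to a tiny fraction of its length, while the typical string does not shrink at all, in line with the statement that compression helps only in rare cases.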
We can now, finally, explain the experimental observation that an adiabatic process does not change the entropy of a system (Section 1.3). The algorithmic description of an adiabatic process requires only a few pieces of information, e.g., the size of a force applied over a specified distance. If a new microstate of the system can be described by the original microstate plus the process of adiabatic change, then the amount of information in the microstate has not been changed, and the adiabatic process does not change the microstate algorithmic complexity, which is the entropy of the system. Like other aspects of statistical mechanics (Section 1.3), this should not be understood as a proof but rather as an explanation of the relationship of the thermodynamic observation to the microscopic properties. Using this explanation, we can identify the nature of an adiabatic process as one that is described microscopically by a small amount of information.
This becomes clearer when we compare adiabatic and irreversible processes. Our argument that an adiabatic process does not change the entropy is based on considering the information necessary to describe an adiabatic process, such as slowly moving a piston to expand the space available to a gas. An irreversible process could achieve a similar expansion, but would not be thermodynamically the same. Take, for example, the removal of a partition that separates the gas from a second, initially empty, chamber. The irreversible process of expansion of the gas results in a final state which has a higher entropy (see Question 1.3.4). The removal of a partition in itself does not appear to require a lot of information to describe. One moment after the partition is removed, the entropy of the system is the same as before. To understand how the entropy increases, we must consider the nature of irreversible dynamics.
A key ingredient in our understanding of physical systems is that the time evolution of an isolated system can be obtained from the simple laws of mechanics (classical or quantum). This means that the dynamics of an isolated system conserves the amount of information as well as the energy. Such dynamics are called conservative. If we consider an ensemble of systems starting in a particular region of phase space, the phase space position evolves in time, but the volume of the phase space that is occupied (and hence the entropy) does not change. This conservation of phase space can be understood from our discussion of algorithmic complexity: since the deterministic dynamics of a system can be computed, the algorithmic complexity of the system is conserved. Where does the additional entropy come from for the final equilibrium state after the expansion?
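Conservation of phase space volume can be checked directly in the simplest case. The sketch below evolves the corners of a small square of initial conditions under the exact flow of a harmonic oscillator with m = ω = 1 (an assumed toy system, chosen because its flow is a rotation in the (x, p) plane) and compares the enclosed area before and after.

```python
import math

def evolve(x, p, t):
    """Exact harmonic-oscillator flow with m = omega = 1 (a rotation in phase space)."""
    c, s = math.cos(t), math.sin(t)
    return x * c + p * s, -x * s + p * c

def polygon_area(points):
    """Shoelace formula for the area enclosed by an ordered list of (x, p) points."""
    area = 0.0
    for (x1, p1), (x2, p2) in zip(points, points[1:] + points[:1]):
        area += x1 * p2 - x2 * p1
    return abs(area) / 2.0

# Corners of a small square of initial conditions: the boundary of an ensemble.
square = [(1.0, 0.0), (1.2, 0.0), (1.2, 0.2), (1.0, 0.2)]
moved = [evolve(x, p, t=0.7) for x, p in square]

area_before = polygon_area(square)
area_after = polygon_area(moved)
print(area_before, area_after)
```

The region moves and turns but its area is unchanged, so the number of cells of size h it covers, and with it the information needed to pick out one microstate, stays the same.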
There are two parts to the process of proceeding to a true equilibrium state. In the first part the distinction between the nonequilibrium and equilibrium state is obscured. At first there is macroscopically observable information: the particles are in one half of the chamber. This information is converted to microscopic correlations between atomic positions and momenta. The conversion occurs when the gas expands to fill the chamber, and the various currents that follow this expansion become smaller and smaller in extent. The microscopic correlations cannot be observed on a macroscopic scale, and for standard observations the system is indistinguishable from an equilibrium state. The transfer of information from macroscopic to microscopic scale is related to issues of chaos in the dynamics of physical systems, which will be discussed later.
The second part of the process is an actual increase in the entropy of the system. The additional entropy must come from outside the system. In macroscopic physical processes, we are not generally concerned with isolating the system from information transfer, only with isolating the system from energy transfer. Thus we can surmise that the expansion of the gas is followed by an information transfer that enables the entropy to increase to its equilibrium value without changing the energy of the system. Many of the issues related to describing this nonequilibrium process will not be addressed here. We will, however, begin to address the topic of the scale of observation at which correlations appear using the complexity profile in the following section.
8.3.3 Complexity profile
General approach In this section we discuss the relationship of microscopic and macroscopic complexity. Our objective is to develop a consistent language for discussing complexity as a function of length scale. In the following section we will discuss the complexity as a function of time scale, which generalizes the discussion of frozen and fast degrees of freedom in Section 8.3.1.
When we describe a system, we are not generally interested in a microscopic description of the positions and velocities of all of the particles. For a thermodynamic system there are only a few macroscopic parameters that we use to describe the system. This is indeed the reason we use entropy as a summary of the many hidden parameters of the system that we are not interested in. The microscopic parameters change too fast and over too small distances to matter for our macroscopic measurements and experience. The same is true more generally about systems that are not in equilibrium: a macroscopic description does not require specifying the position of each atom. This implies that we must develop an understanding of complexity that is not tied to the microscopic description, but is relevant to observations at a particular length and time scale.
This point lies at the root of a conceptual problem in thinking about the complexity of systems. A gas in equilibrium has a large entropy, which is its microscopic complexity. This is counter to our understanding of complex systems. Systems in equilibrium are intuitively simpler than nonequilibrium systems such as a human being. In Section 8.3.1 we started to address this problem by identifying the complexity of a nonergodic system as the information necessary to specify the frozen degrees of freedom. We now discuss a more systematic approach to dealing with macroscopic observations.
In order to consider the macroscopic complexity, we have to define what we mean by macroscopic in a formal sense. The concept of macroscopic must be understood in relation to a particular observer. While we often consider experimental results to be independent of the observer, there are various ways in which the observer is essential to the observation. In this context, in which we are concerned with the meaning of macroscopic, considering the observer is essential.
How do we characterize the difference between a microscopic and a macroscopic observer? The most crucial difference is that a microscopic observer is able to distinguish between all inherently distinguishable states of the system, while a macroscopic observer cannot. For a macroscopic observer, many microscopically distinct states appear the same. This is related to our understanding of complexity, because the macroscopic observer need only specify which of the macroscopically distinct states the system is in. The microscopic observer must specify which of the microscopically distinct states the system is in. Thus the macroscopic complexity must always be smaller than the microscopic complexity of a system. Instead of considering a unique macroscopic observer, we will consider a sequence of observers with a progressively poorer ability to distinguish microstates. Using these observers, we will define the complexity profile.
Ideal gas These ideas can be directly applied to the ideal gas. We generally think about a macroscopic observer as having an inability to distinguish fine-scale distances. Thus we expect that the usual uncertainty in particle position ∆x will increase for a macroscopic observer. However, we learn from quantum mechanics that a unique microstate of the system is defined using an uncertainty in both position and momentum, ∆x∆p = h. Thus for the macroscopic observer to confuse distinct microstates, the product ∆x∆p must be larger than its minimum value: an observation of the system provides measurements of the position and momentum of each particle, whose uncertainty has a product greater than h. We can label our observers by this uncertainty, which we call $\tilde{h}$.
If we retr ace our steps to the calculation of the entr opy of an ideal gas
(Question 1.3.2), we can recognize that essentially the same calculation applies to the
complexity with the uncer tainty
˜
h. An obser ver with the uncer tainty
˜
h will determine
the complexity of the ideal gas according to Eq.(8.3.7) and Eq.(8.3.8), with h replaced
by
˜
h. Thus we define the complexity profile for the ideal gas in equilibrium as:
(8.3.17)
This equation describes a complexity that decreases as the ability of the observer t o
distinguish states decreases. This is as we expected. Despite the weak logarithmic de-
pendence on
˜
h , C(
˜
h) decreases rapidly because the coefficient of the logarithm is so
large. By the time
˜
h is about 100 times h the complexity profile has become negative
for the ideal gases descr ibed in Question 8.3.1.
What does a negative complexity mean? It actually means that we have not done
the calculation quite right. The counting of states we did for the ideal gas assumed that
the par ticles were well separat ed fr om each other. If they b egin to overlap then we
must count the possible states differently. This over lap is significant precisely when
Eq.(8.3.17) becomes negative. If the particles really overlapped then quantum statis-
tics b ecomes imp or tant; the gas is said to be degenerate and satisfies either Fer mi-
Dir ac or Bose-Einst ein statistics. In our case the overlap arises only because the o b-
server cannot distinguish differ ent par ticle positions. In this case, the counting of
states is appropr iate to a classical ideal gas, as we now explain.
To calculate the complexity as a function of
˜
h for an equilibrium state whose en-
t ropy is S, we start by calculating the number of microstates that the observer cannot
distinguish. The logarithm of this number of microstates, which we call S(
˜
h)/k ln(2),
is the amount of infor mation necessary to specify a microstate, if the macrostate is
known. Thus we have that:
(8.3.18)
To count the number of microstates that the observer cannot distinguish,we note that
the possible microstates of a par ticular par ticle are grouped together by the obser ver
into bins (r egions or cells o f position and momentum) of size ( ∆x∆p)
d
·
˜
h
d
, where
d · 3 is the dimensionality of space. The obser ver deter mines only that a particle is
within a cer tain region. In the classical ideal gas each par ticle moves ind ependently,
so more than one particle may occupy the same microstate. However, this is unlikely.
As
˜
h increases it becomes increasingly likely that there is more than one par ticle in a
region. If the number of part icles in a cer tain region is n
i
, then the number of distinct
microstates of the bin that the obser ver does not distinguish is:
(8.3.19)
wher e g · (
˜
h /h)
d
is the number of microstates within a r egion. This is the product of
the number of states each particle may be in, cor rected for particle indistinguishabil-
it y. The number of microstates of the whole system that appear to the observer to be
the same is the product of such terms for each region:

g
n
i
n
i
!

C(
˜
h ) ·S −S(
˜
h )

˜
h >h

C(
˜
h ) ·S −3kN ln(
˜
h / h)
C o m p l e x i t y o f p h y s i c a l s y s t e m s 727
# 29412 Cust: AddisonWesley Au: Bar-Yam Pg. No. 727
Title: Dynamics Complex Systems
Shor t / Normal / Long
Bar-YamChap8.pdf 3/10/02 10:52 AM Page 727
(8.3.20)
From this we can deter mine the complexity of the state deter mined by the obser ver
as:
(8.3.21)
If we consider this expression when g · 1—a microscopic obser ver—then n
i
is almost
always either zero or one and each term in the product is one (a more exact treatment
requires treating the statistics of a degenerate gas). Then C (
˜
h) is S, which means that
the microstate complexity is just the entr opy. For g > 1 but not t oo large, n
i
will still
be either zer o or one, and we r ecover Eq. (8.3.17). On the other hand, using this ex-
pression it is possible to show that for a large value of g, when the values of n
i
are sig-
nificantly larger than one, the complexity goes to zero.
We can understand this by recognizing that as $g$ increases, the number of particles in each bin increases and becomes closer to the average number of particles in a bin according to the macroscopic probability distribution. This is the equilibrium macrostate. By our conventions we are measuring the amount of information necessary for the observer to specify its observation in relation to the equilibrium state. Therefore, when the average number of particles in a bin becomes close enough to this distribution, there is no information that must be given. To write this explicitly, when $n_i$ is much larger than one we apply Stirling's approximation to the factorial in Eq. (8.3.21) to obtain:
(8.3.22)
where $P_i = n_i/g$ is the probability a particle is in a particular state according to the observer. It is shown in Question 8.3.2 that $C(\tilde{h})$ is zero when $P_i$ is the equilibrium probability for finding a particle in region $i$ (note that $i$ stands for both position and momentum $(x, p)$).
There are additional smaller terms in Stirling's approximation to the factorial that we have neglected. These terms are generally ignored in calculations of the entropy because they are not proportional to the number of particles. They are, however, relevant to calculations of the complexity:
(8.3.23)
The additional terms are related to fluctuations in the density. This will become apparent when we analyze nonuniform systems below.
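To see how large the neglected terms are, the sketch below compares $\ln n!$ with the leading Stirling form $n \ln n - n$ and with the form that adds the next term, $\frac{1}{2}\ln(2\pi n)$. This is a standard numerical check rather than anything specific to the text; the function names are illustrative.

```python
import math

def ln_factorial_exact(n):
    # ln(n!) via the log-gamma function
    return math.lgamma(n + 1)

def stirling_leading(n):
    # n ln n - n: the terms proportional to the number of particles
    return n * math.log(n) - n

def stirling_corrected(n):
    # adds the next term of the expansion, (1/2) ln(2 pi n)
    return stirling_leading(n) + 0.5 * math.log(2 * math.pi * n)

for n in (10, 100, 10_000):
    exact = ln_factorial_exact(n)
    print(n, exact - stirling_leading(n), exact - stirling_corrected(n))
```

The gap left by the leading form grows only logarithmically with $n$, which is why it is negligible next to the entropy but not next to a complexity that is itself logarithmic.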
We will discuss additional examples of the complexity profile below. First we simplify the complexity profile for observers that measure only the positions and not the momenta of particles.

$$\prod_i \frac{g^{n_i}}{n_i!}$$
Question 8.3.2 Show that Eq. (8.3.22) is zero when $P_i$ is the equilibrium probability of locating a particle in a particular state identified by momentum $p$ and position $x$. For simplicity assume that all $g$ states in the cell have essentially the same position and momentum.
Solution 8.3.2 We calculate an expression for $P_i \to P(x,p)$ using the Boltzmann probability for a single particle (since all are independent):
$$P(x,p) = N Z^{-1} e^{-p^2/2mkT} \tag{8.3.24}$$
where $Z$ is the one-particle partition function given by:
(8.3.25)
We evaluate the expression:
(8.3.26)
which, by Eq. (8.3.22), we want to show is the same as the entropy. Since all $g$ states in cell $i$ have essentially the same position and momentum, this is equal to:
(8.3.27)
which is most readily evaluated by recognizing it as:
(8.3.28)
which is $S$ as given in Eq. (8.3.7).
Position without momentum The use of the scale parameter $\Delta x \Delta p$ in the above discussion should trouble us, because we do not generally consider the momentum uncertainty on the macroscopic scale. The resolution of this problem arises because we have assumed that the system has a known energy or temperature. If we know the temperature then we know the thermal velocity or momentum:
$$\Delta p \approx \sqrt{mkT} \tag{8.3.29}$$
It does not make sense to have a momentum uncertainty of a particle that is much greater than this. Using $\Delta x \Delta p = h$ this means there is also a natural uncertainty in position, which is the thermal wavelength given by Eq. (8.3.8). This is the maximal quantum position uncertainty, unless the observer can distinguish the thermal motion of individual particles. We can now think about a sequence of observers who do not distinguish the momentum of particles (they have a larger uncertainty than the thermal momentum) but have increasing uncertainty in position given by $L = \Delta x$, or $g = (L/\lambda)^d$. For such observers the equilibrium momentum probability distribution is to be assumed. In this case the number of particles in a cell $n_i$ contributes a term to the entropy that is equal to the entropy of a gas with this many particles in the volume $L^d$. This gives a total entropy of:
(8.3.30)
and the complexity is:
(8.3.31)
which differs in form from Eq. (8.3.22) only in the constant.
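To get a feeling for the scales involved in the position-only observer, the sketch below evaluates the thermal momentum $\Delta p \approx \sqrt{mkT}$, the corresponding position uncertainty $\Delta x = h/\Delta p$, and the number of states per cell $g = (L/\Delta x)^d$. It is only an illustration: the helium-4 mass, the temperature of 300 K and the cell size of one micron are assumed example values, and $\Delta x = h/\Delta p$ is used here as a stand-in for the thermal wavelength of Eq. (8.3.8), which may differ from it by a numerical factor.

```python
import math

H_PLANCK = 6.62607015e-34   # J s (exact SI value)
K_BOLTZMANN = 1.380649e-23  # J/K (exact SI value)

def thermal_momentum(mass_kg, temperature_k):
    # Delta p ~ sqrt(m k T), as in Eq. (8.3.29)
    return math.sqrt(mass_kg * K_BOLTZMANN * temperature_k)

def position_uncertainty(mass_kg, temperature_k):
    # Delta x from Delta x * Delta p = h
    return H_PLANCK / thermal_momentum(mass_kg, temperature_k)

def states_per_cell(length_m, mass_kg, temperature_k, d=3):
    # g = (L / Delta x)**d for an observer with spatial resolution L
    return (length_m / position_uncertainty(mass_kg, temperature_k)) ** d

M_HELIUM4 = 6.6464731e-27  # kg, illustrative choice of particle
dx = position_uncertainty(M_HELIUM4, 300.0)
g = states_per_cell(1e-6, M_HELIUM4, 300.0)
print(dx, g)
```

For these assumed values the position uncertainty is of order one angstrom, so even a micron-scale observer already lumps together an enormous number of single-particle states.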
While we generally do not think about measuring momentum, we do measure velocity. This follows from the content of the previous paragraph. We consider observers that measure particle positions at different times and from this they may infer the velocity and indirectly the momentum. Since the observer measures $n_i$, the determination of velocity depends on the observer's ability to distinguish moving spatial density variations. Thus we consider the measurement of $n(x,t)$, where $x$ has macroscopic meaning as a granular coordinate that has discrete values separated by $L$. We emphasize, however, that this description of a space- and time-dependent density assumes that the local momentum distribution of the system is consistent with an equilibrium ensemble. The more fundamental description is given by the distribution of particle positions and momenta, $n_i = n(x,p)$. Thus, for example, we can also describe a rotating disk that has no macroscopic changes in density over time, but the rotation is still macroscopic. We can also describe fluid flow in an incompressible fluid. In this section we continue to restrict ourselves to the description of observations at a particular time. The time dependence of observations will be considered in Section 8.3.5.
Thus far we have considered systems that are in generic states selected from the equilibrium ensemble. Equilibrium systems are uniform on all but very microscopic scales, unless we are exactly at a phase transition. Thus, most of the complexity disappears on a scale that is far smaller than typical macroscopic observations. This is not necessarily true about nonequilibrium systems. Systems that are in states that are far from equilibrium can have nonuniform densities of particles. A macroscopic observer will see these macroscopic variations. We will consider a couple of different examples of nonequilibrium states to illustrate some properties of the complexity profile. Before we do this we need to consider the effect of algorithmic compression on the complexity profile.
$$S(L) = k \sum_i n_i \left( \ln\!\left(\frac{L^d}{n_i \lambda^3}\right) + \frac{5}{2} \right) \tag{8.3.30}$$
$$C(L) = S - k \sum_i n_i \left( \ln(g/n_i) + \frac{5}{2} \right) \tag{8.3.31}$$
Algorithmic complexity and error To discuss macroscopic complexity more completely, we turn to algorithmic complexity as a function of scale. The complexity of a system, particularly a nonequilibrium system, should be defined in terms of the algorithmic complexity of its description. This means that patterns that are present in the positions (or momenta) of its particles can be used to simplify the description.
Using this discussion we can reformulate our understanding of the complexity profile. We defined the profile using observers with progressively poorer ability to distinguish microstates. The fraction of the ensemble occupied by these states defined the complexity. Using an algorithmic perspective we say, equivalently, that the observer cannot distinguish the true state from a state that has a smaller algorithmic complexity. An observer with a value of $g = 2$ cannot distinguish which of two states each particle occupies in the real microstate. Let us label the single-particle states using an index that enumerates them. We can then imagine a checkerboard (in six dimensions of position and momentum) where odd-indexed states are black and even ones are white. The observer cannot tell if a particle is in a black or a white state. Thus, no matter what the real state is, there is a simpler state where only odd (or only even) indexed states of the particles are occupied, which cannot be distinguished from the real system by the observer. The algorithmic complexity of this state with particles in odd-indexed states is essentially the complexity that we determined above, $C(g = 2)$: it is the information necessary to specify this state out of all the states that have particles only in odd-indexed states. Thus, in every case, we can specify the complexity of the system for the observer as the complexity of the simplest state that is consistent with the observations. By Occam's razor, this is the state that the observer will use to describe the system.
We note that this is also equivalent to defining the complexity profile as the length of the description as the error allowed in the description increases. The total error as a function of $g$ for the ideal gas is
$$\frac{1}{2}\log\Big(\prod \Delta x_i \Delta p_i / h\Big) = \frac{1}{2} N \log(g) \tag{8.3.32}$$
where $N$ is the number of particles in the system. The factor of 1/2 arises because the average error is half of the maximum error that could occur. This approach is helpful since it suggests how to generalize the complexity profile for systems that have different types of particles. We can define the complexity profile as a function of the number of errors that are made. This is better than using a particular length scale, which implies a different error for particles of different mass as indicated by Eq. (8.3.8). For conceptual simplicity, we will continue to write the complexity profile as a function of $g$ or of length scale.
Nonequilibrium states Our next objective is to consider nonequilibrium states. When we have a nonequilibrium state, the microstate of the system is simpler than an equilibrium state to begin with. As we mentioned at the end of Section 8.3.2, there are nonequilibrium states that cannot be distinguished from equilibrium states on a macroscopic scale. These nonequilibrium states have microscopic correlations. Thus, the microscopic complexity is lower than the equilibrium entropy, while the macroscopic complexity is the same as in equilibrium:
$$C(g) < C_0(g) = S_0, \qquad g = 1$$
$$C(g) = C_0(g), \qquad g \gg 1 \tag{8.3.33}$$
where we use the subscript 0 to indicate quantities of the equilibrium state. We illustrate this by an example. Using the indexing of single-particle states we just introduced, we take a microstate where all particles are in odd-indexed states. The microstate complexity is the same as that of an equilibrium state at $g = 2$, which is less than the entropy of the equilibrium system:
$$C(g = 1) = C_0(g = 2) < C_0(g = 1)$$
However, the complexity of this system for scales of observation $g \ge 2$ is the same as that of an equilibrium system; macroscopic observers do not distinguish them.
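The odd-indexed example can be made concrete with a toy count. The sketch below is an illustration under stated assumptions, not the text's calculation: it treats a dilute system of `N` particles in `M` single-particle states with at most one particle per state, and measures complexity as the number of bits needed to say which states are occupied. The sizes `M` and `N` and the function name `log2_states` are hypothetical.

```python
import math

def log2_states(num_states, num_particles):
    """Bits needed to pick which single-particle states are occupied
    (at most one particle per state, i.e. the dilute limit)."""
    return math.log2(math.comb(num_states, num_particles))

M, N = 10**6, 100                # illustrative sizes, M >> N

c0_g1 = log2_states(M, N)        # equilibrium state, microscopic observer
c_odd = log2_states(M // 2, N)   # particles restricted to odd-indexed states
c0_g2 = log2_states(M // 2, N)   # equilibrium state seen with g = 2 (M/2 bins)

print(c0_g1 - c_odd)             # close to N log2(2) = N bits
```

In this toy count the restricted state needs about one bit per particle less than the equilibrium microstate, matching the reduction $N \log(g)$ at $g = 2$, while an observer at $g = 2$ assigns the two systems the same description length.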
This scenario, where the complexity of a nonequilibrium state starts smaller but then quickly becomes equal to the equilibrium state complexity, does not always hold. It is true that the microscopic complexity must be less than or equal to the entropy of an equilibrium system, and that all systems have the same complexity when $L$ is the size of the system. However, what we will show is that the complexity of a nonequilibrium system can be higher than that of the equilibrium system at large scales that are smaller than the size of the system. This is apparent in the case, for example, of a nonuniform density at large scales.
To illustrate what happens for such a nonequilibrium state, we consider a system that has nonuniformity that is characteristic of a particular length scale $L_0$, which is significantly larger than the microscopic scale but smaller than the size of the system. This means that $n_i$ is smooth on finer scales, and there is no particular relationship between what is going on in one region of length scale $L_0$ and another. The values of $n_i$ will be taken from a Gaussian distribution around the equilibrium value $n_0$ with a standard deviation of $\sigma$. We assume that $\sigma$ is larger than the natural density fluctuations, which have a standard deviation of $\sigma_0 = \sqrt{n_0}$. For convenience we also assume that $\sigma$ is much smaller than $n_0$.
We can calculate both the complexity $C(L)$, and the apparent entropy $S(L)$ for this system. We start by calculating them at the scale $L_0$. $C(L_0)$ is the amount of information necessary to specify the density values. This is the product of the number of cells $V/L_0^d$ times the information in a number selected from a Gaussian distribution of width $\sigma$. From Question 8.3.3 this is:
$$C(L_0) = k \frac{V}{L_0^d} \left( \frac{1}{2}\big(1 + \ln(2\pi)\big) + \ln\sigma \right) \tag{8.3.34}$$
The number of microstates consistent with this macrostate at $L_0$ is given by the sum of ideal gas entropies in each region:
$$S(L_0) = -k \sum_i n_i \ln(n_i/g) + (5/2) k N \tag{8.3.35}$$
Since $\sigma$ is less than $n_0$, this can be evaluated by expanding to second order in $\delta n_i = n_i - n_0$:
$$S(L_0) = S_0 - k \sum_i \frac{(\delta n_i)^2}{2 n_0} = S_0 - \frac{k V \sigma^2}{2 L_0^d n_0} \tag{8.3.36}$$
where $S_0$ is the entropy of the equilibrium system, and we used $\langle \delta n_i^2 \rangle = \sigma^2$. We note that when $\sigma = \sigma_0$ the logarithmic terms in the complexity reduce to the extra terms found in Eq. (8.3.23). Thus, these terms are the information needed to describe the equilibrium fluctuations in the density.
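The second-order expansion of the entropy can be checked numerically. The sketch below draws Gaussian deviations $\delta n_i$ around $n_0$, fixes the total particle number by removing their mean, and compares the exact change in $-\sum_i n_i \ln(n_i/g)$ with $-\sum_i (\delta n_i)^2 / 2 n_0$, in units of $k$. The values of `n0`, `sigma` and `cells` are arbitrary illustrative choices satisfying $\sqrt{n_0} < \sigma \ll n_0$, and occupations are treated as real numbers for simplicity.

```python
import math
import random

random.seed(0)

n0 = 10_000          # equilibrium occupation per cell
sigma = 300.0        # sqrt(n0) = 100 < sigma << n0
cells = 2_000

delta = [random.gauss(0.0, sigma) for _ in range(cells)]
mean = sum(delta) / cells
delta = [d - mean for d in delta]     # enforce sum(delta) = 0 (fixed N)

# Exact change in -sum n ln(n/g); g drops out because N is unchanged.
exact = -sum((n0 + d) * math.log((n0 + d) / n0) for d in delta)
second_order = -sum(d * d for d in delta) / (2 * n0)

print(exact, second_order)
```

The first-order term vanishes because the deviations sum to zero, and the remaining discrepancy comes from third- and higher-order terms, which are small when $\sigma \ll n_0$.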
We can understand the behavior of the complexity profile of this system. By construction, the minimum amount of information needed to specify the microstate is $C(\lambda) = S(L_0) + C(L_0)$. This is the sum over the entropy of equilibrium gases with densities $n_i$ in volumes $L_0^d$, plus $C(L_0)$. Since $S(L_0)$ is linear in the number of particles, while $C(L_0)$ is logarithmic in $\sigma$ and therefore logarithmic in the number of particles, we conclude that $C(L_0)$ is much smaller than $S(L_0)$. For $L > \lambda$ the complexity profile $C(L)$ decreases like that of an equilibrium ideal gas. The term $S(L_0)$ is eliminated at a microscopic length scale larger than $\lambda$ but much smaller than $L_0$. However, $C(L_0)$ remains. Due to this term the complexity crosses that of an equilibrium gas to become larger. For length scales up to $L_0$ the complexity is essentially constant and equal to Eq. (8.3.34). Above $L_0$ it decreases to zero as $L$ continues to increase by virtue of the effect of combining the different $n_i$ into fewer regions. Combining the regions results in a Gaussian distribution with a standard deviation that decreases as the square root of the number of terms, $\sigma \to \sigma (L_0/L)^{d/2}$. Thus, the complexity and entropy profiles for $L > L_0$ are:
$$C(L) = k \frac{V}{L^d}\left(\frac{1}{2}\big(1+\ln(2\pi)\big) + \ln\!\Big(\sigma \big(L_0/L\big)^{d/2}\Big)\right)$$
$$S(L) = S_0 - \frac{k V \sigma^2}{2 (L L_0)^{d/2} n_0} \tag{8.3.37}$$
This expression continues to be valid until there is only one region left, and the complexity goes to zero. The precise way the complexity goes to zero is not described by Eq. (8.3.37), since the Gaussian distribution does not apply in this limit.
There are several comments that we can make that are relevant to understanding complexity profiles in general. First we see that in order for the macroscopic complexity to be higher than that in equilibrium, the entropy at the same scale must be reduced, $S(L) < S_0$. This is necessary because the sum $S(L) + C(L)$, the total information necessary to specify a microstate, cannot be greater than $S_0$. However, we also note that the reduction in $S(L)$ is much larger than the increase in $C(L)$. The ratio between the two is given by:
$$\frac{\delta S(L)}{\delta C(L)} = -\frac{\sigma^2}{2 n_0}\,\frac{L^{d/2}}{L_0^{d/2}}\,\frac{1}{\ln(\sigma/\sigma_0)} \tag{8.3.38}$$
For $\sigma > \sigma_0 = \sqrt{n_0}$ this is greater than one in magnitude. We can understand this result in two ways. First, a complex macroscopic system must be far from equilibrium, and therefore must have a much smaller entropy than an equilibrium system. Second, a macroscopic observer makes many errors in determining the microstate, and therefore if the microstate is similar to an equilibrium state, the observer cannot distinguish the two and the macroscopic properties must also be similar to an equilibrium state. For every bit of information that distinguishes the macrostate, there must be many bits of difference in the microstate.
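The claim that the entropy reduction always outweighs the complexity gain can be checked directly from the ratio in Eq. (8.3.38). The sketch below assumes that ratio has magnitude $(\sigma^2/2n_0)(L/L_0)^{d/2}/\ln(\sigma/\sigma_0)$ with $\sigma_0^2 = n_0$, so that at $L = L_0$ it depends only on $x = \sigma/\sigma_0$ through $x^2 / (2 \ln x)$. The function name and the scanned range of $x$ are illustrative.

```python
import math

def ratio_magnitude(x, scale_ratio=1.0, d=3):
    """Magnitude of the ratio in Eq. (8.3.38) with x = sigma/sigma_0 and
    scale_ratio = L/L_0. Uses sigma_0**2 = n_0, so sigma**2 / (2 n_0) = x**2 / 2."""
    return (x * x / 2.0) * scale_ratio ** (d / 2.0) / math.log(x)

xs = [1.0 + 0.001 * i for i in range(1, 20_000)]   # x from 1.001 to 20.999
smallest = min(ratio_magnitude(x) for x in xs)
print(smallest, math.e)
```

The minimum of $x^2/(2\ln x)$ over $x > 1$ occurs at $x = \sqrt{e}$, where it equals $e$, so under these assumptions the ratio never drops below about 2.7 and grows further for $L > L_0$.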
In calculating the complexity of the system at a particular scale, we assumed that the observer was in error in obtaining the position and momentum of each particle. However, we assumed that the number of particles within each bin was determined exactly. Thus the complexity we calculated is the information necessary to specify the number of particles precise to the single particle. This is why even the equilibrium density fluctuations were described. An alternative, more reasonable, approach assumes that particle counting is also subject to error. For simplicity we can assume that the error is a fraction of the number of particles counted. For macroscopic systems this fraction is much larger than the equilibrium fluctuations, which therefore need not be described. This approach also modifies the form of the complexity profile of the nonuniform gas in Eq. (8.3.37). The error in measurement increases as $n_0(L) \propto L^d$ with the scale of observation. Letting $m_0(L)$ be the error in a measurement of particle number, we write:
(8.3.39)
The consequence of this modification is that the complexity decreases somewhat more rapidly as the scale of observation increases. The expression for the entropy in Eq. (8.3.37) is unchanged.
Question 8.3.3 What is the information in a number (character) selected from a Gaussian distribution of standard deviation $\sigma$?
Solution 8.3.3 Starting from a Gaussian distribution (Eq. 1.2.39),
(8.3.40)
we calculate the information (Eq. 8.2.2):
(8.3.41)
where the second term in the integral can be evaluated using $\langle x^2 \rangle = \sigma^2$.
We note that this result is to be interpreted as the information in a discrete distribution of integral values of $x$, like a random walk, that in the limit of large $\sigma$ gives a Gaussian distribution. The units that are used to measure $\sigma$ define the precision to which the values of $x$ are to be described. It thus makes sense that the information to specify an integer of typical magnitude $\sigma$ is essentially $\log(\sigma)$.
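The interpretation of the result as the information in a discrete distribution over integers can be tested numerically. The sketch below sums the Shannon information of integer values weighted by a Gaussian of width $\sigma$ and compares it with the continuum value in bits, $\log_2(\sqrt{2\pi}\,\sigma) + 1/(2\ln 2)$, which is the standard Gaussian result for base-2 logarithms. The function names and the `cutoff` truncation parameter are illustrative assumptions.

```python
import math

def discrete_gaussian_bits(sigma, cutoff=12):
    """Shannon information (bits) of integers x weighted by exp(-x^2 / 2 sigma^2)."""
    half_width = int(cutoff * sigma)
    weights = [math.exp(-x * x / (2.0 * sigma * sigma))
               for x in range(-half_width, half_width + 1)]
    total = sum(weights)
    return -sum((w / total) * math.log2(w / total) for w in weights if w > 0.0)

def continuum_bits(sigma):
    # log2(sqrt(2 pi) sigma) + 1/(2 ln 2), i.e. (1/2) log2(2 pi e sigma^2)
    return math.log2(math.sqrt(2.0 * math.pi) * sigma) + 1.0 / (2.0 * math.log(2.0))

for sigma in (5.0, 50.0, 500.0):
    print(sigma, discrete_gaussian_bits(sigma), continuum_bits(sigma), math.log2(sigma))
```

The discrete sum and the continuum formula agree closely once $\sigma$ is a few units wide, and both exceed $\log_2(\sigma)$ by a constant of about two bits, consistent with the statement that the information is essentially $\log(\sigma)$.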
$$C(L) = k\frac{V}{L^d}\left(\frac{1}{2}\big(1+\ln(2\pi)\big) + \ln\frac{\sigma L_0^{d/2}}{n_0(L)\, L^{d/2}}\right) \approx k\frac{V}{L^d}\ln\frac{\sigma L_0^{3d/2}}{n_0(L_0)\, L^{3d/2}} \tag{8.3.39}$$
$$P(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-x^2/2\sigma^2} \tag{8.3.40}$$
$$I = -\int dx\, P(x)\log(P(x)) = \int dx\, P(x)\left(\log(\sqrt{2\pi}\,\sigma) + \frac{x^2}{2\sigma^2\ln(2)}\right) = \log(\sqrt{2\pi}\,\sigma) + \frac{1}{2\ln(2)} \tag{8.3.41}$$
8.3.4 Time dependence: chaos and the complexity profile
General approach In describing a system, we are interested in macroscopic observations over time, $n(x, t)$. As with the uncertainty in position, a macroscopic observer is not able to distinguish the time of observation within less than a certain time interval $T = \Delta t$. To define what this means, we say that the system is represented by an ensemble with probability $P_{L,T}(n(x; t))$, or more generally $P_{L,T}(n(x, p; t))$. The different microstates that occur during the time interval $T$ are all part of this ensemble. This may appear different than the definition we used for the spatial uncertainty. However, the definitions can be restated in a way that makes them appear equivalent. In this restatement we recognize that the observer performs measurements that are, in effect, averages over various possible microscopic measurements. The average measurements over space and time represent the system (or system ensemble) that is to be described by the observer. This representation will be discussed further in Section 8.3.6.
The use of an ensemble is convenient because the observer may only measure one quantity, but we can consider various quantities that can be measured using the same degree of precision. The ensemble represents all possible measurements with this degree of precision. For example, the observer can measure correlations between particle positions that are fixed over time. If we averaged the density $n(x, t)$ over time, these correlations could disappear because of the movement of the whole system. However, if we average over the ensemble, they do not. We define the complexity profile $C(L, T)$ as the amount of information necessary to specify the ensemble $P_{L,T}(n(x, t))$. A description at a finer scale contains all of the information necessary to describe the coarser scale. Thus, $C(L, T)$ is a monotonic decreasing function of its arguments. A direct analysis is discussed in Question 8.3.4. We start, however, by considering the effect on $C(L, T)$ of prediction and the lack of predictability in chaotic dynamics.
Predictability and chaos As discussed earlier, a key ingredient in our understanding of physical systems is that the time evolution of an isolated system (or a system whose interactions with its environment are specified) can be obtained from the simple laws of mechanics starting from a complete microscopic description of the position and momenta of the particles. Thus, if we use a small enough $L$ and $T$, so that each particle can be distinguished, we only need to specify $P_{L,T}(n(x, t))$ over a short period of time (or the simultaneous values of position and momentum) in order to predict the behavior over all subsequent times. The laws of mechanics are also reversible. We describe the past as well as the future from the description of a system at a particular time. This must mean that information is not lost over time. Systems that do not lose information over time are called conservative systems.
However, when we increase the spatial scale of observation, $L$, then the information loss (the complexity reduction) also limits the predictability of a system. We are not guaranteed that by knowing $P_{L,T}(n(x, t))$ at a scale $L$ we can predict the system behavior. This is true even if we are only concerned about predicting the behavior at the scale $L$. We may need additional smaller-scale information to describe the time evolution of the system. This is precisely the origin of the study of chaotic systems discussed in Section 1.1. Chaotic systems take information from smaller scales and bring it to larger scales. Chaotic systems may be contrasted with dissipative systems that take information from larger scales to smaller scales. If we perturb (disturb) a dissipative system, the effect disappears over time. Looking at such a system at a particular time, we cannot tell if it was perturbed at some time far enough in the past. Since the information on a microscopic scale must be conserved, we know that the information that is lost on the macroscopic scale must be preserved on the microscopic scale. In this sense we can say that information has been transferred from the macroscopic to the microscopic scale. For such systems, we cannot describe the past from present information on a particular length scale.
The degree of predictability is manifest when we consider that the complexity of a system $C(L, T)$ at a particular $L$ and $T$ depends also on the duration of the description, that is, the limits of $t \in [t_1, t_2]$. Like the spatial extent of the system, this temporal extent is part of the system definition. We typically keep these limits constant as we vary $T$ to obtain the complexity profile. However, we can also characterize the dependence of the complexity on the time limits $t_1$, $t_2$ by determining the rate at which information is either gained or lost for a chaotic or stable system. For complex systems, the flow of information between length scales is bidirectional: even if the total amount of information at a particular scale is preserved, the information may change over time by transfer to or from shorter length scales. Unlike most theories of currents, information currents remain relevant even though they may be equal and opposite. All of the information that affects behavior at a particular length scale, at any time over the duration of the description, should be included in the complexity.
It is helpful to develop a conceptual image of the flow of information in a system. We begin by considering a conservative, nonchaotic and nondissipative system seen by an observer who is able to distinguish $2^{C(L)/k\ln(2)} = e^{C(L)/k}$ states. $C(L)/k\ln(2)$ is the amount of information necessary to describe the system during a single time interval of length $T$. For a conservative system the amount of information necessary to describe the state at a particular time does not change over time. The dynamics of the system causes the state of the system to change over time among these states. The sequence of states could be described one by one. This would require
$$N_T C(L)/k\ln(2) \tag{8.3.42}$$
bits, where $N_T = (t_2 - t_1)/T$ is the number of time intervals. However, we can also describe the state at a particular time (e.g., the initial conditions) and the dynamics. The amount of information to do this is:
$$(C(L) + C_t(L,T))/k\ln(2) \tag{8.3.43}$$
$C_t(L,T)/k\ln(2)$ is the information needed to describe the dynamics. For a nonchaotic and nondissipative system we can show that this information is quite small. We know from the previous section that the macrostate of the system of complexity $C(L)$ is consistent with a microstate which has the same complexity. The microstate has a dynamics that is simple, since it follows the dynamics of standard physical law. The dynamics of the simple microstate also describes the dynamics of the macrostate, which must therefore also be simple. Therefore Eq. (8.3.43) is smaller than Eq. (8.3.42) and the complexity is $C(L,T) = C(L) + C_t(L,T) \approx C(L)$. This holds for a system following conservative, nonchaotic and nondissipative dynamics.
For a system that is chaotic or dissipative, the picture must be modified to accommodate the flow of information between scales. From the previous paragraph we conclude that all of the interesting (complex) dynamics of a system is provided by information that comes from finer scales. The observer does not see this information before it appears in the state of the system, i.e., in the dynamics. If we allow ourselves to see the finer-scale information we can track the flow of information that the observer does not see. In a conventional chaotic system, the flow of information can be characterized by its Lyapunov exponents. For a system that is described by a single real-valued parameter, $x(t)$, the Lyapunov exponent is defined as an average over time:
$$h = \ln\big((x'(t) - x(t))/(x'(t-1) - x(t-1))\big) \tag{8.3.44}$$
where unprimed and primed coordinates indicate two different trajectories. We can readily see how this affects the information needed by an observer to describe the dynamics. Consider an observer at a particular scale, $L$. The observer sees the system in state $x(t-1)$ at time $t-1$, but he determines $x(t-1)$ only within a bin of width $L$. Using the dynamics of the system that is assumed to be known, the observer can determine the state of the system at the next time. This extrapolation is not precise, so the observer needs additional information to specify the next location. The amount of information needed is the logarithm of the number of bins that one bin expands into during one time step. This is precisely $h/\ln(2)$ bits of information. Thus, the complexity of the dynamics for the observer is given by:
$$C(L,T) = C(L) + C_t(L,T) + N_T k h \tag{8.3.45}$$
where we have used the same notation as in Eq. (8.3.43).
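As a concrete illustration of $h/\ln(2)$ bits per time step, the sketch below estimates the Lyapunov exponent of the logistic map $x \to 4x(1-x)$, a standard chaotic example that is not taken from the text. It uses the usual infinitesimal-separation form, averaging $\ln|f'(x)|$ along one trajectory, as a stand-in for the two-trajectory definition of Eq. (8.3.44). The starting point, burn-in length and number of steps are arbitrary choices.

```python
import math

def logistic(x):
    return 4.0 * x * (1.0 - x)

def lyapunov_exponent(x0=0.1234, burn_in=1_000, steps=200_000):
    """Average of ln|f'(x)| along one trajectory of the logistic map."""
    x = x0
    for _ in range(burn_in):
        x = logistic(x)
    total = 0.0
    for _ in range(steps):
        slope = abs(4.0 * (1.0 - 2.0 * x))
        total += math.log(max(slope, 1e-300))  # guard against log(0)
        x = logistic(x)
    return total / steps

h = lyapunov_exponent()
bits_per_step = h / math.log(2.0)
print(h, bits_per_step)
```

The known exponent of this map is $\ln 2$, so an observer at fixed resolution needs about one additional bit per step to keep following the trajectory, which is the $N_T k h$ term of Eq. (8.3.45) expressed in bits.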
A physical system that has many dimensions, like the microscopic ideal gas, will have one Lyapunov exponent for each of $6N$ dimensions of position and momentum. If the dynamics is conservative then the sum over all the Lyapunov exponents is zero,
$$\sum_i h_i = \log\!\left(\prod_i \Delta x_i(t)\Delta p_i(t) \Big/ \prod_i \Delta x_i(t-T)\Delta p_i(t-T)\right) = 0 \tag{8.3.46}$$
where $\Delta x_i(t) = x'_i(t) - x_i(t)$ and $\Delta p_i(t) = p'_i(t) - p_i(t)$. This follows directly from conservation of volumes of phase space in conservative dynamics. However, while the sum over all exponents is zero, some of the exponents may be positive and some negative. These correspond to chaotic and dissipative modes of the dynamics. We can imagine the flow of information as consisting of two streams, one going to higher scales and one to lower scales. The complexity of the system is given by:
$$C(L,T) = C(L) + C_t(L,T) + N_T k \sum_{i:h_i>0} h_i \tag{8.3.47}$$
As indicated, the sum is only over positive values.
Two cautionary remarks about the application of Lyapunov exponents to complex physical systems are necessary. Unlike many standard models of chaos, a complex system does not have the same number of degrees of freedom at every scale. The number of independent bits of information describing the system above a particular scale is given by the complexity profile, $C(L)$. Thus, the flow of information between scales should be thought of as due to a number of closed loops that extend from a particular lowest scale up to a particular highest scale. As the scale increases, the complexity decreases. Thus, so does the maximum number of Lyapunov exponents. This means that the sum over Lyapunov exponents is itself a function of scale. More generally, we must also be concerned that $C(L)$ can be time dependent, as it is in many irreversible processes.
The second remark is that over time the cycling of information between scales may bring the same information back more than once. Eq. (8.3.47) does not distinguish this, and therefore may include multiple counting of the same information. We should understand this expression as an upper bound on the complexity.
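The structure of Eqs. (8.3.46) and (8.3.47), exponents that sum to zero while only the positive ones feed information to the observer, can be illustrated with a small conservative example. The sketch below uses the area-preserving map $(x, p) \to (2x + p, x + p) \bmod 1$ (often called the cat map), chosen here purely for illustration; its Lyapunov exponents are the logarithms of the eigenvalues of the matrix with rows $(2, 1)$ and $(1, 1)$. The function names are hypothetical.

```python
import math

def cat_map_exponents():
    """Lyapunov exponents of the area-preserving map (x, p) -> (2x + p, x + p) mod 1.
    They are the logs of the eigenvalues of [[2, 1], [1, 1]]."""
    trace, det = 3.0, 1.0
    root = math.sqrt(trace * trace - 4.0 * det)
    return [math.log((trace + root) / 2.0), math.log((trace - root) / 2.0)]

def information_rate_bits(exponents):
    """Sum over positive exponents only, converted from nats to bits per step."""
    return sum(h for h in exponents if h > 0.0) / math.log(2.0)

exponents = cat_map_exponents()
print(sum(exponents), information_rate_bits(exponents))
```

Because the determinant is one, the two exponents cancel exactly, yet the expanding direction alone brings roughly 1.39 bits per step up from finer scales.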
Ti me s ca le de pe nde nce Once we have chaotic b ehavior, we can consider various
descriptions of the time dependence of the behavior seen by a particular observer. All
of the models we considered in Chapter 1 are applicable. The state of the system may
be selected at random from a part icular distribution (ensemble) of states at successive
time int er vals. This is a sp ecial case o f the more general Mar kov chain model that is
described by a set of t ransition probabilities. Long-range correlations that are not eas-
ily described by a Markov chain may also be important in the dynamics.
In order to discuss the complexity pr ofile as a funct ion of T, we consider a
Markov chain model. From the analysis in Question 8.3.4 we learn that the loss of
complexity with time scale occurs as a result of cycles in the dynamics. These cycles
need not be deter ministic; they may be stochastic—cycles that do not repeat indefi-
nitely but rather can occur one or mo re times through the probabilistic selection of
successive states. Thus,a high complexity for large T arises when there is a large space
of states with low chance of repetition in the dynamics. The highest complexity would
arise from a deter ministic dynamics with cycles that are longer than T. This might
seem to contradict our previous conclusion, where the deterministic dynamics was
found to be simple. However, a complex deterministic dynamics can arise if the suc-
cessive states are specified by informat ion from a smaller scale.
Question 8.3.4  Consider the information in a Markov chain of N_T states
at intervals T_0 given by the transition matrix P(s′|s). Assume the complexity
of specifying the transition matrix (the complexity of the dynamics),
C_t = C(P(s′|s)), is itself small. (See Question 8.3.5 for the case of a complex
deterministic dynamics.)
a. Show that the more deterministic the chain is, the less information it
contains.
b. Show that for an observer at a longer time scale consisting of two time
steps (T = 2T_0) the information is reduced. Hint: Use convexity of information
as described in Question 1.8.8, f(〈x〉) > 〈f(x)〉, for the function
f(x) = −x log(x).
c. Show that the complexity does not decrease for a system that does not
allow 2-cycles.
Solution 8.3.4  When the complexity of the dynamics is small, then the
complexity of the Markov chain is given by:
$$ C = C(s) + C_t + N_T\, k \ln(2)\, I(s'|s) \tag{8.3.48} $$
where the terms correspond to the information in the initial state of the system,
the information in the dynamics, and the incremental information per
update needed to specify the next state. The relationship between this and
Eq. (8.3.47) should be apparent. This expression does not hold if C_t is large,
because if it is larger than N_T C(s), then the chain is more concisely described
by specifying each of the states of the system (see Question 8.3.5).
The proof of (a) follows from realizing that the more deterministic the
system is, the smaller is I(s′|s). This may be used to define how deterministic
the dynamics is.
To analyze the complexity of the Markov chain for an observer at time
scale 2T_0, we need to combine successive system states into an unordered
pair: the ensemble of states seen by the observer. We use the notation {s′, s}
for a pair of states. Thus, we are considering a new Markov chain of transitions
between unordered pairs. To analyze this we need two probabilities: the
probability of a pair and the transition probability from one pair to the next.
The latter is the new transition matrix. The probability of a particular pair is:
$$ P(\{s_1,s_2\}) = \begin{cases} P(s_1|s_2)P(s_2) + P(s_2|s_1)P(s_1) & s_2 \neq s_1 \\ P(s_1|s_1)P(s_1) & s_2 = s_1 \end{cases} \tag{8.3.49} $$
where P(s) is the probability of a particular state of the system and the two
terms in the upper line correspond to the probability of starting from s_1 to
make the pair, and starting from s_2 to make the pair. The transition matrix
for pairs is given by
$$ P(\{s'_1,s'_2\}|\{s_1,s_2\}) = \frac{1}{P(\{s_1,s_2\})} \Big[ \big( P(s'_1|s'_2)P(s'_2|s_1) + P(s'_2|s'_1)P(s'_1|s_1) \big) P(s_1|s_2)P(s_2) + \big( P(s'_1|s'_2)P(s'_2|s_2) + P(s'_2|s'_1)P(s'_1|s_2) \big) P(s_2|s_1)P(s_1) \Big] \tag{8.3.50} $$
which is valid only for s_1 ≠ s_2 and for s′_1 ≠ s′_2. Other cases are treated like
Eq. (8.3.49). Eq. (8.3.50) includes all four possible ways of generating the
sequence of the two pairs. The normalization is needed because the transition
matrix is the probability of {s′_1, s′_2} occurring, assuming the pair {s_1, s_2} has
already occurred.
To show (b) we must prove that the process of combining the states into
pairs reduces the information necessary to describe the chain. This is apparent
since the observer loses the information about the state order within each
pair. To show it from the equations, we note from Eq. (8.3.49) that the probability
of a particular unordered pair is larger than or equal to the probability of each of
the two possible ordered pairs. Since the probabilities are larger, the information
is smaller. Thus the information contained in the first pair is
smaller for T = 2 than for T = 1. We must show the same result for each successive
pair. The transition probability can be seen to be an average over two
terms in the round parentheses. By convexity, the information in the average
is less than the average information of each term. Each of the terms is a sum
over the probabilities of two possible orderings, and is therefore larger than
or equal to the probability of either ordering. Thus, the information needed
to specify any pair in the chain is smaller than the corresponding information
in the chain of states.
Finally, to prove (c) we note that the less the order of states is lost when
we combine states into pairs, the more complexity is retained. If transitions
in the dynamics can only occur in one direction, then we can infer the order
and information is not lost. Thus, for T = 2 the complexity is retained if the
dynamics is not reversible, that is, if there are no 2-cycles. From the equations we see
that if only one of P(s_1|s_2) and P(s_2|s_1) can be nonzero, and similarly for
P(s′_1|s′_2) and P(s′_2|s′_1), then only one term survives in Eq. (8.3.49) and
Eq. (8.3.50) and no averaging is performed. For arbitrary T the complexity
is the same as at T = 1 if the dynamics does not allow loops of size less than
or equal to T.
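The three parts of this solution can be checked numerically on small chains. The following is a minimal sketch, not part of the original solution: the three 3-state transition matrices are illustrative assumptions, information is measured in bits rather than in units of k ln(2), and the function names are hypothetical. The matrix entry P[t][s] stands for P(s′ = t | s).

```python
import math

def stationary(P, iters=200):
    """Approximate the stationary distribution by repeatedly applying P.
    P[t][s] is the transition probability P(s'=t | s)."""
    n = len(P)
    p = [1.0 / n] * n
    for _ in range(iters):
        p = [sum(P[t][s] * p[s] for s in range(n)) for t in range(n)]
    return p

def info_per_step(P):
    """Conditional information I(s'|s) in bits per update."""
    p = stationary(P)
    n = len(P)
    return -sum(p[s] * P[t][s] * math.log2(P[t][s])
                for s in range(n) for t in range(n) if P[t][s] > 0)

def pair_entropy(P, ordered):
    """Information in one pair of successive states, in bits.
    With ordered=False the two orderings are merged as in Eq. (8.3.49)."""
    p = stationary(P)
    n = len(P)
    probs = {}
    for s1 in range(n):
        for s2 in range(n):
            w = P[s2][s1] * p[s1]  # probability of the sequence s1 -> s2
            key = (s1, s2) if ordered else tuple(sorted((s1, s2)))
            probs[key] = probs.get(key, 0.0) + w
    return -sum(w * math.log2(w) for w in probs.values() if w > 0)

# Illustrative chains (assumptions, not taken from the text).
cycle = [[0, 0, 1], [1, 0, 0], [0, 1, 0]]                    # deterministic 3-cycle
lazy_cycle = [[.5, 0, .5], [.5, .5, 0], [0, .5, .5]]         # stay or advance, no 2-cycles
reversible = [[0, .5, .5], [.5, 0, .5], [.5, .5, 0]]         # every move can be undone

print(info_per_step(cycle), info_per_step(lazy_cycle))
print(pair_entropy(reversible, True), pair_entropy(reversible, False))
print(pair_entropy(lazy_cycle, True), pair_entropy(lazy_cycle, False))
```

In this sketch the deterministic cycle carries no information per update (part a), the reversible chain loses the ordering information when its states are combined into unordered pairs (part b), and the chain without 2-cycles loses nothing (part c).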
Question 8.3.5  Calculate the maximum information that might in principle
be necessary to specify completely a deterministic dynamics of a
system whose complexity at any time is C(L). Contrast this with the maximum
complexity of describing N_T steps of this system.
Solution 8.3.5  The number of possible states of the system is 2^{C(L)/k ln(2)}.
Each of these must be assigned a successor by the dynamics. The maximum
possible information to specify the dynamics arises if there is no algorithm
that can specify the successor, so that each successor must be identified out
of all possible states. This would require 2^{C(L)/k ln(2)} C(L)/k ln(2) bits.
The maximum complexity of N_T steps is just N_T C(L), as long as this is
smaller than the previous result, which is generally a reasonable assumption.
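To get a feeling for the two quantities in this solution, the arithmetic can be sketched as follows. This is an illustration only: `n_bits` stands for C(L)/k ln(2), the state description length in bits, and the value 20 and the step count 1000 are arbitrary assumptions.

```python
def bits_to_specify_dynamics(n_bits):
    """Worst case: a lookup table with 2**n_bits entries,
    each naming a successor state with n_bits bits."""
    return (2 ** n_bits) * n_bits

def bits_to_specify_trajectory(n_bits, n_steps):
    """Worst case: each of the n_steps states is written out in full."""
    return n_steps * n_bits

n_bits = 20      # assumed state description length, C(L)/k ln(2)
n_steps = 1000   # assumed number of steps, N_T

print(bits_to_specify_dynamics(n_bits))             # 20971520
print(bits_to_specify_trajectory(n_bits, n_steps))  # 20000
```

Even for a modest state description, listing the trajectory is far shorter than listing an arbitrary dynamics, which is the sense in which the assumption in the solution is reasonable.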
A simple example of chaotic behavior that is relevant to complex systems is that
of a mobile system (an animal or human being) where the motion is internally
directed. A description of the system behavior, even at a length scale larger than the
system itself, must describe this motion. However, the motion is determined by
information contained on a smaller length scale just prior to its occurrence. This satisfies
the formal requirements for chaotic behavior regardless of the specifics of the motion
involved. Stated differently, the large-scale motion would be changed by modifications
of the internal state of the system. This is consistent with the sensitivity of
chaotic motion to smaller scale changes.
Another example of information transfer between different scales is related to
adaptability, which requires that information about the external environment be
represented in the organism. This generally involves the transfer of information between
a larger scale and a smaller scale, specifically, between observed phenomena and their
representation in the synapses of the nervous system.
When we describe a system at a particular moment of time, the complexity of the
system at its own scale or larger is zero (or a constant if we include the description of
the equilibrium system). However, when we consider the description of a system over
time, then the complexity is larger due to the system motion. Increasing the scale of
observation continues to result in a progressive decrease in complexity. At a scale that
is larger than the system itself, it is the motion of the system as measured by its location
at successive time intervals that is to be described. As the scale becomes larger,
smaller scale motions are not observed, and a simpler description of motion is possible.
The observer only notes changes in position that are larger than the scale of
observation.
A natural question that can be asked in this context is whether the motion of the
system is due to external influences or due to the system itself. For example, a particle
moving in a fluid may be displaced by the motion of the fluid. This should be considered
different from a mobile bacterium. Similarly, a basketball in a game moves
through its trajectory not because of its own volition, but rather because of the volition
of the players. How do we distinguish this from a system that moves due to its
own actions? More generally, we must ask how we must deal with the environmental
influences for a system that is not isolated. This question will be dealt with in Section
8.3.6 on behavioral complexity. Before we address this question, in the next section we
discuss several aspects of the complexity profile, including the relationship of the
complexity of the whole to the complexity of its parts.
8.3.5 Properties of complexity profiles of systems and components
General properties  We can readily understand some of the properties that we
would expect to find in complexity profiles of systems that are difficult to calculate
directly. Fig. 8.3.2 illustrates the complexity profile for a few systems. The paragraphs
that follow describe some of their features.
For any system, the complexity at the smallest values of L, T is the microscopic
complexity: the amount of information necessary to describe a particular microstate.
For an equilibrium state this is the same as the thermodynamic entropy,
which is the entropy of a system observed on an arbitrarily long time scale. This is not
true in general because short-range correlations decrease the microstate complexity,
but do not affect the apparent macroscopic entropy. We have thus also defined the
entropy profile S(L,T) as the amount of information necessary to determine an arbitrary
microstate consistent with the observed macrostate. From our discussion of nonergodic
systems in Section 8.3.1 we might also conclude that at any scale L, T the sum
of the complexity C(L,T) and the entropy S(L,T) of the system (the fast degrees of
freedom) should add up to the microscopic complexity or macroscopic entropy
$$ C(0,0) \approx S(\infty,\infty) \approx C(L,T) + S(L,T) \tag{8.3.51} $$
However, this is valid only under special circumstances: when the macroscopic state
is selected at random from the ensemble of macrostates, and the microstate is selected
at random from the possible microstates. A glass may satisfy this requirement; however,
other complex systems need not.
For a typical system in equilibrium, as L, T is increased the system rapidly
becomes homogeneous in space and time. Specifically, the density of the system is
uniform in space and time, aside from unobservable small fluctuations, once the scale
of observation is larger than either the correlation length or the correlation time of
the system. Indeed, this might be taken to be the definition of the correlation length
and time: the scale at which the microscopic information becomes irrelevant to the
properties of the system. Beyond the correlation length, the average behavior characteristic
of the macroscopic scale is all that remains, and the complexity profile is constant
at all length and time scales less than the size of the system.
We can contrast the complexity profile of a thermodynamic system with what we
expect from various complex systems. For a glass, the complexity profile is quite different
in time and in space. A typical glass is uniform if L is larger than a microscopic
correlation length. Thus, the complexity profile of the glass is similar to an equilibrium
system as a function of L. However, it is different as a function of T. The frozen degrees
of freedom that make it a nonergodic system at typical time scales of observation
guarantee this. At typical values of T the temporal ensemble of the system includes the states
that are reached by vibrational modes of the system, but not the atomic rearrangements
characteristic of fluid motion. Thus, the atomic vibrations cannot be observed except
at microscopic values of T. However, a significant part of the microscopic description
remains necessary at longer time scales. Correspondingly, a plateau in the complexity
profile extends up to characteristic time scales of human observation. At a
temperature-dependent and much longer time scale, the complexity profile declines to its
thermodynamic limit. This time scale, the relaxation time, is accessible near the glass
transition temperature. For lower temperatures it is not. Because the glass is uniform in space,
the plateau should be relatively flat and end abruptly. This is because spatial uniformity
indicates that the relaxation time is essentially a local property with a narrow
distribution. A more extended spatial coupling would give rise to a grading of the plateau and
a broadening of the time scale at which the plateau disappears.
Figure 8.3.2  Schematic plots of the complexity profile C(L,T) of four different systems.
C(L,T) is the amount of information necessary to describe the system ensemble as a function
of the length scale, L, and time scale, T, of observation. Top panel shows the time scale
dependence, C(0,T) versus T; bottom panel shows the length scale dependence, C(L,0) versus L.
(1) An equilibrium system has a complexity profile that is sharply peaked at T = 0 and L = 0.
Once the length or time scale is beyond the correlation length or correlation time respectively,
the complexity is just the macroscopic complexity associated with thermodynamic quantities
(U, N, V), which vanishes on any reasonable scale. (2) For a glass the complexity profile as a
function of time scale C(0,T) decays rapidly at first due to averaging over atomic vibrations;
it then reaches a plateau that represents the frozen degrees of freedom. At much longer time
scales the complexity profile decays to its thermodynamic limit. Unlike C(0,T), C(L,0) of a
glass decays like a thermodynamic system because it is homogeneous in space. (3) A magnet at a
second-order phase transition has a complexity profile that follows power-law behavior in both
length and time scale. Stochastic fractals capture this kind of behavior. (4) A complex
biological organism has a complexity profile that should follow similar behavior to that of a
fractal. However, it has plateau-like regions that correspond to crossing the scale of internal
components, such as molecules and cells.
More generally, for a complex system we expect that many parameters will be required
to describe its properties at all length and time scales, at least up to some fraction
of the spatial and temporal scale of the system itself. Starting from the microscopic
complexity, the complexity profile should not be expected to fall smoothly. In biological
organisms, we can expect that as we increase the scale of observation, there will be
particular length scales at which details will be lost. Plateaus in the profile are related
to the existence of well-defined levels of description. For example, an identifiable level
of cellular behavior would correspond to a plateau, because over a range of length scales
larger than the cell, a full accounting of cellular properties, but not of the internal
behavior of the cell, must be given. There are many cells that have a characteristic size
and are immobile. However, because different cell populations have different sizes and
some cells are mobile, the sharpness of the transition should be smoothed. We can at
least qualitatively identify several different plateaus. At the shortest time scale the atomic
vibrations will be averaged out to end the first plateau. Larger atomic motions or molecular
behavior will be averaged out on a second, larger scale. The internal cellular behavior
will then be averaged out. Finally, the internal behavior of tissues and organs
will be averaged out on a still longer length and time scale. It is the degrees of freedom
that remain relevant on the longest length scale that are key to the complexity of the
system. These degrees of freedom manifest the concept of emergent collective behavior.
Ultimately, they must be traceable back to the microscopic degrees of freedom.
Describing the connection between the microscopic parameters and macroscopically
relevant parameters has occupied our attention in much of this book.
Mathematical models that best capture the complexity profile of a complex system
are fractals (see Section 1.10). Mathematical fractals with no granularity (no
smallest length scale) have infinite complexity. However, if we define a smallest length
scale, corresponding to the atomic length scale of a physical system, and we define a
longest length scale that is the size of the system, then we can plot the spatial complexity
profile of a fractal-like system. There are two quite distinct kinds of mathematical
fractals, deterministic and stochastic fractals. The deterministic fractals are
specified by an algorithm with only a few parameters, and thus their algorithmic complexity
is small. Examples are the Cantor set or the Sierpinski gasket. The algorithm
describes how to create finer and finer scale detail. The only difficulty in specifying the
fractal is specifying the number of levels to which the algorithm should be iterated.
This information (the number of iterations) requires a parameter whose length grows
logarithmically with the ratio of the size of the system to the smallest length scale.
Thus, a deterministic fractal has a complexity profile that decreases logarithmically
with observation length scale L, but is very small on all length scales.
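The smallness of the description of a deterministic fractal can be made concrete with the Cantor set. The sketch below is illustrative only; the function names are hypothetical and the system size is taken to be 1. The whole structure is fixed by a short rule plus a single integer, the number of levels.

```python
def cantor_intervals(levels):
    """Intervals of the middle-thirds Cantor set after `levels` iterations."""
    intervals = [(0.0, 1.0)]
    for _ in range(levels):
        refined = []
        for a, b in intervals:
            third = (b - a) / 3.0
            refined.append((a, a + third))   # keep the left third
            refined.append((b - third, b))   # keep the right third
        intervals = refined
    return intervals

def levels_for_ratio(ratio):
    """Smallest number of levels whose intervals are no longer than
    (system size) / ratio, for an integer ratio >= 1."""
    levels = 0
    while 3 ** levels < ratio:
        levels += 1
    return levels

# The full description is the rule above plus one integer.
levels = levels_for_ratio(3 ** 10)
print(levels, len(cantor_intervals(levels)))  # 10 levels, 1024 intervals
```

The number of intervals grows exponentially with the number of levels, while the only parameter that must be specified is the level count, which grows as the logarithm of the size ratio.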
Stochastic fractals are qualitatively different. In such fractals, there are random
choices made at every scale of the structure. Stochastic fractals can be based upon the
Cantor set or Sierpinski gasket, by including random choices in the algorithm. They
may also be systems representing the spatial structure of various stochastic processes.
Such a system requires information to describe its structure on every length scale. A
stochastic fractal is a member of an ensemble, and its algorithmic as well as ensemble
complexity will scale as a power law of the scale of observation L. As L increases, the
amount of information is reduced, but there is no length scale smaller than the size of
the system at which it is completely lost. Time series that have fractal behavior (that
have power-law correlations) would also display a power-law dependence of their
complexity profile as a function of T. The simplest physical model that demonstrates
such fractal properties in space and time is an Ising model at its second-order transition
point. At this transition there are fluctuations on all spatial and temporal scales
that have power-law behavior in both. Observers with larger values of L can see the
behavior of the correlations only on the longer length scales. A renormalization
treatment, discussed in Section 1.10, can give the value of the complexity profile.
These examples illustrate how microscopic information may become irrelevant on
larger length scales, while leaving collective information that remains relevant at the
longer scales.
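A randomized variant of the Cantor construction gives a simple picture of why a stochastic fractal needs information on every scale. In the sketch below, which is an illustrative assumption and not a model taken from the text, each interval is divided into thirds and one third, chosen at random, is removed. Counting the random choices made down to a given level serves as a rough proxy for the information an observer at that resolution must be given.

```python
import math
import random

def stochastic_cantor(levels, rng):
    """Randomized Cantor-like set: each interval drops one random third.
    Returns the surviving intervals and the number of random choices made."""
    intervals = [(0.0, 1.0)]
    choices = 0
    for _ in range(levels):
        refined = []
        for a, b in intervals:
            third = (b - a) / 3.0
            parts = [(a, a + third), (a + third, b - third), (b - third, b)]
            removed = rng.randrange(3)  # one three-way choice per interval
            choices += 1
            refined.extend(p for i, p in enumerate(parts) if i != removed)
        intervals = refined
    return intervals, choices

def description_bits(levels):
    """Bits needed to record every choice down to scale L = 3**(-levels):
    (2**levels - 1) choices at log2(3) bits each."""
    return (2 ** levels - 1) * math.log2(3)

intervals, choices = stochastic_cantor(6, random.Random(0))
print(len(intervals), choices, description_bits(6))
```

Since 2^levels = L^(-log 2 / log 3) for L = 3^(-levels), the count of bits in this toy model falls off as a power law of the observation scale L, in contrast to the single integer needed for the deterministic construction.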
The complexity profile enables us to consider again the definition of a complex
system. As we stated, it seems intuitive that a complex system is complex on many
scales. This strengthens the identification of the fractal model of space and time as a
central model for the understanding of complex systems. We have also gained an
understanding of the difference between deterministic and stochastic fractal systems. We
see that the glass is complex in its temporal behavior, but not in its spatial behavior,
and therefore is only a partial example of a complex system. If we want to identify a
unique complexity of a system, there is a natural space and time scale at which to define
it. For the spatial scale, L_s, we consider a significant fraction of the system, one-tenth
of its size. For the temporal scale, T_s, we consider the relaxation (autocorrelation)
time of the behavior on this same length scale. This is essentially the maximal
complexity for this length scale, which would be the same as setting T = 0. However,
we could also take a natural time scale of T_s = L_s/v_s where v_s is a characteristic velocity
of the system. This form makes the increase in time scale for larger length scales
(systems) apparent. Leaving out the time scale, since it is dependent on the space scale,
we can write the complexity of a system s as
$$ C_s = C_s(L_s) = C_s(L_s, L_s/v_s) \approx C_s(L_s, 0) \tag{8.3.52} $$
In Section 1.10 we discussed generally the scaling of quantities as a function of
the precision to which we describe the system. One of the central questions in the field
of complex systems is understanding how complexity scales. This scaling is concretized
by the complexity profile. One of the objectives is to understand the ultimate
limits to complexity. Given a particular length or time scale, we ask what is the maximum
possible complexity at that scale. One could say that this complexity is limited
by the thermodynamic entropy; however, there are further limitations. These limitations
are established by the nature of physical law that establishes the dynamics and
interactions of the components. Thus it is unlikely that atoms can be attached to each
other in such a way that the behavior of each atom is relevant to the spatiotemporal
behavior of an organism at the length and time scale relevant to a human being. The
details of behavior must be lost as we observe on longer length and time scales; this
results in a loss of complexity. The complexity scaling of complex organisms should
follow a line like that given in Fig. 8.3.2. The highest complexity of an organism results
from the retention of the greatest significance of details. This is in contrast to
thermodynamic systems, where all of the degrees of freedom average out on a very short
length and time scale. At this time we do not know what limits can be placed on the
rate of decrease of complexity with scale.
Components and systems  As we discussed in Chapter 2, a complex system is formed
out of a hierarchy of interdependent subsystems. Thus, relevant to various questions
about the complexity profile is an understanding of the complexity that may arise when
we bring together complex systems to form a larger complex system. In general it is not
clear that bringing together many complex systems must give rise to a collective complex
system. This was discussed in Chapter 6, where one example was a flock of animals.
Here we can provide additional meaning to this statement using the complexity
profile. We will discuss the relationship of the complexity of components to the complexity
of the system they are part of. To be definite, we can consider a flock of sheep.
The example is chosen to expand our view toward more general application of these
ideas. The general statements we make apply to any system formed out of subsystems.
Let us assume that we know the complexity of a sheep, C_sheep(L_sheep), the amount
of information necessary to describe the relevant behaviors of eating, walking,
reproducing, flocking, etc., at a length scale of about one-tenth the size of the sheep. For our
current purposes this might be a lot of information contained in a large number of
books, or a little information contained in a single paragraph of text. Later, in Section
8.4, we will obtain an estimate of the complexity as, of order, one book or 10^7 bits.
We now consider a flock of N sheep and construct a description of this flock. We
begin by taking information that describes each of the sheep. Combining these
descriptions, we have a description of the flock. This information is, however, highly
redundant. Much of the information that describes one sheep can also be used to
describe other sheep. Of course there are differences in size and in behavior. However,
having described one sheep in detail we can describe the differences, or we can
describe general characteristics of sheep and then specialize them for each of the
individual sheep. Using this strategy, a description of the flock will be shorter than the
sum of the lengths of the descriptions of each of the sheep. Still, this is not what we
really want. The description of the flock behavior has to be on its own length scale
L_flock, which is much larger than L_sheep. So we shift our observation of behavior to this
longer length scale and find that most of the details of the individual sheep behavior
have become irrelevant to the description of the flock. We describe the flock behavior
in terms of sheep density, grazing activity, migration, reproductive rates, etc. Thus we
write that:
$$ C_{flock} = C_{flock}(L_{flock}) \ll C_{flock}(L_{sheep}) \ll N C_{sheep}(L_{sheep}) = N C_{sheep} \tag{8.3.53} $$
where N is the number of sheep in the flock. Among other conclusions, we see that
the complexity of a flock may actually be smaller than the complexity of one sheep.
More generally, the relationship between the complexity of the collective complex
system and the complexity of component systems is crucially dependent on the
existence of coherence and correlations in the behavior of the components that can
arise either from common origins for the behavior or from interactions between the
components. We first describe this qualitatively by considering the two inequalities in
Eq. (8.3.53). The second inequality arises because different sheep have the same behavior.
In this case their behavior is coherent. The first inequality arises because we
change the scale of observation and so lose the behavior of an individual sheep. There
is a trade-off between these two inequalities. If the behaviors of the sheep are independent,
then their behavior cannot be observed on the longer scale. Specifically, the
movement of one sheep to the right is canceled by another sheep that starts at its right
and moves to the left. Thus, only correlated motions of many sheep can be observed
on a longer scale. On the other hand, if their behaviors are correlated, then the complexity
of describing all of them is much smaller than the sum of the separate complexities.
Thus, having a large collective complexity requires a balance between dependence
and independence of the behavior of the components.
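The cancellation of independent motions can be seen in a small simulation. This is a sketch under simple assumptions that are not part of the text: each sheep takes one step of +1 or −1 along a line, and the flock is observed only through its mean displacement.

```python
import random
import statistics

def flock_displacement_std(n_sheep, coherent, trials=2000, seed=0):
    """Standard deviation of the mean displacement of the flock after one step.
    coherent=True: all sheep take the same random step.
    coherent=False: every sheep chooses its step independently."""
    rng = random.Random(seed)
    means = []
    for _ in range(trials):
        if coherent:
            step = rng.choice((-1, 1))
            steps = [step] * n_sheep
        else:
            steps = [rng.choice((-1, 1)) for _ in range(n_sheep)]
        means.append(sum(steps) / n_sheep)
    return statistics.pstdev(means)

print(flock_displacement_std(100, coherent=False))  # close to 1/sqrt(100)
print(flock_displacement_std(100, coherent=True))   # close to 1
```

Independent steps shrink the observable motion of the flock roughly as one over the square root of the number of sheep, so an observer at the flock scale sees almost nothing, while fully coherent steps remain visible but need only one sheep's worth of description.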
We can discuss this more quantitatively by considering the example of the
nonuniform ideal gas. The loss of information for uncorrelated quantities due to
combining them together is described by Eq. (8.3.37). To construct a model where the
quantities are correlated, we consider placing the same densities in a region of scale
L_1 > L_0. This is the same model as the previous one, but now on a length scale of L_1.
The new value of σ is σ_1 = σ(L_1/L_0)^d. This increase of the standard deviation causes
an increase in the value of the complexity for all scales greater than L_1. However, for
L < L_1 the complexity is just the complexity at L_1, since there is no structure below this
scale. A comparative plot is given in Fig. 8.3.3.
We can come closer to considering the behavior of a collection of animals by
considering a model for their motion. We start with a scale L_0 just larger than the animal,
so that we do not describe its internal structure; we describe only its location at
successive intervals of time. The characteristic time over which a sheep moves a distance
L_0 is T_0. We will use a model for sheep motion that can illustrate the effect of coherence
of many sheep, as well as the effect of coherent motion of an individual sheep
over time. To do this we assume that an individual sheep moves in a straight line for
a distance qL_0 in a time qT_0 before choosing a new direction to move in at random.
For simplicity we can assume that the direction chosen is one of the four compass
directions, though this is not necessary for the analysis. We will use this model to
calculate the complexity profile of an individual sheep. Our treatment only describes the
leading behavior of the complexity profile and not various corrections.
For L = L_0 and T = T_0, the complexity of describing the motion is exactly 2 bits
for every q steps to determine which of the four possible directions the sheep will
move next. Because the movement is in a straight line, and the changes in direction
are at well-defined intervals, we can reconstruct the motion from the measurements
of any observer with L < qL_0 and T < qT_0. Thus the complexity is:
$$ C(L,T) = 2N_T/q \qquad L < qL_0,\; T < qT_0 \tag{8.3.54} $$
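The counting that leads to Eq. (8.3.54) can be sketched directly: the trajectory is generated from one compass direction per run of q steps, and the same list of directions, at 2 bits each, is enough to rebuild it. Positions are measured in units of L_0 and steps in units of T_0; the function names and the chosen values of N_T and q are illustrative assumptions.

```python
import random

# Four compass directions as unit steps (in units of L_0).
DIRECTIONS = [(0, 1), (1, 0), (0, -1), (-1, 0)]

def simulate(n_steps, q, rng):
    """Sheep walks straight for q steps, then picks a new random direction."""
    path = [(0, 0)]
    headings = []
    for step in range(n_steps):
        if step % q == 0:
            headings.append(rng.randrange(4))  # a 2-bit choice
        dx, dy = DIRECTIONS[headings[-1]]
        x, y = path[-1]
        path.append((x + dx, y + dy))
    return path, headings

def reconstruct(headings, n_steps, q):
    """Rebuild the full trajectory from the list of headings alone."""
    path = [(0, 0)]
    for step in range(n_steps):
        dx, dy = DIRECTIONS[headings[step // q]]
        x, y = path[-1]
        path.append((x + dx, y + dy))
    return path

def description_bits(n_steps, q):
    """Eq. (8.3.54): 2 bits for every q steps."""
    return 2 * n_steps / q

path, headings = simulate(100, 5, random.Random(1))
print(reconstruct(headings, 100, 5) == path)          # True
print(2 * len(headings), description_bits(100, 5))    # 40 40.0
```

When N_T is a multiple of q the number of recorded headings is exactly N_T/q, so the stored description has the length given by Eq. (8.3.54).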
Once the scale of observation is greater than $qL_0$, the observer does not see every
change in direction. The sheep is moving in a random walk where each step has a
length $qL_0$ and takes a time $qT_0$, but the observer does not see each step. The distance
traveled is proportional to the square root of the time, and so the sheep moves a
distance L once in every $(L/\sigma_0)^2$ steps, that is, at a rate of $(\sigma_0/L)^2$ per step,
where $\sigma_0 = qL_0$ is the standard deviation of the random walk in each dimension.
Every time the sheep travels a distance L we need 2 bits to describe its motion, and
thus we have a complexity:

$$C(L,T) = 2\frac{N_T}{q}\frac{\sigma_0^2}{L^2} = 2N_T\frac{qL_0^2}{L^2}, \qquad L > qL_0,\ T < qT_0 \qquad (8.3.55)$$

We note that at $L = qL_0$ Eq. (8.3.54) and Eq. (8.3.55) are equal.

To obtain the complexity profile for long time scales $T > qT_0$, but short length
scales $L < qL_0$, we use a simplified “blob” picture to combine the successive positions
of the sheep into an ensemble of positions. For T only a few times $qT_0$ we can expect
that the ensemble would enable us to reconstruct the motion, so the complexity is the
same as in Eq. (8.3.54). However, eventually the ensemble of positions will overlap and
form a blob. At this point the movement of the sheep will be described by the
movement of the blob, which itself undergoes a random walk. The standard deviation of
this random walk is proportional to the square root of the number of steps:
$\sigma = \sigma_0\sqrt{T/(qT_0)}$. Since this is larger than L, the amount of information is
essentially that of selecting a value from a Gaussian distribution of this standard deviation:

$$C(L,T) = 2\frac{N_T}{q}\min\left(1, \frac{qT_0}{T}\left(1+\log\frac{\sigma}{L}\right)\right) = 2\frac{N_T}{q}\min\left(1, \frac{qT_0}{T}\left(1+\log\left(\frac{L_0}{L}\sqrt{\frac{qT}{T_0}}\right)\right)\right), \qquad L < \sigma,\ T > qT_0 \qquad (8.3.56)$$

There are a few points to be made about this expression. First, we use the minimum
of two values to select the crossover point between the behavior in Eq. (8.3.54) and
the blob behavior. As we mentioned above, the blob behavior only occurs for T
significantly greater than $qT_0$. The simplest way to identify the crossover point is when
the new estimate of the complexity becomes lower than our previous value. The
second point is that we have chosen to adjust the constant term added to the logarithm
so that when $L = \sigma$ the complexity matches that given by Eq. (8.3.55), which describes
the behavior when L becomes large. Thus the limit on Eq. (8.3.55) should be
generalized to $L > \sigma$. This minor adjustment enables the complexity to be continuous despite
our rough approximations, and does not change any of the conclusions.

[Plot: $C(L)$ versus $L^d$, curves (1) and (2).]
Figure 8.3.3 Plot of the complexity of a nonuniform gas (Eq. (8.3.37)), for two cases. The
first (1) has a correlation in its nonuniformity at a scale $L_0$ and the second (2) at a scale
$L_1 > L_0$. The magnitude of the local deviations in the density is the same in the two cases.
The second case has a lower complexity at smaller scales but a higher complexity at the larger
scales. Because the complexity decreases rapidly with scale, to show the effects on a linear
scale $L_1$ was taken to be only $\sqrt[3]{10}\,L_0$, and the horizontal axis is in units of $L^3$ measured
in units of $L_0^3$. Eq. (8.3.39) would give similar results but the complexity would decay still
more rapidly.

We can see from our results (Fig. 8.3.4) how varying q affects the complexity.
Increasing q decreases the complexity at the scale of a sheep, $C(L,T) \propto 1/q$ in
Eq. (8.3.54). However, it increases the complexity at longer scales, $C(L,T) \propto q$ in
Eq. (8.3.55). This is a straightforward consequence of increasing the coherence of the
motion over time. We also see that the complexity at long times decays inversely
proportional to the time but is relatively insensitive to q. The value of q primarily affects
the crossover point to the long time behavior.

We now use two different assumptions to calculate the complexity of the flock. If
the movement of all of the sheep is coherent, then the complexity of the flock for
length scales greater than the size of the flock is the same as the complexity of a sheep
for the same length scales. This is apparent because describing the movement of a
single sheep is the same as describing the entire flock. We now see the significance of
increasing q. Increasing q increases the flock complexity until $qL_0$ reaches $L_1$, where $L_1$
is the size of the flock. Thus we can increase the complexity of the whole at the cost of
reducing the complexity of the components.

If the movements of the sheep are independent of each other, then the flock
displacements (the displacements of its center of mass) are of characteristic size $\sigma/\sqrt{N}$ (see
Eq. (5.2.21)). We might be concerned that the flock will disperse. However, as in our
discussions of polymers in Section 5.2, interactions that would keep the sheep together
need not affect the motion of their center of mass. We could also introduce into our
model a circular reflecting boundary (a moving pen) around the flock, with its
center at the center of mass. Since the motion of the sheep with this boundary does not
require additional information over that without it, the complexity is the same. In
either case, the complexity of flock motion ($L > L_1$) is obtained as:

$$C(L,T) = 2N_T\frac{qL_0^2}{NL^2}, \qquad L > \sigma \qquad (8.3.57)$$
This is valid for all L if $\sigma$ is less than $L_1$. If we choose T to be very large, Eq. (8.3.56)
applies, with $\sigma$ replaced by $\sigma/\sqrt{N}$. We see that when the motions of the sheep are
independent, the flock complexity is much lower than before: it decreases inversely with the
number of sheep when $L > \sigma$. Even in this case, however, increasing q increases the
flock complexity. Thus coherence in the behavior of a single sheep in time, or
coherence between different sheep, increases the complexity of the flock. However, the
maximum complexity of the flock is just that of an individual sheep, and this arises
only for coherent behavior when all movements are visible on the scale of the flock.
Any movements of an individual sheep that are smaller than the scale of the flock
disappear on the scale of the flock. Thus even for coherent motion, in general the flock
complexity is smaller than the complexity of a sheep.

[Plot: $C(L)$ versus L for q = 50 and q = 100, each at T = 1 and T = 500.]
Figure 8.3.4 The complexity profile is plotted for a model of the movement of sheep as part
of a flock. Increasing the distance a sheep moves in a straight line (coherence of motion in
time), q, decreases the complexity at small length scales and increases the complexity at large
length scales. Solid lines and dashed lines show the complexity profile as a function of length
scale for a time scale T = 1 and T = 500 respectively.

This example illustrates the effect of coherent behavior. However, we see that
even with coherent motion the complexity of a flock at its scale cannot be larger than
the complexity of the sheep at its own scale. This is a problem for us, because our
study of complex systems is focused upon systems whose complexity is larger than
that of their components. Without this possibility, there would be no complex systems. To
obtain a higher complexity of the whole we must modify this model. We must assume
more generally that the motion of a sheep is describable using a set of patterns of
behavior. Coherent motion of sheep still leads to a similar (or lower) complexity. To
increase the complexity, the motion of the flock must have more complex patterns of
motion. In order to achieve such patterns, the motions of the individual sheep must
be neither independent nor coherent; they must be correlated motions that
combine patterns of sheep motion into the more complex patterns of flock motion. This
is possible only if there are interactions between them, which have not been included
here. It should now be clear that the objective of learning how the complexity of a
system is related to the complexity of its components is central to our study of complex
systems.
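
The leading-order sheep and flock results, Eqs. (8.3.54) to (8.3.57), can be collected into a small numerical sketch. It assumes that the logarithm is taken base 2 (since the complexity is counted in bits), that $N_T$ is passed in as `n_t`, and that $L_0 = T_0 = 1$ by default; the function names are hypothetical and only the regimes stated in the text are covered.

```python
import math


def sheep_complexity(L, T, q, n_t, L0=1.0, T0=1.0):
    """Leading-order complexity (bits) of one sheep, Eqs. (8.3.54)-(8.3.56)."""
    sigma0 = q * L0            # length of one straight segment
    segment_time = q * T0      # duration of one straight segment
    if T <= segment_time:
        if L <= sigma0:
            return 2 * n_t / q                          # Eq. (8.3.54)
        return 2 * (n_t / q) * (sigma0 / L) ** 2        # Eq. (8.3.55)
    sigma = sigma0 * math.sqrt(T / segment_time)        # spread of the blob random walk
    if L < sigma:
        blob = (segment_time / T) * (1 + math.log2(sigma / L))
        return 2 * (n_t / q) * min(1.0, blob)           # Eq. (8.3.56)
    return 2 * (n_t / q) * (sigma0 / L) ** 2            # Eq. (8.3.55) extended to L > sigma


def flock_complexity_independent(L, q, n_t, n_sheep, L0=1.0):
    """Eq. (8.3.57): flock of n_sheep independent sheep, for L above the flock size and sigma."""
    return 2 * n_t * q * L0 ** 2 / (n_sheep * L ** 2)


n_t = 1000
for q in (50, 100):
    for T in (1, 500):
        profile = [round(sheep_complexity(L, T, q, n_t), 3) for L in (1, 50, 100, 200)]
        print("q =", q, "T =", T, profile)
```

For coherent motion the flock profile at length scales above the flock size is simply `sheep_complexity` itself, so only the independent case needs a separate function here.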
Question 8.3.6 Throughout much of this book our working definition
of complex systems or complex organisms, as articulated in Section 1.3
and developed further in Chapter 2, was that a complex system has a
behavior that is dependent on all of its parts. In particular, it is impossible to
take part of a complex organism away without affecting the behavior of the
whole and the behavior of the part. How is this definition related to the
definition of complexity articulated in this section?
Solution 8.3.6 Our quantitative concept of complexity is a measure of the
information necessary to describe the system behavior on its own length
scale. If the system behavior is complex, then it must require many
parameters to describe. These parameters are related to the description of the
system on a smaller length scale, where the parts of the system are manifest
because we can distinguish the description of one part from another. To do
this we limit $P_{L,T}(n(x,t))$ to the domain of the part. The behavior of a
system is thus related to the behavior of the parts. The more these are relevant
to the system behavior, the greater is the system complexity. The
information that describes the system behavior must be relevant on every smaller
length scale. Thus, we have a direct relationship between the definition of a
complex system in terms of parts and the definition in terms of
information. Ultimately, the information necessary to describe the system behavior
is determined by the microscopic description of atomic positions and
motions. The more complex a system is, the more its behavior depends on
smaller scale components.
Question 8.3.7 When we defined interdependence we did not consider
the dependence of an animal on air as a relevant example. Explain.
Solution 8.3.7 We can now recognize that the use of information as a
characterization of behavior enables us to distinguish various forms of
dependency. In particular, we see that the dependence of an animal on air is
simple, since the necessary properties of air are simple to describe. Thus,
the degree of interdependence of two systems should be measured as the
amount of information necessary to replace one in the description of the
other.
8.3.6 Behavioral complexity
Our ability to describe a system arises from measurements or observations of its
behavior. The use of system descriptions to define system complexity does not directly
take this into account. The complexity profile brought us closer by acknowledging the
observer in the space and time scale of the description. By acknowledging the scale of
observation, we obtained a mechanism for distinguishing complex systems from
equilibrium systems, and a systematic method for characterizing the complexity of a
system. There is another approach to reaching the complexity profile that
incorporates the observer and system relationship in a more satisfactory manner. It also
enables us to consider directly the interaction of the system with its environment, which
was not included previously. To introduce the new approach, we return to the
underpinning of descriptive complexity and present the concept of behavioral complexity.
In Shannon’s approach to the study of information in communication systems,
there were two quantities of fundamental interest. The first was the information
content of an individual message, and the second was the average information provided
by a particular source. The discussion of algorithmic complexity was based on a
consideration of the information provided by a particular message, specifically how
much it could be compressed. This carried over into our discussion of physical
systems when we introduced the microscopic complexity of a system as the information
contained in a particular microscopic realization of the system. When all messages, or
all system states, have the same probability, then the information in the particular
message is the same as the average information, and we can write:

$$I(\{x,p\}\,|\,(U,N,V)) = -\log P(\{x,p\}) = -\sum_{\{x,p\}} P(\{x,p\})\log(P(\{x,p\})) \qquad (8.3.58)$$

The expression on the right, however, has a different purpose. It is a quantity that
characterizes the ensemble rather than the individual microstate. It is a
characterization of the source rather than of any particular message.

We can pursue this line of reasoning by considering more carefully how we might
characterize the source of the information, rather than the messages. One way to
characterize the source is to determine the average amount of information in a message.
However, if we want to describe the source to someone, the most essential
information is to give a description of the kinds of messages that will be received, that is, the
ensemble of possible messages. Thus to characterize the source we need a description of
the probability of each kind of message. How much information do we need to
describe these probabilities? We call this the behavioral complexity of the source.

A few examples in the context of a source of messages will serve to illustrate this
concept. Any description of a source must assume a language that is to be used. We
assume that the language consists of a list of characters or messages that can be
received from the source, along with their probabilities. A delimiter (:) is used to
separate the messages from their probability. For convenience, we will write probabilities
in decimal notation. A second delimiter (,) is used to separate different members of
the list. A source that gives zeros and ones at random with equal probability would be
described by {1:0.5,0:0.5}. It is convenient to include the length of a message in our
description of the source. Thus we might describe a source with length N = 1000
character messages, each character zero or one with equal probability, as
{1000(1:0.5,0:0.5)}. The message complexity of this source would be given by N, the length of a
message. However, the behavioral complexity is given by (in this language): two
decimal digits, two characters (1, 0), the number representing N (requiring log(N)
characters) and several delimiters. We could also specify an ASCII language source by a
table of this kind that would consist of 256 elements and the probabilities of their
occurrence in some database. We see that the behavioral complexity is quite distinct
from the complexity of the messages provided by a source. In particular, in the above
example it can be larger, if N = 1, or it can be much smaller, if N is large.
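
A minimal sketch of this comparison follows. It assumes that the source description is stored at 8 bits per character, which is a choice made here only to put both quantities in the same units, and `describe_binary_source` is a hypothetical helper that writes the description in the list language used above.

```python
def describe_binary_source(n_chars, p_one=0.5):
    """Write the source in the list language of the text: {N(1:p,0:1-p)}."""
    return "{%d(1:%s,0:%s)}" % (n_chars, p_one, 1 - p_one)


BITS_PER_CHARACTER = 8  # assumption: one byte per character of the description

for n_chars in (1, 1000, 10 ** 6):
    description = describe_binary_source(n_chars)
    behavioral_bits = BITS_PER_CHARACTER * len(description)
    message_bits = n_chars  # one bit per equiprobable binary character
    print(description, behavioral_bits, message_bits)
```

The description grows only through the digits of N, so its length increases logarithmically while the message length increases linearly.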
This definition of the behavioral complexity of a source runs into a minor
problem, because the probabilities are real numbers and would generally require arbitrary
numbers of digits to describe. To overcome this problem, there must be a convention
assumed about the limit of precision that is desired in describing the source. In
principle, this precision is related to the number of messages that might be received. This
convention could be part of the language, or could be defined by the specification
itself. The description of the source can also be compressed using the principles of
algorithmic complexity.
As we found above, the behavioral complexity can be much smaller than the
information complexity of a particular message. If the source provides many random
digits, the complexity of the message is high but the complexity of the source is low
because we can characterize it simply as a source of random numbers. However, if the
probability of each message must be independently specified, the behavioral
complexity of a source is much larger than the information content of a particular
message. If a particular message requires N bits of information, then the number of
possible messages is $2^N$. Listing all of the possible messages requires $N2^N$ bits, and
specifying each probability with Q bits would give us a total of $(N+Q)2^N$ bits to
describe the source. This could be reduced if the messages are placed in an agreed-upon
order; then the number of bits is $Q2^N$. This is still exponentially larger than the
information in a particular message. Thus, the complexity of an arbitrary source of
messages of a particular length is much larger than the complexity of the messages it sends.
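
As a small worked example of these counts, the sketch below evaluates $(N+Q)2^N$ and $Q2^N$ for illustrative values N = 10 and Q = 8. These values and the function name `source_description_bits` are assumptions chosen only for illustration.

```python
def source_description_bits(n, q, agreed_order=False):
    """Bits needed to list a q-bit probability for each of the 2**n messages of n bits."""
    n_messages = 2 ** n
    bits_per_entry = q if agreed_order else n + q
    return bits_per_entry * n_messages


n, q = 10, 8
print(source_description_bits(n, q))                     # (N + Q) 2^N = 18 * 1024 = 18432
print(source_description_bits(n, q, agreed_order=True))  # Q 2^N = 8 * 1024 = 8192
print(n)                                                 # bits in one particular message
```

Even with an agreed-upon order, the description of the source here is hundreds of times longer than any single message.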
We are interested in the behavioral complexity when our objective is to use the
messages that we receive to understand the source, rather than to make use of the
information itself. Behavioral complexity becomes particularly useful when it is smaller
than the complexity of a message, because it enables us to anticipate or predict the
behavior of the source.
We now apply these thoughts about the source as the system of interest, rather
than the message as the system of interest, to a discussion of the properties of
physical systems. To make the connection between source and system, we consider an
observer of a physical system who performs a number of measurements. We might
imagine the measurements to consist of subjecting the system to light at various
frequencies and measuring their scattering and reflection (looking at the system),
observations of animals in the wild or in captivity, or physical probes of the system. We
consider each measurement to be a message from the system to the observer. We must,
however, take note that any measurement consists of two parts, the conditions or
environment in which the observation was performed and the behavior of the system
under these conditions. We write any observation as a pair (e,a), where e represents
the environment and a represents a measurement of system properties (action)
under the circumstances of the environment e. The observer, after performing a number
of measurements, writes a description of the observations. This description
characterizes the system. It captures the properties of the list of measurements, rather than
of one particular measurement. It may or may not explicitly contain the information
of each measurement. Alternatively, it may assign probabilities to a particular
measurement. We would like to define the behavioral complexity as the amount of
information contained in the observer’s description. However, we must be careful how we
do this because of the presence of the environmental description e.
In order to clarify this point, and to make contact between behavioral
complexity and our previous discussion of descriptive complexity, we first consider the
physical system of interest to be essentially isolated. Then the environmental description
is irrelevant, and an observation consists only of the system measurement a. The list
of measurements is the set {a}. In this case it is relatively easy to see that the
behavioral complexity of a physical system is its descriptive complexity: the set of all
measurements characterizes completely the state of the system.

If the entire set of measurements is performed at a single instant, and has
arbitrary precision, then the behavioral complexity is the microstate complexity of the
system. The result of any measurement can be obtained from a description of the
microstate, and the set of possible measurements determines the microstate.
For a set of measurements performed over time on an equilibrium system, the
behavioral complexity is the ensemble complexity, the number of parameters
necessary to specify its ensemble. A particular message is a measurement of the system
properties, which in principle might be detailed enough to determine the
instantaneous positions and momenta of all of the particles. However, the list of
measurements is determined by the ensemble of states the system might have. As in
Section 8.3.1, we conclude that the complexity of an equilibrium system is the
complexity of describing its ensemble, that is, specifying (U,N,V) and other parameters like
magnetization that result from the breaking of ergodicity. For a glass, the ensemble
information is the information in the frozen coordinates previously defined as the
complexity. More generally, for a set of measurements performed over an interval of
time T (or at one instant but with time determination error T) and with spatial
position determination errors given by L, we recover the complexity profile.
We now return to consider a system that is not isolated but subject to an
environmental influence so that an observation consists of the pair (e,a) (Fig. 8.3.5). The
complexity of describing such messages also contains the complexity of the
environment e. Does this mean that our system description must include its environment and
that the complexity of the system is dependent on the complexity of the environment?
Complex systems or simple systems interact with and respond to the environment in
which they are found. Since the system response a is dependent on the environment
e, there is no doubt that the complexity of a is dependent on the complexity of e. Three
examples illustrate how the environmental influence is important. The tail of a dog
has a particular motion that can be described, and the complexity can be
characterized. However, we may want to attribute much of this complexity to the rest of the dog
rather than to the tail. Similarly, the motion of a particle suspended in a liquid follows
Brownian motion, the description of which might be better attributed to the liquid
than to the particle. Clearer yet is the example of the behavior of a basketball during
a basketball game. These examples generalize to the consideration of any system,
because measuring the properties of a system in an environment may cause us to be
measuring the influence of the environment, rather than the system. The observer
must describe the system behavior as a response to a particular environment, rather
than just the behavior itself. Thus, we do not characterize the system by a list of
actions {a} but rather by the list of pairs {(e,a)}, where our concern is to describe f, the
functional mapping $a = f(e)$ from the environment e to the response a. Once we
realize this, we can again affirm that a full microscopic description of the physical system
is enough to give all system responses. The point is that the complexity of a system
should not include the complexity of the influence upon it, but just the complexity of
its response. This response is a property of the system and is determined by a
complete microscopic description. Conversely, a full description of behavior subject to all
possible environments would require complete microscopic information.
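
The distinction between describing the actions {a} and describing the mapping f can be illustrated with a deliberately simple sketch, loosely in the spirit of the suspended particle. It assumes a hypothetical system `passive_particle` that moves by exactly the kick it receives, and it uses the compressed size from `zlib` as a rough stand-in for the information in the action record; this is only an upper bound on algorithmic complexity, not the measure defined in the text.

```python
import random
import zlib


def passive_particle(kick):
    """Hypothetical system: the particle moves by exactly the kick it receives."""
    return kick


rng = random.Random(1)
kicks = [rng.getrandbits(8) for _ in range(4096)]          # environment e, one byte each
observations = [(e, passive_particle(e)) for e in kicks]   # list of pairs (e, a)

actions = bytes(a for _, a in observations)
compressed = zlib.compress(actions, 9)
print(len(actions), len(compressed))   # random actions barely compress, if at all

# Describing f instead: every recorded response equals its environment.
print(all(a == e for e, a in observations))
```

The record of actions inherits its length from the environment, while the mapping that characterizes this system is summarized by a single rule.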
However, within a range of environments and with a desired degree of precision
(spatial and temporal scale) it is possible to provide less information and still describe
the behavior. We consider the ensemble of messages (measurements) to have possible
times of observation over a range of times given by T and errors in position
determination L. Describing the ensemble of responses gives us the behavioral complexity
profile $C_b(L,T)$.
When the influence of the environment is not important, $C(L,T)$ and $C_b(L,T)$
are the same. When the environment matters, it is also important to characterize the
information that is relevant about the environment. This is related to the problem of
prediction, because predicting the system behavior in the future requires information
about the environment. As we have defined it, the descriptive complexity is the
information necessary to predict the behavior of the system over the time interval $t_2 - t_1$.
We can characterize the environmental influence by generalizing Eq. (8.3.47) to
include a term that describes the rate of information transfer from the environment to
the system:

$$\begin{aligned} C(L,T) &= C_b(L,T) + N_T C_e(L,T) \\ C_b(L,T) &= C_b(L) + C_t(L,T) + N_T k \sum_{i:h_i>0} h_i \end{aligned} \qquad (8.3.59)$$

where $C_e(L)/k\ln(2)$ is the information about the environment necessary to predict the
state of the system at the next time step, and $C_b(L)$ is the behavioral complexity at one
time interval. Because the system itself is finite, the amount of information about the
universe that is relevant to the system behavior in any interval of time must also be
finite. We note that because the system affects the environment, which then affects the
system, Eq. (8.3.59) as written may count information more than once. Thus, this
expression as written is an upper bound on the complexity. We noted this point also
with respect to the Lyapunov exponents after Eq. (8.3.47).

[Diagram labels: System's Environment, System, Observer; arrows e, a (message/action), e.]
Figure 8.3.5 The observation of system behavior involves measurements both of the system’s
environment, e, and the system’s actions, a, in response to this environment. Thus we should
characterize a system as a function, a = f(e), where the function f describes its actions in
response to its environment. It is generally simpler to describe a model for the system
structure, which is also a model of f, rather than a list of all of its environment-action (e,a) pairs.

This use of behavior/response rather than a description to characterize a system
is related to the use of response functions in physics, or input/output relationships to
describe artificial systems. The response function can (in principle) be completely
derived from the microscopic description of a system. It is more directly relevant to the
system behavior in response to environmental influences, and thus is essential for
direct comparison with experimental results.

Behavioral complexity suggests that we should consider the system behavior as
represented by a function $a = f(e)$. The input to the function is a description of the
environment; the output is the response or action. There is a difficulty with this
approach in that the complexity of functions is generically much larger than that of the
system itself. From the discussion in Section 8.2.3 we know that the description of a
function would require an amount of information given by $C_f = C_a 2^{C_e}$, where $C_e$ is the
environmental complexity, and $C_a$ is the complexity of the action. Because the
environmental influence leads to an exponentially large complexity, it is clear that often
the most compact description of the system behavior will give its structure rather
than its response to all inputs. Then, in principle, the response can be derived from
the structure. This also implies that the behavior of physical systems under different
environments cannot be independent. We note that these conclusions must also
apply to human beings as complex systems that respond to their environment (see
Question 8.3.8).

Question 8.3.8 Discuss the following statements with respect to human
beings as complex systems: “The most compact description of the
system behavior will give its structure rather than its response to all inputs,” and
“This implies that the behavior of physical systems under different
environments cannot be independent.”
Solution 8.3.8 The first statement is relevant to the discussion of
behaviorism as an approach to psychology (see Section 3.2.8). It says that the idea of
describing human behavior by cataloging reactions to environmental stimuli
is ultimately an inefficient approach. It is more effective to use such
measurements to construct a model for the internal functioning of the individual and
use this model to describe the measured responses. The model description is
much more concise than the description of all possible responses.

Moreover, from the second statement we know that the model can
describe the responses to circumstances that have not been measured. This also
means that the use of such models may be effective in predicting the
behavior of an individual. Specifically, the reactions of a human being are not
independent of past reactions to other circumstances. A model that
incorporates the previous behaviors may have some ability to predict the behavior
in new circumstances. This is part of what we do when we interact with other
individuals: we construct models that represent their behavior and then
anticipate how they will react to new circumstances.

The coupling between the reaction of a human being under one
circumstance and the reaction under a different circumstance is also relevant to
our understanding of human limitations. Optimizing the response through
adaptation to a set of environments according to some goal is a process that
is limited in its effectiveness due to the coupling between responses to
different circumstances. An individual who is effective in some circumstances
may have qualities that lead to ineffective behavior under other
circumstances. We will discuss this in Chapter 9 in the context of considering the
specialization of human beings in society. This point is also applicable more
generally to living organisms and their ability to consume resources and
avoid predators as discussed in Chapter 6. Increasing complexity enables an
organism to be more effective, but the effectiveness under a variety of
circumstances is limited by the interdependence of responses. This is relevant
to the observation that living organisms generally consume limited types of
resources and live in particular ecological niches.
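
The contrast between cataloging responses and describing a model can be made concrete with a minimal sketch of the estimate $C_f = C_a 2^{C_e}$ quoted from Section 8.2.3. It assumes an environment described by $C_e$ bits and a response of $C_a$ bits; the rule `majority` is a hypothetical compact model chosen only for illustration, and all function names are assumptions.

```python
from itertools import product


def lookup_table_bits(c_e, c_a):
    """Size of a table listing a c_a-bit response for each of the 2**c_e environments."""
    return c_a * 2 ** c_e


def majority(env):
    """Hypothetical compact model: respond 1 when most environment bits are 1."""
    return int(2 * sum(env) > len(env))


def tabulate(f, c_e):
    """Catalog the response to every environment of c_e bits."""
    return {env: f(env) for env in product((0, 1), repeat=c_e)}


table = tabulate(majority, 10)
print(len(table), lookup_table_bits(10, 1))   # 1024 entries, 1024 bits
print(lookup_table_bits(20, 8))               # grows exponentially with c_e
```

The one-line rule reproduces every entry of the table, including responses to environments that were never observed, which is the sense in which responses under different environments are not independent.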
8 . 3 . 7 The observer a nd recognit ion
The explicit existence of an obser ver in the definition of behavioral complexity en-
ables us to further consider the role of the observer in the definition of complexity.
What assumptions have we made about the propert ies of the obser ver? One of the as-
sumptions that we have made is that the obser ver is more complex than the syst em.
What happens if the complexity of the syst em is greater than the complexity of the
obser ver? The complexity of an obser ver is the number of bits that may be used to de-
scribe the obser ver. If the obser ver is describ ed by fewer bits than are needed to d e-
scribe the syst em, then the obser ver will be unable to contain the description of the
system that is being obser ved. In this case,the obser ver will constr uct a descript ion of
the system that is simpler than the syst em actually is. There are several possible ways
that the observer may simplify the descript ion of the system. One is to reject the
observation of all but a few kinds of messages. Another is to artificially limit the
length of messages described. A third is to treat complex variability of the source as
random, described by simple probabilities. These simplifications are often made in
our modeling of physical systems.
An inherent problem in discussing behavioral complexity using environmental
influence is that it is never possible to guarantee that the behavior of a system has been
fully characterized. For example, a rock can be described as “just sitting there” if we
want to describe the complexity of its motion under different environments. Of
course, the nature of the environment could be changed so that other behaviors will
be realized. We may, for example, discover that the rock is actually a camouflaged
animal. This is an inherent problem in behavioral complexity: it is never possible to
characterize with certainty the complexity of a system under circumstances that have
not been measured. All such conclusions are extrapolations. Performing such
extrapolations is an essential part of the use of the description of a system. This is a general
problem that applies to quantitative scientific modeling as well as to the use of
experience in general.
Finally, we describe the relevance of recognition to complexity. The first
comment is related to the recognition of sets of numbers introduced briefly in
Section 8.2.3. We introduced there the concept of recognition complexity of a set, which
relies upon a recognizer (a special kind of TM called a predicate that gives a single-bit
output) that can identify the system under discussion. Specifically, when presented
with the system it says, “This is it,” and when presented with any other system it says,
“This is not it.” We define the complexity of a system (or set of systems) as the
complexity of the simplest recognizer of the system (or set of systems). There are some
interesting features of this definition. First, we realize that this definition is well suited to
describing classes of systems. A description or model of a class of systems must
identify common attributes rather than specific behaviors. A second interesting feature is
that the complexity of the recognizer depends on the possible universe of systems that
it can be presented with. For example, the complexity of recognizing cows depends on
whether we allow ourselves to present the recognizer with all domestic animals, all
known biological organisms on earth, all potentially viable biological organisms, or
all possible systems. Naturally, this is an important issue in the field of pattern
recognition, where the complexity of designing a system to recognize a particular pattern
is strongly dependent on the universe of possibilities within which the pattern must
be recognized. We will return to this point later when we consider the properties of
human language in Section 8.4.1.
A different form of complexity related to recognition may be abstracted from the
Turing test of artificial intelligence. This test suggests that we will achieve an artificial
representation of intelligence when it becomes impossible to determine whether we
are interacting with an artificial or actual human being. We can assume that Turing
had in mind only a limited type of interaction between the observer “we” and the
systems being observed, either the real or artificial representation of a human being.
This test, which relies upon an observer to recognize the system, can serve as the
basis for an additional definition of complexity. We determine the minimal possible
complexity of a model (simulated representation) of the system which would be
recognized by a particular observer under particular circumstances as the system. The
complexity of this model we call the substitution complexity. The sensitivity of this
definition to the nature of the observer and the conditions of the observation is
manifest. In some ways this definition, however, is implicit in all of our earlier definitions.
In all cases, the complexity measures the length of a representation of the system.
Ultimately we must determine whether a particular representation of the system is
faithful. The “we” in the previous sentence is some observer that must recognize the
system behavior in the constructed representation.
We conclude this section by reviewing some of the main concepts that were
introduced. We noted the sensitivity of complexity to the spatial and temporal scale
relevant to the description or response. The complexity profile formally takes this into
account. If necessary, we can define the unique complexity of a system to be its
complexity profile evaluated at its own scale. A more complete characterization of the
system uses the entire complexity profile. We found that the mathematical models most
closely associated with complexity, chaos and fractals, were both relevant. The
former described the influence of microscopic information over time. The latter
described the gradual rather than rapid loss of information with spatial and temporal
scale. We also reconciled the notion of information as a measure of system
complexity with the notion of complex systems as composed out of interdependent parts. Our
next objective is to concretize this discussion further by estimating the complexity of
particular systems.
8.4 Complexity Estimation
There are various difficulties associated with obtaining specific values for the
complexity of a particular system. There are both fundamental and practical problems.
Fundamental problems, such as the difficulty in determining whether a representation
is maximally compressed, are important. However, before this is an issue we must first
obtain a representation.
One approach to obtaining the complexity of a system is to construct a
representation. The explicit representation should then be used to make a simulation to
show that the system behavior is reproduced. If it is, then we know that the length of
the representation is an upper bound on the complexity of the system. We can hope,
however, that it will not be necessary to obtain explicit representations in order to
estimate complexities. The objective of this section is to discuss various methods for
estimating the complexity of systems with which we are familiar. These approaches
make use of representations that we cannot simulate; however, they do have
recognizable relationships to the system.
Measuring complexity is an experimental problem. The only reason that we are
able to discuss the complexity of various systems is that we have already made many
measurements of the properties of various systems. We can make use of the existing
information to construct estimates of their complexity. A specific estimation method
is not necessarily useful for all systems.
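The upper-bound argument above can be illustrated with a general-purpose compressor: any representation from which the data can be reproduced exactly bounds its complexity from above, up to the fixed size of the decompressor. The following is a minimal Python sketch under that assumption. The use of zlib and the helper name `compressed_bits` are illustrative choices, not a method taken from the text.

```python
import os
import zlib


def compressed_bits(data: bytes) -> int:
    """Length in bits of a zlib-compressed representation of `data`.

    Because zlib.decompress() reproduces `data` exactly, this length is an
    upper bound on its complexity (ignoring the constant cost of the
    decompressor itself). It says nothing about a lower bound.
    """
    return 8 * len(zlib.compress(data, 9))


# A highly regular string has a representation far shorter than its raw length.
regular = b"ab" * 5000
# Random bytes are essentially incompressible for a generic compressor.
random_bytes = os.urandom(10000)

for label, data in (("regular", regular), ("random", random_bytes)):
    print(label, "raw bits:", 8 * len(data), "upper bound:", compressed_bits(data))
```

A shorter representation may always exist that the compressor fails to find, which is the fundamental difficulty of deciding whether a representation is maximally compressed.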
Our objective in this section is limited to obtaining “ballpark” estimates of the
complexity of systems. This means that our errors will be in the exponent rather
than in the number itself. We would be very happy to have an estimate of complexity
such as $10^{3\pm1}$ or $10^{7\pm2}$. When appropriate, we keep track of half-decades
using factors of three, such as in $3 \times 10^{4}$. These rough estimates will give us a
first impression of the degree of complexity of many of the systems we would like to
understand. They tell us, very roughly, how difficult these systems are to describe. We
will discuss three methods: (1) use of intuition and human language descriptions,
(2) use of a natural representation tied to the system's existence, where the principal
example is the genome of living organisms, and (3) use of component counting. Each
of these methods has flaws that will limit our confidence in the resulting estimates.
However, since we are trying to find rough estimates, we can still take advantage of them.
Consistency of different methods will give us some confidence in our estimates of
complexity.
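The half-decade convention can be made concrete as rounding on a logarithmic scale: a value is reported either as a power of ten or as three times a power of ten, since 3 is close to the square root of 10. A minimal Python sketch follows. The function name `half_decade` and its string output format are assumptions made for illustration.

```python
import math


def half_decade(x: float) -> str:
    """Round a positive number to the nearest half-decade: 10^n or 3x10^n."""
    # Count in steps of half a decade on a log10 scale.
    steps = round(2 * math.log10(x))
    exponent, half = divmod(steps, 2)
    # An odd number of steps means an extra factor of sqrt(10), written as 3.
    return f"3x10^{exponent}" if half else f"10^{exponent}"


for value in (3000, 5000, 90000, 900000):
    print(value, "->", half_decade(value))
```

Rounding in the exponent rather than in the number itself matches the stated goal that errors be confined to the exponent.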
While we will discuss the complexity of various systems, our focus will be on
determining the complexity of a human being. Our final estimate, $10^{10\pm2}$ bits,
will be obtained by combining the results of different estimation techniques in the following
sections. The implications of obtaining an estimate of human complexity will be
discussed in Section 8.4.4. We start, however, by noting that the complexity of a human
being can be bounded by the physical entropy of the collection of atoms from which
he or she is formed. This is roughly the entropy of a similar weight of water, about
$10^{31}$ bits. This is the value of $S / (k \ln 2)$. As usual, we have assumed that there is
nothing associated with a human being except the material of which he or she is formed, and
that this material is described by known physical law. This entropy is an upper bound
to the information necessary to specify the complete human being. The meaning of
this number is that if we take away the person and replace all of the atoms
according to a specification of $10^{31}$ bits of information, we have replaced microscopically
each atom where it was. According to our understanding of physical law, there can be
no discernible difference. We will discuss the implications for artificial intelligence in
Section 8.4.4, where we consider whether a computer could simulate the dynamics of
atoms in order to simulate the behavior of the human being.
The entropy of a human being is much larger than the complexity estimate we
are after, because we are interested in the complexity at a relevant spatial and
temporal scale. In general we consider the complexity of a system at the natural scale defined
in Section 8.3.5, one-tenth the size of the system itself, and the relaxation time of the
behavior on this same length scale. We could also define the complexity by the
observer. For example, the maximum visual sensitivity of a human being is about 1/100
of a second and 0.1 mm. For either case, observing only at this spatial and temporal
scale decreases dramatically the relevance of the microscopic description. The
reduction in information is hard to estimate directly. To estimate the relevant complexity,
we must consider other techniques. However, since most of the information in the
entropy is needed to describe the position of molecules of water undergoing vibrations,
we can guess that the complexity is significantly smaller than the entropy.
8.4.1 Human intuition: language and complexity
The first method for estimation of complexity, the use of human intuition and
language, is the least controlled/scientific method of obtaining an estimate of the
complexity of a system. This approach, in its most basic form, is precisely what was asked
in Question 8.2.1. We ask someone what they believe the complexity of the system is.
It is assumed that the person we ask is somewhat knowledgeable about the system and
also about the problem of describing systems. Even though it appears highly arbitrary,
we should not dismiss this approach too readily because human beings are designed
to understand complex systems. It could be argued that much of our development is
directed toward enabling us to construct predictive models of various parts of the
environment in which we live. The complexity of a system is directly related to the
amount of study we need in order to master or predict the behavior of a system. It is
not accidental that this is the fundamental objective of science: behavior prediction.
We are quite used to using the word “complexity” in a qualitative manner and even in
a comparative fashion: this is more complex or less complex than something else.
What is missing is the quantitative definition. In order for someone to give a
quantitative estimate of the complexity of a system, it is necessary to provide a definition of
complexity that can be readily understood.
One useful and intuitive definition of complexity is the amount of information
necessary to describe the behavior of a system. The information can be quantified in
terms of representations people are familiar with: the amount of text, the number of
pages, or the number of books. This can be sufficient to cause a person to build a rough
mental model of the system description, which is much more sophisticated than
many explicit representations that might be constructed. There is an inherent
limitation in this approach, mentioned more generally above: a human being cannot
directly estimate the complexity of an organism of similar or greater complexity than a
human being. In particular, we cannot use this approach directly to estimate the
complexity of human beings. Thus we will focus on simpler animals first. For example, we
could ask the question in the following way: How much text is necessary to describe
the behavior of a frog? We might emphasize for clarification that we are not interested
in comparative frogology, or molecular frogology. We are just interested in a
description of the behavior of a frog.
To gain additional confidence in this approach, we may go to the library and find
descriptions that are provided in books. Superficially, we find that there are entire
books devoted to a particular type of insect (mosquito, ant, butterfly), as there are
books devoted to the tiger or the ape. However, there is a qualitative difference
between these books. The books on insects are devoted to comparative descriptions,
where various types of, e.g., mosquitoes from around the world, their physiology,
and/or their evolutionary history are described. Tens to hundreds of types are
compared in a single book. Exceptional behaviors or examples are highlighted. The
amount of text devoted to the behavior of a particular type of mosquito could be
readily contained in less than a single chapter. On the other hand, a book devoted to
tigers may describe only behavior (e.g., not physiology), and one devoted to apes
would describe only a particular individual in a manner that is limited to only part of
its behaviors.
Does the difference in texts describing insects and tigers reflect the social
priorities of human beings? This appears to be difficult to support. The mosquito is much
more relevant to the well-being of human beings than the tiger. Mosquitoes are
easier to study in captivity and are more readily available in the wild. There are films that
enable us to observe mosquito behavior at its own scale rather than at our usual
larger scale. Despite such films, there is no book-length description of the behavior of
a mosquito. This is true despite the importance of knowledge of its behavior to
prevention of various diseases. Even if there is some degree of subjectivity to the
complexity estimates obtained from the lengths of descriptions found in books, the use of
existing books is a reasonable first attempt to obtain complexity estimates from the
information that has been compiled by human beings. We can also argue that when
there is greater experience with complexity and complexity estimation, our ability to
use intuition or existing texts will improve, and these will become important tools in
complexity estimation.
Before applying this methodology, however, we should understand more
carefully the basic relationship of language to complexity. We have already discussed in
Section 1.8 the information in a string of English characters. A first estimate of 4.8 bits
per character could be based upon the existence of 26 letters and 1 space. In
Question 1.8.12, the best estimate obtained was 3.3 bits per character using a Markov
chain model that included correlations between adjacent characters. To obtain an
even better estimate, we need to have a model that includes longer-range correlations
between characters. The most reliable estimates have been obtained by asking people
to guess the next character in an English text. It is assumed that people have a highly
sophisticated model for the structure of English and that the individual has no
specific knowledge of the text. The guesses were used to establish bounds on the
information content. We can summarize these bounds as 0.9±0.3 bits/character. For our
present discussion, the difference between high and low bounds (a factor of 2) is not
significant. For convenience we will use 1 bit/character for our conversion factor. For
larger quantities of text, this corresponds to values given in Table 8.4.1.
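The arithmetic behind these conversions is short enough to check directly. The sketch below, in Python, computes the naive 27-symbol estimate and then scales the adopted 1 bit/character factor up to pages, chapters, and books using the sizes listed in Table 8.4.1. The constant names are illustrative assumptions.

```python
import math

# Naive estimate: 26 letters plus a space, all equally likely and independent.
naive_bits_per_char = math.log2(27)   # about 4.75, quoted as 4.8 in the text

BITS_PER_CHAR = 1          # rounded conversion factor adopted in the text
CHARS_PER_PAGE = 3000      # sizes as listed in Table 8.4.1
PAGES_PER_CHAPTER = 30
CHAPTERS_PER_BOOK = 10

page_bits = BITS_PER_CHAR * CHARS_PER_PAGE        # 3 x 10^3
chapter_bits = page_bits * PAGES_PER_CHAPTER      # 9 x 10^4, rounded to 10^5
book_bits = chapter_bits * CHAPTERS_PER_BOOK      # 9 x 10^5, rounded to 10^6

print(f"naive estimate: {naive_bits_per_char:.2f} bits/char")
print(f"page: {page_bits}  chapter: {chapter_bits}  book: {book_bits} bits")
```

The products 9 x 10^4 and 9 x 10^5 are rounded to the nearest power of ten, consistent with the ballpark precision used throughout this section.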
Our estimate of information in text has assumed a strictly narrative English text.
We should also be concerned about figures that accompany descriptive materials. Does
the conventional wisdom of “a picture is worth a thousand words” make sense? We can
consider this both from the point of view of direct compression of the picture, and the
possibility of replacing the figure by descriptive text. A thousand words corresponds
to $5 \times 10^{3}$ characters or bits, about two pages of text. Descriptive figures such as
graphs or diagrams often consist of a few lines that can be concisely described using a
formula and would have a smaller complexity. Photographs are formed of highly correlated
graphical information that can be compressed. In a black and white photograph,
$5 \times 10^{3}$ bits would correspond to a 70 × 70 grid of completely independent pixels. If we
recall that we are not interested in small details, this seems reasonable as an upper bound.
Moreover, the text that accompanies a figure generally describes its essential content.
Thus when we ask the key question, whether two pages of text would be sufficient to
describe a typical figure and replace its function in the text, this seems a somewhat
generous but not entirely unreasonable value. A figure typically occupies half of a page
that would be otherwise occupied by text. Thus, for a highly illustrated book, on
average containing one figure and one-half page of text on each page, our estimate of the
information content of the book would increase from $10^{6}$ bits by a factor of 2.5 to
roughly $3 \times 10^{6}$ bits. If there is one picture on every two pages, the information
content of the book would be doubled rather than tripled. While it is not really essential
for our level of precision, it seems reasonable to adopt the convention that estimates
using descriptions of behavioral complexity include figures. We will do so by
increasing the previous values by a factor of 3 (Table 8.4.1). This will not change any of the
conclusions.

Amount of text           Information in text      Text with figures
1 char                   1 bit                    -
1 page ≈ 3000 char       $3 \times 10^{3}$ bit    $10^{4}$ bit
1 chapter ≈ 30 pages     $10^{5}$ bit             $3 \times 10^{5}$ bit
1 book ≈ 10 chapters     $10^{6}$ bit             $3 \times 10^{6}$ bit

Table 8.4.1 Information estimates for straight English text and illustrated text.

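The figure estimate above combines a few small multiplications, which the following Python sketch reproduces. It assumes 5 characters per word (implied by the text's conversion of a thousand words to $5 \times 10^{3}$ characters) and treats a figure as worth two pages of text, the rounded value used in the text. All constant names are illustrative.

```python
import math

WORDS_PER_FIGURE = 1000     # "a picture is worth a thousand words"
CHARS_PER_WORD = 5          # implied by 1000 words = 5 x 10^3 characters
BITS_PER_CHAR = 1           # conversion factor adopted earlier
CHARS_PER_PAGE = 3000
FIGURE_IN_PAGES = 2         # rounded page-equivalent of one figure, as in the text

figure_bits = WORDS_PER_FIGURE * CHARS_PER_WORD * BITS_PER_CHAR
figure_pages = figure_bits / (BITS_PER_CHAR * CHARS_PER_PAGE)   # about 1.7 pages

# Side of the largest square grid of independent 1-bit pixels within figure_bits.
grid_side = math.isqrt(figure_bits)

# Every page holds half a page of text plus one figure.
full_factor = 0.5 + FIGURE_IN_PAGES
# One figure per two pages: one full text page and one illustrated page, averaged.
half_factor = (1.0 + (0.5 + FIGURE_IN_PAGES)) / 2

print(figure_bits, round(figure_pages, 2), grid_side, full_factor, half_factor)
```

The two factors, 2.5 and 1.75, correspond to the "tripled" and "doubled" cases in the text once rounded to half-decade precision.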
There is another aspect of the relationship of language to complexity. A language
uses individual words (like “frog”) to represent complex phenomena or systems (like
the physical system we call a frog). The complexity of the word “frog” is not the same
as the complexity of the frog. Why is this possible? According to our discussion of
algorithmic complexity, the smallest possible representation of a complex system has a
length in bits which is equal to the system complexity. Here we have an example of a
system (frog) whose representation “frog” is manifestly smaller than its complexity.
The resolution of this puzzle is through the concept of recognition complexity
discussed in Section 8.3.7. A word is a member of an ensemble of words, and the
systems that are described by these words are an ensemble of systems. It is only necessary
that the ensemble of words be matched to the ensemble of systems described by the
words, not the whole ensemble of possible systems. Thus, the complexity of a word is
not related to the complexity of the system, but rather to the complexity of specifying
the system: the logarithm of the number of systems that are part of the shared
experience of the individuals who are communicating. This is the central point of
recognition complexity. For a human being with experience and memory of only a limited
number of the set of all complex systems, to describe a system one must identify it
only in comparison with the systems in memory, not with those possible in principle.
Another way to think about this is to consider a human being as analogous to a
special UTM with a set of short representations that the UTM can expand to a
specific limited subset of possible long descriptions. For example, having memorized a
play by Shakespeare, it is only necessary to invoke the name to retrieve the whole play.
This is, indeed, the essence of naming: a name is a short reference to a complex
system. All words are names of more complex entities.
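The point that a name costs only the logarithm of the number of shared systems can be shown with a one-line calculation. In the Python sketch below, the shared-experience size of 65,536 systems is a purely hypothetical number chosen for illustration, and `name_bits` is an illustrative helper; the frog value is the one listed in Table 8.4.2.

```python
import math


def name_bits(num_shared_systems: int) -> float:
    """Bits needed to single out one system among those both parties know."""
    return math.log2(num_shared_systems)


SHARED_SYSTEMS = 65536      # hypothetical size of the shared experience
FROG_BEHAVIOR_BITS = 3e5    # behavioral complexity of a frog from Table 8.4.2

print("bits to name a frog:    ", name_bits(SHARED_SYSTEMS))
print("bits to describe a frog:", FROG_BEHAVIOR_BITS)
```

Under these assumptions the name is shorter than the description by more than four orders of magnitude, which is the compression that naming provides.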
In this way, language provides a systematic mechanism for compression of
information. This implies that we should not use the length of a word to estimate the
complexity of a system that it refers to. Does this also invalidate the use of human language
to obtain complexity estimates? On one hand, when we are asked to describe the
behavior of a frog, we assume that we must describe it without reference to the name
itself. “It behaves like a frog” is not a sufficient description. There is a presumption that
a description of behavior is made to someone without specific knowledge. An
estimate of the complexity of a frog would be much higher than the complexity of the
word “frog.” On the other hand, the words that would be used to describe a frog also
refer to complex entities or actions. Consistency in different estimates of the amount
of text necessary to describe a frog might arise from the use of a common language
and experience. We could expand the description further by requiring that a person
explain not only the behavior of the frog, but also the meaning of each of the words
used to describe the behavior of the frog. At this point, however, it is more
constructive to keep in mind the subtle relationship between language and complexity as part
of our uncertainty, and take the given estimates at face value. Ultimately, the
complexity of a system is defined by the condition that all possible (in principle)
behaviors of the same complexity could be described using the same length of text. We
accept the possibility that language-based estimates of complexity of biological
organisms may be systematically too small because they are common and familiar. We
may nevertheless have relative complexities estimated correctly.
Finally, we can argue that when we estimate the complexity of systems that
approach the complexity of a human being, the estimation problem becomes less
severe. This follows from our discussion of universality of complexity given in
Section 8.2.2: the more complex a system is, the less relevant specific
knowledge is, and the more universal are estimates of complexity. Nevertheless,
ultimately we will conclude that the inherent compression in use of language for
describing familiar complex systems is the greatest contributor to uncertainty in
complexity estimates.
There is another approach to the use of human intuition and language in
estimating complexity. This is by reference to computer languages. For someone familiar
with computer simulation, we can ask for the length of the computer program that
can simulate the behavior of the system; more specifically, the length of the program
that can simulate a frog. Computer languages are generally not very high in
information content, because there are a few commands and variables that are used
throughout the program. Thus we might estimate the complexity of a program not by
characters, but by program lines at several bits per program line. Consistent with the
definition of algorithmic complexity, the estimate of system complexity should also
include the complexity of the compiler and of the computer operating system and
hardware. Compilers and operating systems are much more complex than many
programs by themselves. We can bypass this problem by considering instead the size of
the execution module, after application of the compiler.
There are other problems with the use of natural or artificial language
descriptions, including:
1. Overestimation due to a lack of knowledge of possible representations. This
problem is related to the difficulty of determining the compressibility of
information. The assumption of a particular length of text presumes a kind of
representation. This choice of representation may not be the most compact. This may
be due to the form of the representation, specifically English text. Alternatively,
the assumption may be in the conceptual (semantic) framework. An example is
the complexity of the motion of the planets in the Ptolemaic (earth-centered)
representation compared to the Copernican (sun-centered) representation.
Ptolemy would give a larger complexity estimate than Copernicus because the
Ptolemaic system requires a much longer description, which is the reason the
Copernican system is accepted as “true” today.
2. Underestimation due to lack of knowledge of the full behavior of the system. If
an individual is familiar with the behavior of a system only under limited
circumstances, the presumption that this limited knowledge is complete will lead to
a complexity estimate that is too low. Alternatively, lack of knowledge may also
result in estimates that are too high if the individual extrapolates the missing
knowledge from more complex systems.
3. Difficulty with counting. Large numbers are generally difficult for people to
imagine or estimate. This is the advantage of identifying numbers with length of
text, which is generally a more familiar quantity.
With all of these limitations in mind, what are some of the estimates that we have
obtained? Table 8.4.2 was constructed using various books. The lengths of linguistic
descriptions of the behavior of biological organisms range from several pages to
several books. Insects and fish are at pages, frogs at a chapter, most mammals at
approximately a book, and monkeys and apes at several books. These numbers span the
range of complexity estimates.

Animal                   Text length                 Complexity (bits)
Fish                     a few pages                 $3 \times 10^{4}$
Grasshopper, Mosquito    a few pages to a chapter    $10^{5}$
Ant (one, not colony)    a few pages to a chapter    $10^{5}$
Frog                     a chapter or two            $3 \times 10^{5}$
Rabbit                   a short book                $10^{6}$
Tiger                    a book                      $3 \times 10^{6}$
Ape                      a few books                 $10^{7}$

Table 8.4.2 Estimates of the approximate length of text descriptions of animal behavior.

We have concluded that it is not possible to use this approach to obtain an
estimate of human complexity. However, this is not quite true. We can apply this method
by taking the highest complexity estimate of other systems and using this as a close
lower bound to the complexity of the human being. By close lower bound we mean
that the actual complexity should not be tremendously greater. According to our
experience, the complexity estimates of animals tend to extend up to roughly a single
book. Primates may be estimated somewhat higher, with a range of one to tens of
books. This suggests that human complexity is somewhat larger than this latter
number: approximately $10^{8}$ bits, or about 30 books. We will see how this compares to
other estimates in the following sections.
There are several other approaches to estimating human complexity based upon
language. The existence of book-length biographies implies a poor estimate of human
complexity of $10^{6}$ bits. We can also estimate the complexity of a human being by the
typical amount of information that a person can learn. Specifically, it seems to make
sense to base an estimate on the length of a college education, which uses
approximately 30 textbooks. This is in direct agreement with the previous estimate of $10^{8}$ bits.
It might be argued that this estimate is too low because we have not included other
parts of the education (elementary and high school and postgraduate education) or
other kinds of education/information that are not academic. It might also be argued
that this is too high because students do not actually know the entire content of 30
textbooks. One reason this number appears reasonable is that if the complexity of a
human being were much greater than this, there would be individuals who would
endure tens or hundreds of college educations in different subjects. The estimate of
roughly 30 textbooks is also consistent with the general upper limit on the number of
books an individual can write in a lifetime. The most prolific author in modern times
is Isaac Asimov, with about 500 books. Thus from such text-based self-consistent
evidence we might assume that the estimate of $10^{8}$ bits is not wrong by more than one
to two orders of magnitude. We now turn to estimation methods that are not based
on text.
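The book-based numbers in this section can be cross-checked with the illustrated-book value of $3 \times 10^{6}$ bits from Table 8.4.1. The Python sketch below converts the 30-textbook and 500-book figures to bits and measures their separation in orders of magnitude. The helper name `books_to_bits` is an illustrative assumption.

```python
import math

BITS_PER_BOOK = 3e6   # illustrated book, from Table 8.4.1


def books_to_bits(n_books: float) -> float:
    """Convert a number of books into an information estimate in bits."""
    return n_books * BITS_PER_BOOK


education_bits = books_to_bits(30)    # 9 x 10^7, rounded to 10^8 in the text
asimov_bits = books_to_bits(500)      # 1.5 x 10^9

# Distance, in orders of magnitude, between the 10^8 estimate and the
# most-prolific-author figure.
orders_apart = math.log10(asimov_bits / 1e8)

print(f"{education_bits:.1e} bits, {asimov_bits:.1e} bits, {orders_apart:.2f} orders apart")
```

The separation of a little over one order of magnitude is consistent with the stated uncertainty of one to two orders of magnitude.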
8.4.2 Genetic code
Biological organisms present us with a convenient and explicit representation for their
formation by development: the genome. It is generally assumed that most of the
information needed to describe the physiology of the organism is contained in genetic
information. For simplicity we might think of DNA as a kind of program that is
interpreted by decoding machinery during development and operation. In this regard
the genome is much like a Turing machine tape (see Section 1.9), even though the
mechanism for transcription is quite different from the conventional Turing machine.
Some other perspectives are given in Section 7.1. Regardless of how we ultimately view
the developmental process and cellular function, it appears natural to associate with
the genome the information that is necessary to specify physiological design and
function. It is not difficult to determine an upper bound to the amount of information
that is contained in a DNA sequence. Taken at face value, this provides us with an
estimate of the complexity of an organism. We must then inquire as to the
approximations that are being made. We first discuss the approach in somewhat greater detail.
Considering the DNA as an alphabet of four characters provided by the four
nucleotides or bases represented by A (adenine), T (thymine), C (cytosine) and G (guanine),
a first estimate of the information contained in a DNA sequence would be
$N \log_2(4) = 2N$ bits, where $N$ is the length of the DNA chain. Since DNA is formed of two
complementary nucleotide chains in a double helix, its length is measured in base pairs.
While this estimate neglects many corrections, there are a number of assumptions
that we are making about the organism that give a larger uncertainty than some of the
corrections that we can apply. Therefore, as a rough estimate, this is essentially as good
an estimate as we can obtain from this methodology at present. Specific numbers are
given in Table 8.4.3. We see that for a human being, the estimate is nearly $10^{10}$ bits,
which is somewhat larger than that obtained from language-based estimates in the
previous section. What is more remarkable is that there is no systematic trend of
increasing genome length that parallels our expectations of increasing organism
complexity based on estimates of the last section. Aside from the increasing trend from
bacteria to fungi to animals/plants, there is no apparent trend that would suggest that
genome length is correlated with our expectations about complexity.
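The first estimate of two bits per base pair can be sketched in a few lines. The genome lengths below are the representative ranges listed in Table 8.4.3; the function and variable names are illustrative choices, not part of the text.

```python
import math

BITS_PER_BASE = math.log2(4)  # four possible bases -> 2 bits per base pair

# Representative genome lengths in base pairs (low, high), from Table 8.4.3.
GENOME_LENGTHS = {
    "Bacteria (E. coli)": (1e6, 1e7),
    "Fungi": (1e7, 1e8),
    "Plants": (1e8, 1e11),
    "Insects": (1e8, 7e9),
    "Fish (bony)": (5e8, 5e9),
    "Frog and Toad": (1e9, 1e10),
    "Mammals": (2e9, 3e9),
    "Man": (3e9, 3e9),
}

def genome_bits(base_pairs: float) -> float:
    """Upper bound N * log2(4) = 2N bits for a genome of N base pairs."""
    return base_pairs * BITS_PER_BASE

for organism, (low, high) in GENOME_LENGTHS.items():
    print(f"{organism:20s} {genome_bits(low):.1e} to {genome_bits(high):.1e} bits")
```

For a human genome of $3 \times 10^9$ base pairs this gives $6 \times 10^9$ bits, which is the "nearly $10^{10}$ bits" quoted above.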
We now proceed to discuss limitations in this approach. The list of approximations
given below is not meant to be exhaustive, but it does suggest some of the
difficulties in determining the information content even when there is a clear first
numerical value to start from.
a. A significant percentage of DNA is “non-coding.” This DNA is not transcribed
for protein structures. It may be relevant to the structural properties of DNA. It
may also contain other useful information not directly relevant to protein
sequence. Nevertheless, it is likely that information in most of the base pairs that
are non-coding is not essential for organism behavior. Specifically, they can be
replaced by many other possible base pair sequences without effect. Since
30%–50% of human DNA is estimated to be coding, this correction would
reduce the estimated complexity by a factor of two to three.
b. Direct forms of compression: as presently understood, DNA is primarily utilized
through transcription to a sequence of amino acids. The coding for each amino
acid is given by a triple of bases. Since there are many more triples ($4^3 = 64$) than
amino acids (twenty), some of the sequences have no amino acid counterpart, and
there is more than one sequence that maps onto the same amino acid. This
redundancy means that there is less information in the DNA sequence. Taking this
into account by assigning a triple of bases to one of twenty characters that
represent amino acids would give a new estimate of $(N/3) \log_2(20) \approx 1.4N$. To improve
the estimate further, we would include the relative probability of the different
amino acids, and correlations between them.

Organism             Genome length (base pairs)          Complexity (bits)
Bacteria (E. coli)   $10^6$–$10^7$                       $10^7$
Fungi                $10^7$–$10^8$                       $10^8$
Plants               $10^8$–$10^{11}$                    $3 \times 10^8$–$3 \times 10^{11}$
Insects              $10^8$–$7 \times 10^9$              $10^9$
Fish (bony)          $5 \times 10^8$–$5 \times 10^9$     $3 \times 10^9$
Frog and Toad        $10^9$–$10^{10}$                    $10^{10}$
Mammals              $2 \times 10^9$–$3 \times 10^9$     $10^{10}$
Man                  $3 \times 10^9$                     $10^{10}$

Table 8.4.3 Estimates of complexity based upon genome length. Except for plants, where
there is a particularly wide range of genome lengths, a single number is given for the
information contained in the genome, because the accuracy does not justify more specific
numbers. Genome lengths and ranges are representative.

c. General compression: more generally, we can ask how compressed the DNA
encoding of information is. We can rely upon a basic optimization of function in
biology. This might suggest that some degree of compression is performed in
order to reduce the complexity of transmission of the information from generation
to generation. However, this is not a proof, and one could also argue in favor of
redundancy in order to avoid susceptibility to small changes. Moreover, there are
likely to be inherent limitations on the compressibility of the information due to
the possible transcription mechanisms that serve instead of decompression
algorithms. For example, if a molecule that is to be represented has a long chain of the
same amino acid, e.g., asp-asp-asp-asp-asp-asp-asp-asp-asp-asp-asp-asp-asp-
asp-asp-asp-asp-asp, it would be interesting if this could be represented using a
chemical equivalent of (18)asp. This requires a transcription mechanism that
repeats segments, that is, a DNA loop. There are organisms that are known to have highly
repetitive sequences (e.g., $10^7$ repetitions) forming a significant fraction of their
genome. Much of this may be non-coding DNA.
Other forms of compression may also be relevant. For example, we can ask
if there are protein components/subchains that can be used in more than one
protein. This is relevant to the general redundancy of protein design. There is
evidence that the genome does use this property for compression by overlapping
the regions that code for several different proteins. A particular region of DNA
may have several coding regions that can be combined in different ways to obtain
a number of different proteins. Transcription may start from distinct initial
points. Presumably, the information that describes the pattern of transcriptions
is represented in the non-coding segments that are between the coding segments.
Related to the issue of DNA code compression are questions about the
complexity of protein primary structure in relation to its own function, specifically, how
much information is necessary to describe the function of a protein. This may be
much less than the information necessary to specify its primary structure (amino
acid sequence). This discussion is approaching issues of the scale at which
complexity is measured: at the atomic scale, where the specific amino acid is relevant,
or at the molecular scale, at which the enzymatic function is relevant. We will
mention this limitation again in point (d).
d. Scale of representation: the genome codes for macromolecular and cellular
function of the biological organism. This is much less than the microscopic entropy,
since it does not code the atomic vibrations or molecular diffusion. However,
since our concern is for the organism’s macroscopic complexity, the DNA is likely
to be coding a far greater complexity than we are interested in for multicellular
organisms. The assumption is that much of the cellular chemical activity is not
relevant to a description of the behavior on the scale of the organism. If the DNA
were representing the sum of the molecular or cellular scale complexity of each
of the cells independently, then the error in estimating the complexity would be
quite large. However, the molecular and cellular behavior is generally repeated
throughout the organism in different cells. Thus, the DNA is essentially
representing the complexity of a single cellular function with the additional
complication of representing the variation in this function. To the extent that the
complexity of cellular behavior is smaller than that of the complete organism, it may
be assumed that the greatest part of the DNA code represents the macroscale
behavior. On the other hand, if the organism behavior is comparatively simple, the
greater part of the DNA representation would be devoted to describing the
cellular behavior.
e. Completeness of representation: we have assumed that DNA is the only source of
cellular information. However, during cell division not only the DNA is
transferred but also other cellular structures, and it is not clear how much information
is necessary to specify their function. It is clear, however, that DNA does not
contain all the information. Otherwise it would be possible to transfer DNA from
one cell into any other cell and the organism would function through control by
the DNA. This is not the case. However, it may very well be that the description
of all other parts of the cell, including the transcription mechanisms, only
involves a small fraction of the information content compared to the DNA (for
example, $10^4$–$10^6$ bits compared to $10^7$–$10^{11}$ bits in DNA). Similar to our point (d),
the information in cellular structures is more likely to be irrelevant for organisms
whose complexity is high. We could note also that there are two sources of DNA
in the eukaryotic cell, nuclear DNA and mitochondrial DNA. The information in
the nuclear DNA dominates over the mitochondrial DNA, and we also expect it
to dominate over other sources of cellular information. It is possible, however,
that the other sources of information approach some fraction (e.g., 10%) of the
information in the nuclear DNA, causing a small correction to our estimates.
f. We have implicitly assumed that the development process of a biological
organism is deterministic and uniquely determined by the genome. Randomness in the
process of development gives rise to additional information in the final structure
that is not contained in the genome. Thus, even organisms that have the same
DNA are not exactly the same. In humans, identical twins have been studied in
order to determine the difference between environmental and genetic influence.
Here we are not considering the macroscale environmental influence, but rather
the microscale influence. This influence begins with the randomness of
molecular vibrations during the developmental process. The additional information
gained in this way would have to play a relatively minor functional role if there is
significance to the genetic control over physiology. Nevertheless, a complete
estimate of the complexity of a system must include this information. Without
considering different scales of structure or behavior, on the macroscale we should
not expect the microscopic randomness to affect the complexity by more than a
factor of 2, and more likely the effect is not more than 10% in a typical
biological organism.
g. We have also neglected the macroscale environmental influences on behavior.
These are usually described by adaptation and learning. For most biological
organisms, the environmental influences on behavior are believed to be small
compared to genetic influences. Instinctive behaviors dominate. This is not as true
of many mammals and even less true of human beings. Therefore, the
genetic estimate becomes less reliable as an upper bound for human beings than it
is for lower animals. This point will be discussed in greater detail below.
We can see that the assumptions discussed in (a), (b), (c) and (d) would lead to
the DNA length being an overly large estimate of the complexity. Assumptions
discussed in (e), (f) and (g) imply it is an underestimate.
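The size of corrections (a) and (b) can be illustrated for the human genome. This is a minimal sketch using only numbers quoted in the text (a genome of $3 \times 10^9$ base pairs, twenty amino acids, a coding fraction of 30%–50%); the variable names are illustrative, and the two corrections are shown separately rather than combined.

```python
import math

N = 3e9  # human genome length in base pairs (Table 8.4.3)

# Uncorrected estimate: four bases, 2 bits per base pair.
raw_bits = N * math.log2(4)

# Point (b): each triple of bases is treated as one of 20 amino acid characters.
codon_bits = (N / 3) * math.log2(20)

# Point (a): keep only the fraction of DNA taken to be coding (30%-50% in the text).
coding_low, coding_high = 0.3 * raw_bits, 0.5 * raw_bits

print(f"raw estimate:     {raw_bits:.2e} bits")
print(f"codon redundancy: {codon_bits:.2e} bits ({codon_bits / N:.2f} bits per base pair)")
print(f"coding DNA only:  {coding_low:.2e} to {coding_high:.2e} bits")
```

Neither correction changes the order of magnitude of the estimate, which is consistent with the remark that the uncertainty in the assumptions is larger than the corrections themselves.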
One of the conceptual difficulties that we are presented with in considering
genome length as a complexity estimate is that plants have a much higher DNA length
than animals. This is in conflict with the conventional wisdom that animals have a
greater complexity of behavior than plants. We might adopt one of two approaches to
understanding this result: first, that plants are actually more complex than animals,
and second, that the DNA representation in plants does not make use of, or cannot
make use of, compression algorithms that are present in animal cells.
If plants are systematically more complex than animals, there must be a general
quality of plants that has higher descriptive and behavioral complexity. A candidate
for such a property is that plants are generally able to regenerate after injury. This
inherently requires more information than the reliance upon a specific time history for
development. In essence, there must be some form of actual blueprint for the
organism encoded in the genome that takes into account many possible circumstances.
From a programming point of view, this is a multiply reentrant program. To enable
this feature may very well be more complex, or it may require a more redundant
(longer) representation of the same information. It is presumed that the structure of
animals has such a high intrinsic complexity that representation of a fully
regenerative organism would be impossible. This idea might be checked by considering the
genome length of animals that have greater ability to regenerate. If they are
substantially longer than those of similar animals without the ability to regenerate, the explanation
would be supported. Indeed, the salamander, which is the only vertebrate with the
ability to regenerate limbs, has a genome of $10^{11}$ base pairs. This is much larger than
that of other vertebrates, and comparable to that of the largest plant genomes.
A more general reason for the high plant genome complexity that is consistent
with regeneration would be that plants have systematically developed a high
complexity on smaller (molecular and cellular) rather than larger (organismal) scales.
One reason for this would be that plant immobility requires the development of
complex molecular and cellular mechanisms to inhibit or survive partial consumption by
other organisms. By our discussion of the complexity profile in Section 8.3, a high
complexity on small scales would not allow a high complexity on larger scales. This
explanation would also be consistent with our understanding of the relative
simplicity of plants on the larger scale.
The second possibility is that there exists a systematic additional redundancy of
the genome in plants. This might be the result of particular proteins with chains of
repetitive amino acids. A protein formed out of a long chain of the same amino acid
might be functionally of importance in plants, and not in animals. This is a potential
explanation for the relative lengths of plant genome and animal genome.
One of the most striking features of the genome lengths found for various
organisms is their relative uniformity. Widely different types of organisms have similar
genome lengths, while similar organisms may have quite different genome lengths.
One explanation for this that might be suggested is that genome lengths have
increased systematically with evolutionary time. It is hard, however, to see why this
would be the case in all but the simplest models of evolution. It makes more sense to
infer that there are constraints on the genome lengths that have led it to gravitate
toward a value in the range $10^9$–$10^{10}$. Increases in organism complexity then result from
fewer redundancies and better compression, rather than longer genomes. In
principle, this could account for the pattern of complexities we have obtained.
Regardless of the ultimate reason for various genome lengths, in each case the
complexity estimate from genome length provides an upper bound to the genetic
component of organism complexity (cf. points (e), (f) and (g) above). Thus, the
human genome length provides us with an estimate of human complexity.
8.4.3 Component counting
The objective of complexity estimation is to determine the behavioral complexity of
a system as a whole. However, one of the important clues to the complexity of the
system is its composition from elements and their interactions. By counting the number
of elements, we can develop an understanding of the complexity of the system.
However, as with other estimation methods, it must be understood that there are
inherent problems in this approach. We will find that this method gives us a much
higher estimate than the other methods. In using this method we are faced with the
dilemma that lies at the heart of the ability to understand the nature of complex
systems: how does complex behavior arise out of the component behavior and their
interactions? The essential question that we face is: Assuming that we have a system
formed of $N$ interacting elements that have a complexity $C_0$ (or a known distribution
of complexities), how can the complexity $C$ of the whole system be determined? The
maximal possible value would be $N C_0$. However, as we discussed in Section 8.3, this is
reduced both by correlations between elements and by the change of scale from that
of the elements to that of the system. We will discuss these problems in the context of
estimating human complexity.
If we are to consider the behavioral complexity of a human being by counting
components, we must identify the relevant components to count. If we count the
number of atoms, we would be describing the microscopic complexity. On the other
hand, we cannot count the number of parts on the scale of the organism (one)
because the problem in determining the complexity remains in evaluating $C_0$. Thus
the objective is to select components at an intermediate scale. Of the natural
intermediate scales to consider, there are molecules, cells and organs. We will tackle the
problem by considering cells and discuss difficulties that arise in this context. The first
difficulty is that the complexity of behavior does not arise equally from all cells. It is
generally understood that muscle cells and bone cells are largely uniform in structure.
They may therefore collectively be described in terms of a few parameters, and their
contribution to organism behavior can be summarized simply. In contrast, as we
discussed in Chapter 2, the behavior of the system on the scale of the organism is
generally attributed to the nervous system. Thus, aside from an inconsequential number of
additional parameters, we will consider only the cells of the nervous system. If we were
considering the behavior on a smaller length scale, then it would be natural to also
consider the immune system.
In order to make more progress, we must discuss a specific model for the nervous
system and then determine its limitations. We can do this by considering the
behavior of a model system we studied in detail in Chapter 2, the attractor neural network
model. Each of the neurons is a binary variable. Its behavior is specified by whether it
is ON or OFF. The behavior of the network is, however, described by the values of the
synapses. The total complexity of the synapses could be quite high if we allowed the
synapses to have many digits of precision in their values, but this does not contribute
to the complexity of the network behavior. Given our investigation of the storage of
patterns in the network, we can argue that the maximal number of independent
parameters that may be specified for the operation of the network consists of the neural
firing patterns that are stored. This corresponds to $\alpha_c N^2$ bits of information, where $N$
is the number of neurons, and $\alpha_c \approx 0.14$ is a number that arose from our analysis of
network overload.
There are several problems with applying this formula to biological nervous
systems. The first is that the biological network is not fully connected. We could apply a
similar formula to the network assuming only the number of synapses $N_s$ that are
present, on average, for a neuron. This gives a value $\alpha_c N_s N$. This means that the storage
capacity of the network is smaller, and should scale with the number of synapses. For the
human brain, where $N_s$ has been estimated at $10^4$ and $N \approx 10^{11}$, this would give a value
of $0.1 \times 10^4 \times 10^{11} = 10^{14}$ bits. The problem with this estimate is that in order to specify
the behavior of the network, we need to specify not only the imprinted patterns but also
which synapses are present and which are absent. Listing the synapses that are present
would require a set of number pairs that would specify which neurons each neuron is
attached to. This list would require roughly $N N_s \log_2(N) \approx 3 \times 10^{16}$ bits, which is larger
than the number of bits of information in the storage itself. This estimate may be
reduced by a small amount if, as we expect, the synapses of a neuron largely connect to
neurons that are nearby. We will use $10^{16}$ as the basis for our complexity estimate.
The second major problem with this model is that real neurons are far from
binary variables. Indeed, a neuron is a complex system. Each neuron responds to
particular neurotransmitters, and the synapse between two specific neurons is different
from other synapses. How many parameters would be needed to describe the
behavior of an individual neuron, and how relevant are these parameters to the complexity
of the whole system? Naively, we might think that taking into account the complexity
of individual neurons gives a much higher complexity than that considered above.
However, this is not the case. We assume that the parameters necessary to describe an
individual neuron correspond to a complexity $C_0$, and it is necessary to specify the
parameters of all of the neurons. Then the complexity of the whole system would
include $C_0 N$ bits for the neurons themselves. This would be greater than $10^{16}$ bits only
if the complexity of the individual neurons were larger than $10^5$ bits. A reasonable
estimate of the complexity of a neuron is roughly $10^3$–$10^4$ bits. This would give a value of
$C_0 N \approx 10^{14}$–$10^{15}$ bits, which is still small by comparison with $10^{16}$ bits.
By these estimates, the complexity of the internal structure of a neuron is not greater
than the complexity of its interconnections.
Similarly, we should consider the complexity of a synapse, which multiplies the
number of synapses. Synapses are significantly simpler than the neurons. We may
estimate their complexity as no more than 10 bits. This would be sufficient to specify
the synaptic strength and the type of chemicals involved in transmission. Multiplying
this by the total number of synapses ($10^{15}$) gives $10^{16}$ bits. This is the same as the
information necessary to specify the list of synapses that are present.
Combining our estimates for the information necessary to specify the structure
of neurons, the structure of synapses and the list of synapses present, we obtain an
estimate for complexity of $10^{16}$ bits. This estimate is significantly larger than the
estimates found from the other two approaches. As we mentioned before, there are two
fundamental difficulties with this approach that make the estimate too high:
correlations among parameters and the scale of description.
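The combination of the three contributions can be sketched as follows. The numbers are those quoted in the text ($10^{11}$ neurons, $10^4$ synapses per neuron, at most 10 bits per synapse, and up to $10^4$ bits per neuron); the variable names are illustrative.

```python
import math

N_NEURONS = 1e11
SYNAPSES_PER_NEURON = 1e4
TOTAL_SYNAPSES = N_NEURONS * SYNAPSES_PER_NEURON   # 1e15 synapses

BITS_PER_SYNAPSE = 10     # synaptic strength and transmitter type
BITS_PER_NEURON = 1e4     # upper end of the range quoted for C_0

neuron_bits = BITS_PER_NEURON * N_NEURONS
synapse_bits = BITS_PER_SYNAPSE * TOTAL_SYNAPSES
connection_list_bits = TOTAL_SYNAPSES * math.log2(N_NEURONS)

total_bits = neuron_bits + synapse_bits + connection_list_bits

for label, bits in [("neurons", neuron_bits),
                    ("synapses", synapse_bits),
                    ("connection list", connection_list_bits),
                    ("total", total_bits)]:
    print(f"{label:16s} {bits:.1e} bits")
```

The total lies between $10^{16}$ and $10^{17}$ bits and is dominated by the synapse terms, so quoting $10^{16}$ bits as the order of magnitude is consistent with the sum.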
Many of the parameters enumerated above are likely to be the same, giving rise to
the possibility of compression of the description. Both the description of an individual
neuron and the description of the synapses between them can be drastically simplified
if all of them follow a pattern. For example, the visual system involves processing of a
visual field where the different neurons at different locations perform essentially the same
operation on the visual information. Even if there are smooth variations in the
parameters that describe both the neuron behavior and the synapses between them, we can
describe the processing of the visual field in terms of a small number of parameters.
Indeed, one would guess (an intuition-based estimate) that processing of the visual field
is quite complicated (more than $10^2$ bits) but would not exceed $10^3$–$10^5$ bits altogether.
Since a substantial fraction of the number of neurons in the brain is devoted to initial
visual processing, the use of this reduced description of the visual processing would
reduce the estimate of the complexity of the whole system.
Nevertheless, the initial visual processing does not involve more than 10% of the
number of neurons. Even if we eliminate all of their parameters, the order-of-magnitude
estimate of system complexity would not change. However, the idea behind this construction is that
whenever there are many neurons whose behavior can be grouped together into
particular functions, then the complexity of the description is reduced. Thus if we can
describe neurons as belonging to a particular class of neurons (category or stereotype),
then the complexity is reduced. It is known that neurons can be categorized; however,
it is not clear how many parameters remain once this categorization has been done.
When we think about grouping the neurons together, we might also realize that this
discussion is relevant to the consideration of the influence of environment and
genetics on behavior. If the number of parameters necessary to describe the network
greatly exceeds the number of parameters in the genetic code, which is only $10^{10}$ bits,
then many of these parameters must be specified by the environment. We will discuss
this again in the next section.
On a more philosophical note, we comment that parameters that describe the
nervous system also include the malleable short-term memory. While this may be
a small part of the total information, our estimate of behavioral complexity
should raise questions such as, How specific do we have to be? Should the content
of short-term memory be included? The argument in favor would be that we need
to represent the human being in entirety. The argument against would be that
what happened in the past five minutes or even the past day is not relevant and we
can reset this part of the memory. Eventually we may ask whether the objective is
to represent the specific information known by an individual or just his or her
“character.”
We have not yet directly addressed the role of substructure (Chapter 2) in the
complexity of the nervous system. In comparison with a fully connected network, a
network with substructure is more complex because it is necessary to specify the
substructure, or more specifically which neurons (or which information) are proximate
to which. However, in a system that is subdivided by virtue of having fewer synapses
between subdivisions, once we have counted the information that is present in the
selection of synapses, as we have done above, the substructure of the system has already
been included.
The second problem of estimating complexity based on component counting is
that we do not know how to reduce the complexity estimate based upon an increase
of the length scale of observation. The estimate we have obtained for the complexity
of the nervous system is relevant to a description of its behavior on the scale of a
neuron (it does, however, focus on cellular behavior most relevant to the behavior of the
organism). In order to overcome this problem, we need a method to assess the
dependence of the organism behavior on the cellular behavior. A natural approach
might be to evaluate the robustness of the system behavior to changes in the
components. Human beings are believed to lose approximately $10^6$ neurons every day (even
without alcohol), corresponding to the loss of a significant fraction of the neurons
over the course of a lifetime. This suggests that individual neurons are not crucial to
determining human behavior. It implies that there may be a couple of orders of
magnitude between the estimate of neuron complexity and human complexity. However,
since the daily loss of neurons corresponds only to a loss of 1 in $10^5$ neurons, we could
also argue that it would be hard for us to notice the impact of this loss. In any event,
our estimate based upon component counting, $10^{16}$ bits, is eight orders of magnitude
larger than the estimates obtained from text and six orders of magnitude larger than
the genome-based estimate. To account for this difference we would have to argue that
99.9999% of neuron parameters are irrelevant to human behavior. This is too great a
discrepancy to dismiss based upon such an argument.
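The neuron-loss fractions and the gaps between the three estimates can be checked numerically. In this sketch the 80-year lifespan is an assumption made for illustration and does not come from the text; the other numbers are those quoted above.

```python
import math

NEURONS = 1e11          # neurons in the human brain, as used in the text
LOST_PER_DAY = 1e6      # daily neuron loss quoted in the text
LIFETIME_YEARS = 80     # assumption for illustration, not from the text

daily_fraction = LOST_PER_DAY / NEURONS
lifetime_fraction = LOST_PER_DAY * 365 * LIFETIME_YEARS / NEURONS

# Estimates of human complexity in bits from the three approaches.
TEXT_BITS, GENOME_BITS, COMPONENT_BITS = 1e8, 1e10, 1e16
gap_to_text = math.log10(COMPONENT_BITS / TEXT_BITS)
gap_to_genome = math.log10(COMPONENT_BITS / GENOME_BITS)

print(f"daily loss:    1 in {1 / daily_fraction:.0e} neurons")
print(f"lifetime loss: {lifetime_fraction:.0%} of neurons")
print(f"gap to text estimate:   {gap_to_text:.0f} orders of magnitude")
print(f"gap to genome estimate: {gap_to_genome:.0f} orders of magnitude")
```

With these inputs the lifetime loss is a few tens of percent of all neurons, which is far too small to bridge a gap of six or more orders of magnitude.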
Finally, we can demonstrate that $10^{16}$ is too large an estimate of complexity by
considering the counting of time rather than the counting of components. We consider
a minimal time interval of describing a human being to be of order 1 second,
and we allow for each second $10^{3}$ bits of information. There are of order $10^{9}$ seconds
in a lifetime. Thus we conclude that only, at most, $10^{12}$ bits of information are necessary
to describe the actions of a human. This estimate assumes that each second is
independently described from all other seconds, and no patterns of behavior exist. This
would seem to be a very generous estimate. We can contrast this number with an estimate
of the total amount of information that might be imprinted upon the
synapses. This can be estimated as the total number of neuronal states over the course
of a lifetime. For a neuron reaction time of order $10^{-2}$ seconds, $10^{11}$ neurons, and $10^{9}$
seconds in a lifetime, we have $10^{22}$ bits of information. Thus we see that the total
amount of information that passes through the nervous system is much larger than
the information that is represented there, which is larger than the information that is
manifest in terms of behavior. This suggests either that the collective behavior of neurons
requires redundant information in the synapses, as discussed in Section 8.3.6, or
that the actions of an individual do not fully represent the possible actions that the
individual would take under all circumstances. The latter possibility returns us to the
discussion of Eq. (8.3.47) and Eq. (8.3.59), where we commented that the expression
is an upper bound, because information may cycle between scales or between system
and environment. Under these circumstances, the potential complexity of a system
under the most diverse set of circumstances is not necessarily the observed complexity.
Both of our approaches to component counting (spatial and temporal) may overestimate
the complexity due to this problem.
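The order-of-magnitude arithmetic in this paragraph can be checked with a short sketch. The constant and function names below are illustrative assumptions, not notation from the text, and the one-bit-per-neuron-per-reaction-time counting is only the rough convention the paragraph appears to use.

```python
# Order-of-magnitude inputs quoted in the paragraph above.
SECONDS_PER_LIFETIME = 10**9        # seconds in a lifetime
BITS_PER_SECOND_OF_BEHAVIOR = 10**3 # bits allowed to describe each second
NEURONS = 10**11                    # number of neurons
NEURON_STATES_PER_SECOND = 10**2    # reaction time of order 10^-2 seconds
SYNAPTIC_BITS = 10**16              # component (neuron)-counting estimate


def time_counting_bits():
    """Upper bound on behavioral information: bits per second times seconds."""
    return BITS_PER_SECOND_OF_BEHAVIOR * SECONDS_PER_LIFETIME


def neural_throughput_bits():
    """Total neuronal states over a lifetime (assumed one bit per state)."""
    return NEURONS * NEURON_STATES_PER_SECOND * SECONDS_PER_LIFETIME


behavior = time_counting_bits()
throughput = neural_throughput_bits()

# Ordering claimed in the text: throughput > represented > manifest in behavior.
ordering_holds = throughput > SYNAPTIC_BITS > behavior
print(behavior, throughput, ordering_holds)
```

The sketch only multiplies the quoted powers of ten; it says nothing about whether the inputs themselves are right.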
8.4.4 Complexity of human beings, artificial intelligence, and the soul
We begin this section by summarizing the estimates of human complexity from the
previous sections, and then turn to some more philosophical considerations of its
significance. We have found that the microscopic complexity of a human being is in the
vicinity of $10^{30}$ bits. This is much larger than our estimates of the macroscopic
complexity: language-based $10^{8}$ bits, genome-based $10^{10}$ bits and component
(neuron)-counting $10^{16}$ bits. As discussed at the end of the last section, we replace the
spatial component-counting estimate with the time-counting upper bound of $10^{12}$ bits. We
will discuss the discrepancies between these numbers and conclude with an estimate
of $10^{10 \pm 2}$ bits.
We can summarize our understanding of the different estimates. The language-based
estimate is likely to be somewhat low because of the inherent compression
achieved by language. One way to say this is that a college education, consisting of 30
textbooks, is based upon childhood learning (nonlinguistic and linguistic) that provides
meaning to the words, and therefore contains comparable or greater information.
The genome-based complexity is likely to be a too-large estimate of the influence
of genome on behavior, because genome information is compressible and because
much of it must be relevant to molecular and cellular function. The component-
counting estimate suggests that the information obtained from experience is much
larger than the information due to the genome; specifically, that genetic information
cannot specify the parameters of the neural network. This is consistent with our
discussion in Section 3.2.11 that suggested that synapses store learned information while
the genome determines the overall structure of the network. We must still conclude
that most of the network information is not relevant to behavior at the larger scale. It
is redundant, and/or does not manifest itself in human behavior because of the limited
types of external circumstances that are encountered. Because of this last point,
the complexity for describing the response to arbitrary circumstances may be higher
than the estimate that we will give, but should still be significantly less than $10^{16}$ bits.
Our estimate of the complexity of a human being is $10^{10 \pm 2}$ bits. The error bars
essentially bracket the values we obtained. The main final caveat is that the difficulty in
assessing the possibility of information compression may lead to a systematic bias to
high complexities. For the following discussion, the actual value is less important than
the existence of an estimate.
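As a small check of the statement that the error bars bracket the values obtained, the sketch below compares the exponents quoted in this section against the interval 10 ± 2. The dictionary keys and helper name are assumptions made for illustration.

```python
# Exponents (log10 of the number of bits) quoted in this section.
ESTIMATE_EXPONENTS = {
    "language-based": 8,
    "genome-based": 10,
    "time-counting upper bound": 12,
    "neuron-counting": 16,  # the spatial estimate the text sets aside
}

CENTRAL_EXPONENT = 10  # final estimate: 10^(10 +/- 2) bits
HALF_WIDTH = 2


def within_error_bars(exponent):
    """True if 10**exponent lies inside 10^(10 +/- 2)."""
    return abs(exponent - CENTRAL_EXPONENT) <= HALF_WIDTH


bracketed = {name: within_error_bars(e) for name, e in ESTIMATE_EXPONENTS.items()}
for name, inside in bracketed.items():
    print(f"{name}: {'inside' if inside else 'outside'} the error bars")
```

The three retained estimates fall inside the interval, while the neuron-counting value, which the text replaces with the time-counting bound, falls outside it.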
Consideration of the complexity of a human being is intimately related to
fundamental issues in artificial intelligence. The complexity of a human being specifies
the amount of information necessary to describe and, given an environment, predict
the behavior of a human being. There is no presumption that the prediction would be
feasible using present technology. However, in principle, there is an implication of its
possibility. Our objective here is to briefly discuss both philosophical and practical
implications of this observation.
The notion of reproducing human behavior in a computer (or by other artificial
means) has traditionally been a major domain of confrontation between science and
religion, and science and popular thought. Some of these conflicts arise because of the
supposition by some religious philosophers of a nonmaterial soul that is presumed to
animate human beings. Such nonmaterial entities are rejected in the context of science
because they are, by definition, not measurable. It may be helpful to discuss some
of the alternate approaches to the traditional conflict that bypass the controversy in
favor of slightly modified definitions. Specifically, we will consider the possibility of a
scientific definition of the concept of a soul. We will see that such a concept is not
necessarily in conflict with notions of artificial intelligence. Instead it is closely related to
the assumptions of this field.
One way to define the concept of soul is as the information that describes completely
a human being. We have just estimated the amount of this information. To understand
how this is related to the religious concept of soul, we must realize that the
concept of soul serves a purpose. When an individual dies, the existence of a soul
represents the independence of the human being from the material of which he or she is
formed. If the material of which the human being is made were essential to its function,
then there would be no independent functional description. Also, there would
be no mechanism by which we could reproduce human behavior without making use
of precisely the atoms of which he or she was formed. In this way the description of a
soul suggests an abstraction of function from matter which is consistent with abstractions
that are familiar in science and modern thought, but might not be consistent
with more primitive notions of matter. A primitive concept of matter might insist
that the matter of which we are formed is essential to our functioning. The simplest
possible abstraction would be to state (as is claimed by physics) that the specific
atoms of which the human being is formed are not necessary to his or her function.
Instead, these atoms may be replaced by other indistinguishable atoms and the same
behavior will be found. Artificial intelligence takes this a large step further by stating
that there are other possible media in which the same behavior can be realized. A human
being is not directly tied to the material of which he or she is made. Instead there is a
functional description that can be implemented in various media, of which one possible
medium is the biological body that the human being was implemented in when
we met him or her.
Viewed in this light, the statement of the existence of a soul appears to be the
same as the claim of artificial intelligence: that a human being can be reproduced in
a different form by embodying the function rather than the mechanism of the human
being. There is, however, a crucial distinction between the religious view and some of
the practical approaches of artificial intelligence. This difference is related to the notion
of a universal artificial intelligence, which is conceptually similar to the model of
universal Turing machines. According to this view there is a generic model for intelligence
that can be implemented in a computer. In contrast, the religious view is typically
focused on the individual identity of an individual human being as manifest in
a unique soul. We have discussed in Chapter 3 that our models of human beings are
to be understood as nonuniversal and would indeed be better realized by the concept
of representing individual human beings rather than a generic artificial intelligence.
There are common features to the information processing of different individuals.
However, we anticipate that the features characteristic of human behavior are
predominantly specific to each individual rather than common. Thus the objective of
creating artificial human beings might be better described as that of manifesting the
soul of an individual human.
We can illustrate this change in perspective by considering the Turing test for
recognizing artificial intelligence. The Turing test suggests that in a conversation with a
computer we may not be able to distinguish it from a human being. A key problem
with this prescription is that there is no specification of which human being is to be
modeled. Human beings have varied complexity, and interactions are of varied levels
of intimacy. It would be quite easy to reproduce the conversation of a mute individual,
or even an obsessed individual. Which human being did Turing have in mind? We
can go beyond this objection by recognizing that in order to fool us into thinking that
the computer is a human being, except for a very casual conversation, the computer
would have to represent a single human being with a name, a family history, a profession,
opinions and a personality, not an abstract notion of intelligence. Finally, we
may also ask whether the represented human being is someone we already know, or
someone we do not know, prior to the test.
While we bypassed the fundamental controversy between science and religion
regarding the presence of an immaterial soul, we suspect that the real conflict between
the approaches resides in a different place. This conflict is in the question of the
intrinsic value of a human being and his place in the universe. Both the religious and
popular view would like to place an importance on a human being that transcends the
value of the matter of which he is formed. Philosophically, the scientific perspective
has often been viewed as lowering human worth. This is true whether it is physical
scientists that view the material of which man is formed as "just" composed of the same
atoms as rocks and water, or whether it is biological scientists that consider the
biochemical and cellular structures as the same as, and derived evolutionarily from,
animal processes.
The study of complexity presents us with an opportunity in this regard. A quantitative
definition of complexity can provide a direct measure of the difference between
the behavior of a rock, an animal and a human being. We should recognize that
this capability can be a double-edged sword. On the one hand it provides us with a
scientific method for distinguishing man from matter, and man from animal, by
recognizing that the particular arrangement of atoms in a human being, or the particular
implementation of biology, achieves a functionality that is highly complex. At the
same time, by placing a number on this complexity it presents us with the finiteness
of the human being. For those who would like to view themselves as infinite, a finite
complexity may be humbling and difficult to accept. Others who already recognize
the inherent limitations of individual human beings, including themselves, may find
it comforting to know that this limitation is fundamental.
As is often the case, the value of a number attains meaning through comparison.
Specifically, we may consider the complexity of a human being and see it as either high
or low. We must have some reference point with respect to which we measure human
complexity. One reference point was clear in the preceding discussion: that of animals.
We found that our (linguistic) estimates of human complexity placed human
beings quantitatively above those of animals, as we might expect. This result is quite
reasonable but does not suggest any clear dividing line between animals and man.
There is, however, an independent value to which these complexities can be compared.
For consistency, we use language-based complexity estimates throughout.
The idea of biological evolution and the biological continuity of man from animal
is based upon the concept of the survival demands of the environment on man.
Let us consider for the moment the complexity of the demands of the environment.
We can estimate this complexity using relevant literature. Books that discuss survival
in the wild are typically quite short, $3 \times 10^{5}$ bits. Such a book might describe more
than just basic survival (plants to eat and animal hunting), including various skills of
a primitive life such as stone knives, tanning, basket making, and primitive home or
boat construction. Alternatively, a book might discuss survival under extreme
circumstances rather than survival under more typical circumstances. Even so, the
amount of text is not longer than a rather brief book. While there are many individuals
who have devoted themselves to living in the wild, there are no encyclopedias of
relevant information. This suggests that in comparison with the complexity of a human
being, the complexity of survival demands is small. Indeed, this complexity appears
to be right at the estimated dividing line between animal ($10^{6}$ bits) and man
($10^{8}$ bits). It is significant that an ape may have a complexity of ten times the
complexity of the environmental demands upon it, but a human being has a complexity
of a hundred times this demand. Another way to arrive at this conclusion is to consider
primitive man, or primitive tribes that exist today. We might ask about the complexity
of their existence and specifically whether the demands of survival are the
same as the complexity of their lives. From books that reflect studies of such peoples
we see that the description of their survival techniques is much shorter than the
description of their social and cultural activities. A single aspect of their culture might
occupy a book, while the survival methods do not occupy even a single one.
We might compare the behavior of primitive man with the behavior of animal
predators. In contrast to grazing animals, predators satisfy their survival needs in
terms of food using only a small part of the day. One might ask why they did not
develop complex cultural activities. One might think, for example, of sleeping lions.
While they do have a social life, it does not compare in complexity to that of human
beings. The explanation that our discussion provides is that while time would allow
cultural activities, complexity does not. Thus, the complexity of such predators is
essentially devoted to problems of survival. That of human beings is not.
This conclusion is quite intriguing. Several interesting remarks follow. In this
context we can suggest that analyses of animal behavior should not necessarily be
assumed to apply to human behavior. In particular, any animal behavior might be
justified on the basis of a survival demand. While this approach has also often been
applied to human beings (the survival advantages associated with culture, art and
science have often been suggested), our analysis suggests that this is not justified, at
least not in a direct fashion. Human behavior cannot be driven by survival demands
if the survival demands are simpler than the human behavior. Of course, this does not
rule out that general aspects or patterns of behavior, or even some specific behaviors,
are driven by survival demands.
One of the distinctions between man and animals is the relative dominance of
instinctive behavior in animals, as compared to learned behavior in man. It is often
suggested that human dependence on learned rather than instinctive behavior is simply
a different strategy for survival. However, if the complexity of the demands of survival
is smaller than that of a human being, this does not hold. We can argue instead that
if the complexity of survival demands is limited, then there is no reason for additional
instinctive behaviors. Thus, our results suggest that instinctive behavior is actually
a better strategy for overcoming survival demands, because it is prevalent in
organisms whose behavior arises in response to survival demands. However, once
such demands are met, there is little reason to produce more complex instinctive
behaviors, and for this reason human behavior is not instinctively driven.
We now turn to some more practical aspects of the implications of our complexity
estimates for the problem of artificial intelligence, or the re-creation of an individual
in an artificial form. We may start from the microscopic complexity (roughly
the entropy) which corresponds to the information necessary to replace every atom
in the human being with another atom of the same kind, or alternatively, to represent
the atoms in a computer. We might imagine that the computer could simulate the
dynamics of the atoms in order to simulate the behavior of the human being. The
practicality of such an implementation is highly questionable. The problem is not just
that the number of bits of storage as well as the speed requirements are beyond modern
technology. It must be assumed that any computer representation of this dynamics
must ultimately be composed of atoms. If the simulation is not composed out of
the atoms themselves, but some controllable representation of the atoms, then the
complexity of the machine must be significantly greater than that of a human being.
Moreover, unless the system is constructed to respond to its environment in a manner
similar to the response of a human being, then the computer must also simulate
the environment. Such a task is likely to be formally as well as practically impossible.
One central question then becomes whether it is possible to compress the
representation of a human being into a simpler one that can be stored. Our estimate of
behavioral complexity, $10^{10 \pm 2}$ bits, suggests that this might be possible. Since a CD-ROM
contains $5 \times 10^{9}$ bits, we are discussing $2 \times 10^{\pm 2}$ CD-ROMs. At the lower end of this
range, 0.02 CD-ROMs is clearly not a problem. Even at the upper end, two hundred
CD-ROMs is well within the domain of feasibility. Indeed, even if we chose to represent
the information we estimated to be necessary to describe the neural network of
a single individual, $10^{16}$ bits or 2 million CD-ROMs, this would be a technologically
feasible project. We have made no claims about our ability to obtain the necessary
information for one individual. However, once this information is obtained, it should
be possible to store it. A computer that can simulate the behavior of this individual
represents a more significant problem.
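The storage figures above follow from dividing each complexity estimate by the CD-ROM capacity quoted in the text. A minimal sketch of that division follows; the helper name `cd_roms_needed` is an assumption introduced only for this illustration.

```python
CD_ROM_BITS = 5 * 10**9  # capacity of one CD-ROM, as quoted in the text


def cd_roms_needed(bits):
    """Number of CD-ROMs (possibly fractional) needed to hold `bits` bits."""
    return bits / CD_ROM_BITS


lower = cd_roms_needed(10**8)    # lower end of 10^(10 +/- 2) bits
central = cd_roms_needed(10**10) # central behavioral estimate
upper = cd_roms_needed(10**12)   # upper end of 10^(10 +/- 2) bits
neural = cd_roms_needed(10**16)  # neural-network (component-counting) estimate

print(lower, central, upper, neural)
```

The quotients reproduce the 0.02, two hundred and 2 million CD-ROM figures given in the paragraph, with 2 CD-ROMs for the central estimate.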
Before we discuss the problem of simulating a human being, we might ask what
the additional microscopic complexity present in a human body is good for.
Specifically, if only $10^{10}$ bits are relevant to human behavior, what are most of the $10^{31}$
bits doing? One way to think about this question is to ask why nature didn't build a
similar machine with of order $10^{10}$ atoms, which would be significantly smaller. We
might also ask whether we would know if such an organism existed. On our own scale,
we might ask why nature doesn't build an organism with a complexity of order $10^{30}$.
We have already suggested that there may be inherent limitations to the complexity
that can be formed. However, there may also be another use of some of the additional
large number of microscopic pieces of information.
One possible use of the additional information can be inferred from our arguments
about the difference between TM with and without a random tape. The discussion
in Section 1.9.7 suggests that it may be necessary to have a source of randomness
to allow human qualities such as creativity. This fits nicely with our
discussion of chaos in complex system behavior. The implication is that the microscopic
information becomes gradually relevant to the macroscopic behavior as a
chaotic process. We can assume that most microscopic information in a human being
describes the position and orientation of water molecules. In this picture, random
motion of molecules affects cellular behavior, specifically the firing of neurons, that
ultimately affects human behavior. This does not mean that all of the microscopic
information is relevant. Only a small number of bits can be relevant at any time.
However, we recognize that in order to obtain a certain number of random bits, there
must be a much larger reservoir of randomness. This is one approach to understanding
a possible use of the microscopic information content of a human being. Another
approach would ascribe the additional information to the necessary support structures
for the complex behavior, but would not attribute to it an essential role as
information.
We have demonstrated time and again that it is possible to build a stronger or
faster machine than a human being. This has led some people to believe that we can
also build a systematically more capable machine in the form of a robot. We have
already argued that the present notion of computers may not be sufficient if it becomes
necessary to include chaotic behavior. We can go beyond this argument by considering
the problem we have introduced of the fundamental limits to complexity for a
collection of molecules. It may turn out that our quest for the design of a complex machine
will be limited by the same fundamental laws that limit the design of human
beings. One of the natural improvements for the design of deterministic machines is
to consider lower temperatures that enable lower error rates and higher speeds, and
possibly the use of superconductors. However, the choice of a higher temperature may
be required to enable a higher microscopic complexity, which also limits the macroscopic
complexity. The mammalian body temperature may be selected to balance two
competing effects. At high temperatures there is a high microscopic complexity.
However, breaking the ergodic theorem requires low temperatures so that energy barriers
can be effective in stopping movement in phase space. A way to argue this point
more generally is that the sensitivity of human ears and eyes is not limited by the
biological design, but by fundamental limits of quantum mechanics. It may also be that
the behavioral complexity of a human being at its own length and time scale is limited
by fundamental law. As with the existence of artificial sensors in other parts of the
visual spectrum, we already know that machines with other capabilities can be built.
However, this argument suggests that it may not be possible to build a systematically
more complex artificial organism.
The previous discussion is not a proof that we cannot build a robot that is more
capable than a human being. However, any claims that it is possible should be tempered
by the respect that we have gained from studying the effectiveness of biological
design. In this regard, it is interesting that some of the modern approaches to artificial
intelligence consider the use of nanotechnology, which at least in part will make
use of biological molecules and methods.
Finally we can say that the concept of an infinite human being may not be entirely
lost. Even the lowly TM whose internal (table) complexity is rather small can, in
arbitrarily long time and with an infinite storage, reproduce arbitrarily complex behavior.
In this regard we should not consider just the complexity of a human being
but also the complexity of a human being in the context of his tools. For example, we
can consider the complexity of a human being with paper and pen, the complexity of
a human being with a computer, or the complexity of a human being with access to a
library. Since human beings make use of external storage that is limited only by the
available matter, over time a human being, through collaboration with other human
beings/generations extending through time, can reproduce complex behavior limited
only by the matter that is available. This brings us back to questions of the behavior
of collections of human beings, which we will address in Chapter 9.
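The remark that a TM with a small table can use unbounded storage may be easier to picture with a toy example. The sketch below is an assumption-laden illustration, not a construction from the text: a six-rule machine (the state names `inc` and `ret`, the table, and the `run` helper are all hypothetical) that counts in binary on a tape that grows as needed.

```python
# Transition table: (state, symbol read) -> (symbol written, head move, next state).
# "inc" adds one to the binary number on the tape; "ret" walks the head back
# to the least significant bit so that the next increment can start.
TABLE = {
    ("inc", "1"): ("0", -1, "inc"),  # carry: 1 becomes 0, move to the next higher bit
    ("inc", "0"): ("1", +1, "ret"),  # absorb the carry
    ("inc", "_"): ("1", +1, "ret"),  # extend the number by one digit
    ("ret", "0"): ("0", +1, "ret"),
    ("ret", "1"): ("1", +1, "ret"),
    ("ret", "_"): ("_", -1, "inc"),  # passed the lowest bit: step back and increment
}


def run(increments):
    """Run the machine until it has completed `increments` increments."""
    tape, head, state, done = {}, 0, "inc", 0  # a dict models the unbounded tape
    while done < increments:
        symbol = tape.get(head, "_")
        write, move, next_state = TABLE[(state, symbol)]
        tape[head] = write
        head += move
        if state == "ret" and next_state == "inc":
            done += 1
        state = next_state
    cells = [pos for pos, sym in tape.items() if sym != "_"]
    if not cells:
        return ""
    return "".join(tape[pos] for pos in range(min(cells), max(cells) + 1))


print(run(5))          # tape contents after five increments
print(len(run(1000)))  # the number of tape cells in use keeps growing
```

The table stays fixed at six rules while the tape contents grow without bound, which is the sense in which the storage, rather than the table, carries the long-run behavior.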

This subject is distinct from the others we have considered. The primary distinction is that we have only one example of human civilization. This is not true about the systems we have discussed in earlier chapters, with the exception of evolution considered globally. The uniqueness of the human superorganism presents us with questions of fundamental interest in science, related to how much we can know about an individual system. When there are many instances, we can use information provided by various examples and the statistics of their properties. When there is only one system, to understand its properties or predict its behavior we must apply fundamental principles that are valid for all complex systems. Since the field of complex systems is dedicated to uncovering such principles, the subject of the human superorganism should be considered a premiere area for application of complex systems research. Central questions are: How can we characterize this complex system? How can we determine its properties? What can we tell about its dynamics, its past and future? We note that as individuals we are elements of the human superorganism, thus our spatial and temporal experience may very well be more limited than that appropriate for analyzing the human superorganism. The study of human civilization is guided by historical records and contemporary news. In contrast to protein folding, neural networks, evolution and developmental biology there are few reproducible laboratory experiments. Because of the irreproducibility of historical or contemporary events, these sources of information are properly not considered part of conventional science. While this can be a limitation, it is also apparent that there is a large amount of information available. Our task is to develop systematic methods for considering this kind of information that will enable us to approach questions about the nature of human civilization as a complex system.
Various aspects of these problems have been studied by historians, anthropologists and sociologists. Why consider human civilization as a single complex system? The recently discussed concept of a global economy, and earlier the concept of a global village, suggest that we should consider the collective economic behavior of human beings and possibly the global social behavior as a single system. Considering civilization as a single entity we are motivated to ask various questions about it. These questions relate to all of the topics we have covered in the earlier chapters: spatial and temporal structure, evolution and development. We would also like to understand the interaction of human civilization with its environment. In developing an understanding of human civilization, we recognize that a widespread view of human civilization as a single entity is relatively new and driven by contemporary developments. At least superficially, the historical epoch described by the dominance of nation-states appears to be quite different from the present global economy. While recent events appear to be of particular significance to the global view, our questions must be addressed in a historical context. Thus we should include a discussion of the transition to a global economy. We postpone this historical discussion to the next chapter because of the groundwork that we would like to build in order to target a particular objective for our analysis: that of complexity classification.

We are motivated to understand complexity in the context of our effort to understand the nature of the human superorganism, or the nature of the global economy. We would like to identify the type of complex system it is, that is, to classify it. The first distinction that we might make is between a complex material or a complex organism (see Section 1.3.6). Could part of the global system be modified without affecting the whole? From historical evidence discussed in the next chapter, the answer appears to be no. This indicates that human civilization is a complex organism. The next question we would like to ask is: What kind of complex organism is it? By analogy we could ask: Is it like a protein, a cell, a plant, an insect, a frog, a human being? What do we mean by using such analogies? At least in part the problem is to describe the complexity of an entity's behavior. Intuitively an insect is a simpler organism than a human being, and this is of qualitative importance for our understanding of their differences. The degree of complexity should provide a scale that can distinguish between the many different complex systems we are familiar with. Our objective in this chapter is to develop a quantitative definition of complexity and behavioral complexity. We then apply the definition to various complex systems. The focus will be on the complexity of an individual human being. Once we have established our complexity scale we will be in a position to apply it to human civilization. We will understand formally why a collection of complex systems (human beings) may be, but need not be, complex. Beyond recognizing human civilization as a complex system, it is far more significant to identify the degree of its complexity. In the following brief sections we establish some additional context for the importance of measuring complexity using both unconventional and conventional examples of organisms whose complexity should be evaluated.

8.1.2 Scenario: alien encounter
The possibility of encountering alien life has been debated within the scientific community. In popular literature, such encounters have been portrayed in various forms ranging from benevolent to catastrophic. The scientific debate has focused thus far on topics such as the statistics of planet formation and the likelihood that planets contain life. The presence of organic molecules in meteorites and interstellar gasses has been interpreted as suggesting that alien life is likely to exist. Efforts have been made to listen for signs of alien life in radio communications and to transmit information to aliens using the Voyager spacecraft, which is leaving the solar system marked with information about human beings. Thus far there has been no scientifically confirmed evidence for the existence of alien life. Even a single encounter would change the human perspective on humanity's place in the universe. Let us consider one possible scenario for an encounter. An object that flashes light intermittently is found in orbit around one of the planets of the solar system. The humans encountering this object are faced with the question of determining whether the object is: (a) a signal device, specifically a recording, (b) a communication device, or (c) a living organism. The central problem can be seen to revolve around determining whether, and in what way, the device is responsive to external phenomena. Do the flashes of light occur without regard to the external environment in a predetermined sequence? Are they random? If the flashes are sensitive to the environment, then what are they sensitive to? We will see that these questions are equivalent to the question of determining the complexity of the object's behavior. The concept of life in biology is often defined, or better yet, characterized, in terms of consumption, excretion and reproduction. As a definition, these characteristics are well known to be incomplete, since there are life-forms that do not reproduce, such as the mule. Furthermore, a particular individual is still considered alive even if it/he/she does not reproduce. Moreover, there are various physical systems such as crystals and fire that have all these characteristics in one form or another. Moreover, there does not appear to be a direct connection between these biological characteristics and other characteristics of life such as sentience and self-awareness. When considering behavior, the biological perspective emphasizes the survival instinct as characteristic of life. There are exceptions to this, since there exist life-forms that are at times suicidal, either individually or collectively. The question of whether an organism actively seeks life or death does not appear to be a characterization of life but rather of life-forms that are likely to survive. In our discussions, we may be developing an additional characterization of life in terms of behavioral complexity. Definitions of life are often considered in speculating about the rights of and treatment of real or imagined organisms: injured or unconscious humans, robots, or aliens. The degree of behavioral complexity is a characterization of life-forms that may ultimately play a role in informing our ethical decisions with respect to various biological life-forms, whether terrestrial or (if found) alien, and artificial life-forms that we create.

8.1.3 Scenario: blood cells
One of the areas briefly touched upon in Chapter 6, which is at the forefront of complex systems research, is the study of the immune system. Blood cells, unlike other cells in the body, are mobile on a length scale that is large compared to their size. In this characteristic they are more similar to independent organisms than to the other cells of the body. By their migration they might be said to “choose” to associate with other cells of the body, or with foreign chemicals and cells. It is fair to say that our understanding of the behavior of immune cells remains primitive. In particular, the variety of possible chemical interactions between cells has only begun to be mapped out. These interactions involve a variety of chemical messengers. More direct cell-to-cell interactions where parts of the membrane or cellular fluid are transferred are also possible.

One of the interesting questions that can be asked is whether, or at what level of complexity, the interactions become identifiable as a form of language. It is not difficult to imagine, for example, that a chemical communication originating from one cell might be transferred through a chain of cell interactions to a number of other cells. In the context of the discussion in Section 2.4.5, the question of existence of a language might be formulated as a question about the possibility of messages with a grammar, that is, a combinatorial composition of parts that are categorized like parts of speech. Such combinatorial mechanisms are known to exist even at the molecular level in the DNA coding of antibody receptors that are a composite of different parts of the genome. It remains to be seen whether intercellular communication is also generated in this fashion.

In the context of this chapter we can reduce the questions about the immune cells to a single one: What is the degree of complexity of the behavior of the immune cells? By its very nature this question can only be answered once a complete understanding of immune cell behavior is reached. A limited understanding establishes a lower bound for the complexity of the behavior. It should also be understood that different types of cells will most likely have quite different levels of behavioral complexity, just as different animals and man have differing levels of complexity.

8.1.4 Complexity
Our objective in this chapter is to show that it is possible to quantify the concept of complexity in a way that is both natural and useful. Mathematical definitions of the complexity of systems are based upon the theories of information and computation discussed in Sections 1.8 and 1.9. In Section 8.2 they will be used to treat complexity in the context of mathematical objects such as character strings. To develop our understanding of the complexity of physical systems requires that we relate these concepts to those of thermodynamics (Section 1.3) and various extensions (e.g., Section 1.4) that enable the treatment of nonequilibrium systems. In Section 8.3 we discuss relevant concepts and tools that may be used for this purpose. In Section 8.4 we use several semiquantitative approaches to estimate the value of the complexity of specific systems. The practical application of these definitions is a central challenge for the field of complex systems.

Our use of the word “complexity” is specified as an answer to the question, How complex is it? We say, Its complexity is <number><units>. Intuitively, we can make a connection between complexity and understanding. When we encounter something new, whether personally or in a scientific context, our objective is to understand it. The understanding enables us to use, modify, control or appreciate it. We achieve understanding in a number of ways, through classification, description and ultimately through the ability to predict behavior. Complexity is a measure of the inherent difficulty to achieve the desired understanding. Simply stated, the complexity of a system is the amount of information necessary to describe it. This is descriptive complexity. For dynamic systems the description includes the changes in the system over time. We will also discuss the response of a dynamic system to its environment. The amount of information necessary to describe this response is a system’s behavioral complexity.

To use these definitions of complexity we will introduce mathematical expressions based upon the theory of information. The quantitative definition of information (Section 1.8) is relatively abstract. However, it can be measured in familiar terms such as by the number of characters in a text. As a preliminary exercise in the discussion of complexity, the reader is invited to exercise intuition to estimate the complexity of a number of systems. Question 8.1.1 includes a list of systems that are designed to stimulate some thought about complexity as a quantitative measure of the behavior of a system. The reader should devote some thought to this question before proceeding with the rest of the text.

Question 8.1.1 Estimate the complexity of some of the systems in the following list. For this question use an intuitive definition of complexity: the amount of information that would be required to describe the system or its behavior. We use units of bits to measure information. However, to make it easier to visualize, you may use other convenient units such as words or pages of text. So, we can paraphrase the question as, How much would you have to write to describe the system behavior? A rough conversion factor of 1 bit per character can be used to convert these estimates to bits. It is not necessary to estimate the complexity of all the systems on the list. Considering even a few of them is sufficient to develop an understanding of some of the issues that arise. Indeed, for some of these systems a rough estimate is far from trivial. Answers to this question will be given in the text in the remainder of this chapter.

Hint You may find that you would use different amounts of information depending on what aspects of the system you are describing. In such cases try to give more than one estimate or a range of values.

Physical Systems:
Ideal gas (1 mole at T = 0°K, P = 1atm)
Water in a glass
Chemical reaction
Brownian particle
Turbulent flow
Protein
Virus
Bacterium
Immune system cell
Fish
Frog
Ant
Rabbit
Cow
Human being
Radio
Car
IBM 360
Personal Computer (PC/Macintosh)
The papers on your desk
A book
A library
Weather
The biosphere
Nature

Mathematical and Model Systems:
A number
Iterative maps (growth, bifurcation to chaos)
1-D random walk (short time, long time)
Ising model (ferromagnet)
Turing machine
Fractals (Sierpinski gasket, 3-D random walk)
Attractor neural network
Feedforward neural network
Subdivided attractor neural network

8.2 Complexity of Mathematical Models
Complexity is a property of the relationship between a system and various representations of the system. Our objective is to understand the complexity of systems composed of physical entities such as atoms, molecules or cells. Abstract representations of such systems are described in terms of characters or numbers. It is helpful to preface our discussion of physical systems with a discussion of the complexity of the characters or numbers that we use to represent them.

8.2.1 Information, computation and algorithmic complexity
The discussion of Shannon information theory in Section 1.8 was based on strings of characters that were generated by a source. The source generates each string, s, by selecting it from an ensemble. The information from a particular string was defined as

$$ I = -\log(P(s)) \tag{8.2.1} $$

where P(s) is the probability of the string in the ensemble. If all strings have equal probability then this is the logarithm of the number of distinct strings. The source itself (or the ensemble) was characterized by the average information of a large number of strings

$$ \langle I \rangle = -\sum_{s} P(s)\log(P(s)) \tag{8.2.2} $$
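
As a small numerical illustration of Eqs. (8.2.1) and (8.2.2), the following Python sketch evaluates the information of a single string and the average information of an ensemble. It assumes base-2 logarithms, so that the results are in bits; the function names and the two example ensembles are hypothetical and are chosen only for illustration.

```python
import math


def information(p):
    """Information of a string whose ensemble probability is p, Eq. (8.2.1), in bits."""
    return -math.log2(p)


def average_information(probabilities):
    """Average information of an ensemble, Eq. (8.2.2), in bits.

    Strings with zero probability contribute nothing to the sum.
    """
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


# Hypothetical ensembles (not taken from the text).
uniform = [1 / 8] * 8                 # eight equally likely strings
skewed = [0.5, 0.25, 0.125, 0.125]    # four strings with unequal probabilities

print(information(1 / 8))             # 3.0 = log2 of the number of distinct strings
print(average_information(uniform))   # 3.0
print(average_information(skewed))    # 1.75
```

For the uniform ensemble the average information equals the logarithm of the number of distinct strings, as stated above; the skewed ensemble has a smaller average information even though its rarest strings individually carry more.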

It was also possible to consider a more general source that selected characters to form a Markov chain. The probabilistic coupling between sequential characters reduced the information content of the string. It was possible to compress the string using a reversible coding algorithm (computation) that would enable the same information to be represented in a more compact form. The length of the shortest binary compact form is equal to the average information in a string.

Information theory suggests that we can define the complexity of a string of characters by the information content of the string. The information content is the same as the length of the shortest binary encoding of the string. This is intuitive: since the original string can be obtained from its shortest representation, the same information must be present in both. Within standard information theory, the encodings would be limited to compression using a Markov chain model. However, more generally, we could use any possible algorithm for encoding (compressing) the string. Questions about all possible algorithms are precisely the domain of computation theory. The definition of Kolmogorov (algorithmic) complexity of a string makes use of computation theory to describe what we mean by “any possible algorithm.” Allowing all algorithms is the same as allowing more general models for the string than a Markov chain. Our objective in this section is to develop an understanding of algorithmic complexity beginning from the theory of computation.

Computation theory (Section 1.9) describes the operations of logic and computation on symbols. All the operations are deterministic and are expressible in terms of a few elementary operations. The concept of universality of computation is based on the understanding that a particular type of conceptual machine/computer, the universal Turing machine (UTM), can perform all possible computations if the instructions are properly encoded as a finite string of characters serving as the UTM input. Since we have no absolute definition of computation, there is no complete proof. The existing proof shows that the UTM can perform all computations that can be done by a much larger class of machines, the Turing machines (TM). Other models for computation have been shown to be essentially equivalent to these TM. A TM is defined by a table of elementary operations that act on the input string. The word “program” can be used either to refer to the TM table or to its input and so its use is best avoided in this context.

We would like to define the algorithmic complexity of a string, s, as the length of the shortest possible binary TM input, such that the output is s. The relationship of this to the encoding and decoding of Shannon should be apparent. In order to use this as a definition, there are several matters that must be cleared up. To summarize: There are actually two sources of information when we use a TM, the input string and the table. We need to take both of them into account to define the complexity. There are many ways to define complexity; however, we can prove that any two definitions of complexity differ by no more than a constant. We will also show that no matter what definition we use, most strings cannot be compressed.

In order to motivate the logic of the following discussion, it is helpful to think about how we might approach compressing various strings of characters. The shortest compression should then be the complexity of the string. One string might be formed out of a long substring of zeros followed by a long substring of ones. This is convenient to write by indicating how many zeros followed by how many ones: N_0 N_1. We would make a binary string notation for N_0 N_1 and write a program that would read this input and then output the original string. Another string might be a representation of the Fibonacci numbers (1,1,2,3,5,8,…), starting from the N_0-th number and ending at the N_1-th number. We could write this using a similar notation as the previous one, but the program that we would write to generate the string is quite different. Both programs would be quite simple.

Now imagine that we want to communicate one of the original strings to someone else. If we want to communicate it in compressed form, we would have to send the program as well as the input. If there were many strings, we might be clever and send the programs only once. The problem is that with only the input string, the recipient would not know which program to apply to obtain the original string. We need to send an additional piece of information that indicates which program to apply. The simplest way to do this is to assign numbers to each of the programs and preface the program input with the program number. Once we do this, the string that we send uniquely determines the string we wish to communicate. This is necessary, because if the interpretation of the transmitted string is not unique, then it would be impossible to guarantee a correct interpretation.

We now develop these thoughts using a more formal notation. In what follows, the operation of a TM or a UTM will be indicated by functional notation. The string that results from its application to a tape is indicated by U(s) where s is the nonblank portion of the tape (input string), U is the identifier of the TM, and the initial position of the TM head is assumed to be at the leftmost nonblank character.

In order to define the complexity of a string, we identify a particular UTM U. Then the complexity C_U(s) of the string s is defined as the length of the shortest string r such that U(r) = s. We call an input string r to U that generates s a representation of s. Thus the length of the shortest representation is C_U(s). The central theorem of algorithmic complexity relates the complexity according to one UTM U and another UTM U′. Before we state and prove the theorem, we discuss several incidental matters.

We first ask whether we need to use a UTM and not just any TM in the definition. The answer is that the use of a UTM is convenient, and we cannot significantly improve the ability to compress strings by allowing the larger class of TM to be used in the definition. Let us say that we have a UTM U and a TM V. We define a new UTM W by:

$$ W(0s) = V(s), \qquad W(1s) = U(s) \tag{8.2.3} $$

The first character indicates whether to use the TM V or the UTM U on the rest of the input. Since the complexity according to the UTM W is at most one more than the
complexity according to the TM V, C_W(s) ≤ C_V(s) + 1, we see that using the larger class of TM to define complexities cannot improve our results for any particular string by more than one bit, which is not significant for long complex strings.

We may be disturbed that the definition of complexity does not indicate that the complexity of an incompressible string is the same as the string length itself. Indeed the definition does not require it. However, if we wanted to impose this as an auxiliary condition, we could define the complexity of a string using a slightly different construction. Given a UTM U, we define a new UTM V such that

$$ V(0s') = s', \qquad V(1s') = U(s') \tag{8.2.4} $$

The first character indicates whether the string is compressed. We then define the complexity C_U(s) of any string s as one less than the length of the shortest string r such that V(r) = s. This is not quite a fair definition, because if we wanted to communicate the string s we would have to indicate all of r, including its first bit. This means that we should define the complexity as the length of r, which would be a sacrifice of at most one bit for incompressible strings. Limiting the complexity of a string to be no longer than the string itself might seem a natural idea. However, we note that the Shannon information, Eq. (8.2.1), is related only to the probability of a string, and may be larger than the original string length for a particular string.

Returning to our basic definition of complexity, we have described the existence of a shortest possible representation of any string s, and a single machine U that can reconstruct each s from this representation. The key theorem that we need to prove relates the complexity defined using one UTM U to the complexity defined using another UTM U′. The theorem is: the complexity C_U based on U and the complexity C_{U′} based on U′ satisfy:

$$ C_U(s) \le C_{U'}(s) + C_U(U') \tag{8.2.5} $$

where C_U(U′) is independent of the string s. The proof of this expression results from the ability of the UTM U to simulate U′. To prove this we must improve slightly our definition of complexity, or equivalently, we have to limit the UTM that are allowed. This is discussed in Questions 8.2.1–8.2.3. It is shown there that we can preface binary strings input to the UTM U′ with a prefix that will make them generate the same output when input to U. We might call this prefix r_{U,U′} a translation program. It satisfies the property that for any string r, U(r_{U,U′} r) = U′(r). Let r_{U′} be a minimal representation for U′ of the string s. Then r_{U,U′} r_{U′} is a representation for U of the string s. The length of this string must be greater than or equal to the length of the minimum string r_U necessary to produce the same output:

$$ C_U(s) = |r_U| \le |r_{U,U'}\, r_{U'}| = |r_{U'}| + |r_{U,U'}| = C_{U'}(s) + C_U(U') \tag{8.2.6} $$

C_U(U′) = |r_{U,U′}| is the length of the translation program. We have proven the inequality in Eq. (8.2.5).

Question 8.2.1 Show that there exists a UTM U_0 such that for any TM U that accepts binary input, there is a string r_U so that for all s and r satisfying s = U(r), we have that s = U_0(r_U r).

Hint One way to do this is to use a modified form of the construction given in Section 1.9. The new construction requires modifying the nature of the UTM, i.e., a trick.

Solution 8.2.1 We call the UTM described in Section 1.9 Ũ_0. We can simulate the UTM U using Ũ_0; however, the form of the input string would not quite satisfy the conditions of this theorem. Ũ_0 has an input that looks like r_U r_t(r), where the right part is only a function of the input string r and the left part is only a function of the UTM U. However, the tape part of the representation r_t(r) uses a doubled binary form for characters and markers between them so that it is not the same as the original tape. We must replace the tape part of the representation with the original string in order to have an input string of the form r_U r.

Both Ũ_0 and U have binary input strings. This means that we might try to use the tape of U without modification in the tape part of the representation given in Section 1.9. Then there would be no delimiters between characters and no doubled binary representation. There is, however, one difficulty. The UTM U_0 must keep track of where the current position of the UTM U would be during the same calculation. This was accomplished in Section 1.9 by converting one of the M_1 markers to M_6 at the current location of the UTM U. There are a number of ways to overcome this problem, but all require us to introduce something new. We will do this by allowing the UTM U_0 to have a counter that can keep track of the current position of the UTM U. This counter is initialized to 0 and set to the current location of the UTM U at every step of the calculation. There are two ways to argue this. One is to allow, by proclamation, a counter that can reach arbitrarily high numbers. The other is to recognize that the longest string we might conceivably encounter is smaller than the number of particles in the known universe, or very roughly 10^90 ≈ 2^300. This means that we can use an internal memory of 300 bits to represent such a counter. This construction gives us the desired UTM U_0.

Question 8.2.2 Using the result of Question 8.2.1, prove Eq. (8.2.5). See the text for a hint.

Solution 8.2.2 The problem is that Eq. (8.2.5) is not actually correct for all UTM (see Question 8.2.3) so we need to modify our conditions. In a sense, the modification is minor because we only improve the definition slightly. We do this by defining the complexity C_U(s) for an arbitrary UTM as the minimum length of r such that W(r) = s where W is defined by:

$$ W(0s) = U_0(s), \qquad W(1s) = U(s) \tag{8.2.7} $$

The first bit specifies whether to use U or the special UTM U_0 constructed in Question 8.2.1. C_U(s) defined this way is at most one bit more than our previous definition, for any particular string. It might be significantly
smaller, however. This should not be a problem, because our objective is to find short representations of strings. By using our special UTM U_0 in this definition, we guarantee that for any two UTM U and U′, whose complexity is defined in terms of W and W′ by Eq. (8.2.7), we can write W(r_{W,W′} r_{W′}) = W′(r_{W′}). This is possible because W inherits the properties of U_0 when the first character of its input string is 0.

Question 8.2.3 Show that some form of qualification of Eq. (8.2.5) is necessary by demonstrating that there exists a UTM that does not satisfy this inequality. Therefore, Eq. (8.2.5) cannot be extended to all UTM.

Solution 8.2.3 One possibility is to have a UTM that uses only certain characters in its input string. Specifically, define a UTM U that acts the same as a UTM U′ but uses only every other character in its input string: U(r) = U′(r′) if r is any string whose odd characters are the characters of r′. The complexity of a string according to U is twice the complexity according to U′ and therefore Eq. (8.2.5) is invalid in this case. With the modified definition of complexity given in Question 8.2.2 this is no longer a problem.

Switching U and U′ in Eq. (8.2.5) gives a similar inequality with a constant C_{U′}(U). Defining the larger of the two translation program lengths to be

$$ C_{U,U'} = \max\big(C_U(U'),\, C_{U'}(U)\big) \tag{8.2.8} $$

we have proven that complexities defined by the UTM differ by no more than C_{U,U′}:

$$ |C_U(s) - C_{U'}(s)| \le C_{U,U'} \tag{8.2.9} $$

Since this constant is independent of the complexity of the string s, it becomes insignificant for large enough complexities. Thus, for strings that are complex enough, it doesn’t matter which UTM we use to define its complexity. The complexity defined by one UTM is the same as the complexity defined by another UTM. This consistency (universality) in the complexity of a string is essential in order for it to be well defined. We will use a few examples to illustrate the nature of universality provided by this definition.

The first example illustrates the relationship of algorithmic complexity to string compression. Given a string s we can ask what methods of compression are useful for the string. A useful compression algorithm corresponds to a pattern in the characters of the string. A string might have many repetitive digits, or cyclically repeating digits. Alternatively, it might be a sequence that can be generated using simple mathematical operations such as the Fibonacci series, or the digits of π. There are many such patterns that are relevant to the compression of strings. We can choose a finite set of N algorithms {V_i}, where each one is represented by a TM that reconstructs a string s from a shorter string r by taking advantage of properties of the pattern. We now construct a new TM U which is defined by:

$$ U(r_i\, r') = V_i(r') \tag{8.2.10} $$
where r_i is a binary representation of the number i, having log(N) bits. This is a UTM if any of the V_i is a UTM or it can be made into a UTM by Eq. (8.2.3). We use U to define the complexity C_U(s) of any string as described above. This complexity includes both the length of r′ and the number of bits (log(N)) in r_i that together constitute the length of the input r to U. Once it is defined, this complexity is a measure of the complexity of all strings. We do not use different TM to define the complexity of each string; one UTM is used to define the complexity of all strings.

Despite the message of the last example, let us assume that we are evaluating the complexity of a particular string s. We define a new UTM U_s by:

$$ U_s(0s') = s, \qquad U_s(1s') = U(s') \tag{8.2.11} $$

The first character tells U_s if the string is s. We can use this new UTM to define the complexity of all strings and for this definition the complexity of s is one. How does this relate to our theorem about the universality of complexity?

The point is that in this case the translation program between U and U_s contains the complete information about s and therefore must be at least as long as C_U(s). What we have done is to take the particular string s and insert it into the table of U_s. We see in this example how universality is tied to an assumption that the complexities that are discussed are longer than the TM translation programs or, equivalently, the information in their tables. Conceptually, we would say that universality of complexity is tied to an assumption of lack of specific knowledge on the part of the recipient (represented by the UTM) of the information itself. The choice of a particular UTM might be dictated by an implicit understanding of the set of strings that we would like to represent, even though the complexity of a string is defined without reference to an ensemble of strings. However, this apparent relativism of the complexity is limited by our basic theorem that relates the complexity of distinct UTM, and by additional results about the impossibility of compressing most strings discussed in the following paragraphs.

We have gained an additional result from the construction of a single UTM that generates all strings from their compressed forms. This is that a representation r only represents one string s. We can now prove that the probability that a string of length N can be compressed is very small. The proof proceeds from the observation that the number of possible strings decreases very rapidly with decreasing string length. A string s of length |s| = N compressed by k bits is represented by a particular string r of length |r| = C(s) = N − k. Since there are only 2^(N−k) strings of length N − k, at most 2^(N−k) strings of length N can be compressed by k bits. The fractional compression is k/N. For example, among all strings of length 10^6 bits, at most 1 string in 2^100 ≈ 10^30 can be compressed by 100 bits, or 0.01% of the string length. This is not a very significant compression. Even so, this estimate of the average number of strings that can be compressed is much too large, because strings that are not of length N, e.g., strings of length N − 1, N − 2, …, N − k, would also be represented by strings of length N − k. Thus most strings are incompressible. Moreover, selecting a string at random will yield an incompressible string.
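
The counting argument above can be checked with exact integer arithmetic. The following Python sketch is only an illustration under stated assumptions: the function names are hypothetical, strings are taken to be binary, and the second function uses the closed form 2^(N−k+1) − 1 for the number of binary strings of length at most N − k.

```python
from fractions import Fraction


def max_fraction_compressed_by(n, k):
    """Upper bound on the fraction of n-bit strings that have an (n - k)-bit representation.

    There are 2**(n - k) candidate representations and 2**n strings.
    """
    return Fraction(2 ** (n - k), 2 ** n)


def max_fraction_compressed_by_at_least(n, k):
    """Same bound when every representation of length 0 .. n - k may be used.

    The number of binary strings of length at most n - k is 2**(n - k + 1) - 1.
    """
    return Fraction(2 ** (n - k + 1) - 1, 2 ** n)


# The bound depends on k only; n = 1000 is an arbitrary illustrative length.
bound = max_fraction_compressed_by(1000, 100)
print(bound == Fraction(1, 2 ** 100))   # True: at most 1 string in 2**100
print(float(bound))                     # roughly 7.9e-31, about 1 in 10**30
print(max_fraction_compressed_by_at_least(1000, 100) < Fraction(1, 2 ** 99))  # True
```

Even when all shorter representations are allowed, fewer than one string in 2^(k−1) can be shortened by k or more bits, which is the sense in which a randomly selected string is incompressible.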

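The prefix construction of Eq. (8.2.10), in which a leading index selects one of a finite set of decoders, can be sketched in a few lines of Python. This is a toy model and not a Turing machine: the two decoders, the fixed 8-bit count fields, and all function names are assumptions made for illustration, and the search only yields an upper bound on the complexity relative to this toy machine.

```python
def decode_literal(r):
    """V_0: the input is the string itself (no compression)."""
    return r


def decode_runs(r):
    """V_1: the input holds two 8-bit counts N0 and N1; the output is N0 zeros then N1 ones."""
    n0, n1 = int(r[:8], 2), int(r[8:16], 2)
    return "0" * n0 + "1" * n1


DECODERS = [decode_literal, decode_runs]  # the finite set {V_i}
INDEX_BITS = 1                            # log2(N) bits for N = 2 decoders


def U(r):
    """U(r_i r') = V_i(r'): the leading index bits select which decoder reads the rest."""
    i = int(r[:INDEX_BITS], 2)
    return DECODERS[i](r[INDEX_BITS:])


def shortest_known_representation(s, candidates):
    """Shortest candidate input that reproduces s; its length is an upper bound on C_U(s)."""
    valid = [r for r in candidates if U(r) == s]
    return min(valid, key=len)


s = "0" * 40 + "1" * 25
r_literal = "0" + s                                      # index 0, then the raw string
r_runs = "1" + format(40, "08b") + format(25, "08b")     # index 1, then N0 and N1

best = shortest_known_representation(s, [r_literal, r_runs])
print(len(r_literal), len(r_runs), best == r_runs)       # 66 17 True
```

The length of each representation includes the index bit, mirroring the statement that the complexity counts both the length of r′ and the log(N) bits of r_i.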
Question 8.2.4 Calculate a strict lower bound for the average complexity of strings of length N.

Solution 8.2.4 We assume that strings of length N are compressed so that they are represented by all of the shortest strings. One string is represented by the null string (length 0), two strings are represented by a single bit (length 1), and so on. The relationship:

$$ 2^N = \sum_{l=0}^{N-1} 2^l + 1 \tag{8.2.12} $$

means that we will fill all of the possible strings up to length N − 1 and then have one string left of length N. The average representation length for any complexity measure must then satisfy:

$$ \langle C(s) \rangle \ge \frac{1}{2^N}\left( \sum_{l=0}^{N-1} l\,2^l + N \right) \tag{8.2.13} $$

The sum can be evaluated using a table of sums or:

$$ \sum_{l=0}^{N-1} l\,2^l = \frac{1}{\ln(2)} \frac{d}{d\lambda} \sum_{l=0}^{N-1} 2^{\lambda l} \bigg|_{\lambda=1} = \frac{1}{\ln(2)} \frac{d}{d\lambda} \frac{2^{\lambda N} - 1}{2^{\lambda} - 1} \bigg|_{\lambda=1} = N 2^N - 2(2^N - 1) \tag{8.2.14} $$

giving:

$$ \langle C(s) \rangle \ge (N - 2) + \frac{1}{2^N}(N + 2) > N - 2 \tag{8.2.15} $$

Thus the average complexity of strings of length N cannot be reduced by more than two bits. This strict lower bound applies to all measures of complexity.

We can also interpret this discussion to mean that the best UTMs to use to define complexity are those that are invertible: they have a one-to-one mapping of strings to representations. In this case we have a mapping r(s) which gives the unique representation of a string. The reason that such UTM are better is that there are only a limited number of representations shorter than N; if we use up more than one of them for a particular string, then we will have fewer representations to use for others. Such UTM are closely analogous to our understanding of encoding and decoding as described in information theory. The UTM is the decoder and the mapping of the string onto its representation is the encoding.

Because most strings are incompressible, we can also prove that if we have an ensemble of strings defined by the probability P(s), then the average algorithmic complexity of these strings is essentially the same as the Shannon information. In particular, the ensemble of all of the strings of length N have a Shannon information of N bits and an average algorithmic complexity which is the same. The catch is recognizing that to specify P(s) itself requires an algorithm whose complexity must enter into the discussion. The proof follows from the discussion in Section 1.8. An ensemble defined by a probability P(s) can be encoded in such a way that the average string length is given by the Shannon information. We now realize that to define the string complexity we must include the description of the decoding operation:

$$ \sum_{s} P(s)\,C(s) = \sum_{s} P(s)\,I_s + C(P) \tag{8.2.16} $$

where the expression C(P) represents the complexity of the decoding operation for the universal computer U for the ensemble given by P(s). C(P) depends in part on the algorithm used to specify the ensemble probability P(s). For the average ensemble complexity to be essentially equal to the average Shannon information, the specification of the ensemble must itself be simple. For Markov chains a similar result applies: the Shannon information of a string representing a Markov chain is the same as the algorithmic complexity of the same string, as long as the algorithm specifying the Markov chain is simple.

A general consequence of the definition of algorithmic complexity is a limitation on what TM can do. No TM can generate a string more complex than the input string that it is provided with, plus the information in its table; otherwise we would have redefined the complexity of the output string to take this into consideration. This is a key limitation of TM: TM (and computers that are realizations of this model) cannot generate new information. They can only process information they are given. As discussed briefly in Section 1.9.7, this limitation can be overcome by a TM that is given a string of random bits as input. The infinitely complex input means the limitation does not apply. It remains to be demonstrated what tasks such a TM can perform that are not possible for a conventional TM. If such tasks are identified, there will be important implications for computer design. In this context, it may also be suggested that some forms of creativity might be linked to the availability of randomness (see Section 1.9.7). We will return to this issue at the end of the chapter.

While the definition of complexity using UTM is appealing, there is a profound difficulty with this proof. It is nonconstructive. No method is given to determine the complexity of a particular string. Indeed, it can be proven that this is a fundamentally difficult task: the time necessary for a TM to determine C(s) grows exponentially with the length of s. At least this is true when there is a bound on the complexity, e.g., by Eq. (8.2.4). Otherwise the complexity is noncomputable. We find the complexity of a string by trying all input strings in the UTM to see which one gives the necessary output. If the complexity is not bounded, then the halting problem implies that we cannot tell if the UTM will halt on a particular input; thus it is noncomputable. If the complexity of the string is bounded, then we only try strings up to this bound, and it is possible to determine if the UTM will halt for members of this bounded set of strings. Nevertheless, trying each string requires a time that grows exponentially with the bound, and therefore is not practical except for a few very simple strings. The process of finding the complexity of a string is akin to a process of trying models for the string. A model is a TM that might, when given the

proper input, generate the string. It is possible to try many models. However, to determine the actual compressed string may not be practical in any reasonable time. With any particular set of models, we can, however, find an upper bound on the complexity of a string. One of the possible models is that of a Markov chain as used by Shannon information theory. Algorithmic complexity allows more general TM models. However, by our discussion it is improbable that a randomly chosen string will be compressible by any algorithm.

In summary, the universality of complexity is a statement that the use of different UTMs in the definition of complexity affects the result by no more than a constant. This constant is the length of the program that translates the input of one UTM to the other. Significantly, the more complex the string is, the more universal is the value of its complexity. This follows because the length of translation programs becomes less and less relevant for longer and longer descriptions/representations. Since we are interested in properties of complex systems whose descriptions are long, we can, with caution, rely on the universality of their complexity. This is not the case with simple systems whose descriptions and therefore complexities are “subjective”: they depend on the conventions for description. These conventions, in our mathematical definition, are represented by the choice of UTM used to define complexity. We also showed that most strings are not compressible and that the Shannon information measure is the same as the average algorithmic complexity for all concisely describable ensembles. In what follows, unless otherwise mentioned, we assume a particular definition of complexity C(s) using the UTM U.

8.2.2 Mathematical systems: numbers and functions
One of the difficulties in discussing complexity is that many elementary mathematical constructs have unusual properties when considered from the point of view of complexity. Philosophers have been troubled by these points, and they have been extensively debated over the centuries. Most of the problems revolve around various forms of infinity. Unlimited numbers and infinite precision often simplify symbolic mathematical discussions; however, they are not well behaved from the point of view of complexity measures. There appears to be a paradox here that will be clarified when we distinguish between the complexity of a set of numbers and the complexity of an element of the set.

Let us consider the complexity of specifying a single integer. The difficulty with integers is that there are infinitely many of them. Using an information theory point of view, assigning equal probability to all integers would imply that any particular integer would have no probability of occurring. If I ask you to give me a positive integer, from 1 to infinity with equal probability, there is no chance that you will give me an integer below any particular cutoff value, say N. This means that you will need arbitrarily many digits to specify the integer, and there is no limit to the information required. Thus the complexity of specifying a single integer is infinite. However, if we allow only integers between 1 and a large positive number (say $N = 10^{90}$, roughly the number of elementary particles in the known universe) the complexity of specifying one of the integers is only log(N), about 300 bits. The drastic difference between the

complexity of specifying an arbitrary integer (infinite) and the complexity of an enormously large number of integers (300 bits) suggests that systems that are easy to define may be highly complex. The whole field of number theory has shown that integers are not as simple as they first appear. The measure of complexity of specifying a single integer may appear to be far from more abstract discussions like those of the halting problem or Gödel’s theorem (Section 1.9.5); however, they are related. This is apparent since these theorems do not apply to finite sets.

In what sense are integers simple? We can consider the length of a UTM input string that can generate all the positive integers. As discussed in the last section, this is similar to the definition of their Kolmogorov or algorithmic complexity. The program would, starting from zero and keeping a list, progressively add one to the preceding integer. The problem is that such a program never halts, and the task is not complete. We can generalize our definition of a Turing machine to allow for this case by saying that, by definition, this simple program is generating all integers. Then the algorithmic complexity of the integers is quite small. Another way to do this is to consider the complexity of recognizing an integer: the recognition complexity. Recognizing an integer is trivial if we are considering only binary strings, because all of them represent integers. The point, however, is that we can expand the space of possible characters to include various symbols: letters, punctuation, mathematical operations, etc. The mathematical operations might act upon integers. We then ask how long is a TM program that can recognize any integer that appears as a combination of such characters. The length of such a program is also small.

We see that we must distinguish between the complexity of elements of a set and the set itself. A program that recognizes integers is concerned with the attributes of the integers required to define them as a set, rather than the specification of a particular integer. The algorithmic complexity of the set of all integers is small even though the information contained in a single integer can be arbitrarily large. This distinction between the information contained in an element of a set and the information necessary to define the set will also be important when we consider the complexity of physical systems.

The complexity of a single real number is also infinite. Specifying an arbitrary real number requires infinitely many digits. However, if we confine ourselves to any reasonable precision, the complexity becomes very manageable. For example, the most accurately known fundamental constant in science is the electron magnetic moment in Bohr magnetons

$$\mu_e / \mu_B = 1.001159652193(10) \qquad (8.2.17)$$

where the parenthesis indicates the error estimate, corresponding to 11 accurate decimal digits or 37 binary digits. If we consider $1 - \mu_e/\mu_B$ we immediately lose 3 decimal digits. Thus, similar to integers, the practical complexity of a real number is not very large.

The discussion of integers and reals suggests that under practical circumstances a single number is not a highly complex object. Generally, the complexity of a system arises because of the presence of a large number of parameters that must be specified.

However, there is only reason to consider them collectively as a system if they are coupled to each other.

The next category of mathematical objects that we consider are functions. To specify a function f(s) we must either describe its operation by a formula or specify its action on each possible argument. We consider Boolean functions (functions with binary output, see Section 1.9.2), $f(s) = \pm 1$, of a binary string, $s = (s_1 s_2 \ldots s_{N_e})$. The number of arguments of the function (input bits) is $N_e$. There are $2^{N_e}$ possible values of the input string. For each of these there are two possible outcomes (output values). All Boolean functions may be specified by listing the binary output for each possible input state. Each possible output is independent. The number of different Boolean functions is the number of possible sets of outputs, which is $2^{2^{N_e}}$. Assuming that all of the possible Boolean functions are equally likely, the complexity of a Boolean function (the amount of information necessary to specify it) is the logarithm of this number, or $C(f) = 2^{N_e}$. The representation of a Boolean function in terms of C(f) binary variables can also be made explicit as a string representing the presence or absence of terms in the disjunctive normal form described in Section 1.9.2.

This discussion will be generalized later to consider a physical system that acts in response to its environment. The environment will be specified by a number of binary variables (environmental complexity) $N_e$, and its actions will be specified by a number of binary variables (action complexity) $N_a$. A binary function with $N_a$ outputs is the same as $N_a$ independent Boolean functions. If we assume that all possible combinations of Boolean functions are equally likely, then the total complexity is the sum of the complexity of each, or

$$C(f) = N_a 2^{N_e} \qquad (8.2.18)$$

The asymmetry between input and output is a fundamental one. It arises because we need to specify for each possible input which of the possible outputs is output. Specifying “which” is a logarithmic operation in the number of possibilities, and therefore the influence of the output space on the complexity is logarithmic compared to the influence of the input.

8.3 Complexity of Physical Systems
In order to apply our understanding of the complexity of mathematical constructs to physical systems, we must develop a fundamental understanding of representations. The complexity of a physical system is to be defined as the length of the shortest string s that can represent its properties: the results of possible measurements/observations. In Section 8.3.1 we discuss the relationship between thermodynamics and information theory. This will enable us to define the complexity of ergodic and nonergodic systems. The resulting information measure is essentially that of Shannon information theory. When we consider algorithmic complexity, we can ask whether this is the smallest amount of information that might be used. This is discussed in Section 8.3.2. Section 8.3.3 introduces the complexity profile, which measures the complexity as a function of the scale of observation. Implications of the time scale of observation, for chaotic dynamics, are discussed in Section 8.3.4. Section 8.3.5

discusses examples and properties of the complexity profile. Sections 8.3.1 through 8.3.5 are based upon descriptive complexity. To better account for the behavior of a system in response to its environment we consider behavioral complexity in Section 8.3.6. This turns out to be closely related to descriptive complexity. Other issues related to the role of the observer are discussed in Section 8.3.7.

8.3.1 Entropy and the complexity of physical systems
The definition of complexity of a system requires us to develop an understanding of the relationship of information to the physical properties of a system. The most direct relationship is the relationship of entropy and information. At the outset, it should be understood that these are very different concepts. Entropy is a specific physical property of systems that are in equilibrium, or are in well-defined ensembles. Information is not a unique physical property. Instead it is related to representations of digits. Information can be a property of a time sequence or any other set of degrees of freedom. For example, the information content of a set of characters written on a piece of paper can be given. The entropy, however, would be largely a property of the paper or the ink. The entropy of paper is difficult to determine precisely, but simpler substances have entropies that have been determined and are tabulated at specific temperatures and pressures. We also know that entropy is conserved in reversible adiabatic processes and increases in irreversible ones.

Despite the significant conceptual difference between information and entropy, the formal definition of information discussed in Section 1.8 appears very similar to the definition of entropy discussed in Section 1.3. Thus, it makes sense that the two are related when we develop an understanding of complexity. It is helpful to review the definitions. The entropy was defined first for the microcanonical ensemble, which specifies the macroscopic energy U, number of particles N, and volume V, of the system. We assume that all states (microstates) of the system with this energy, number of particles and volume are equally likely in the ensemble. The entropy was written as

$$S = k \ln \Omega(U, N, V) \qquad (8.3.1)$$

where $\Omega(U, N, V)$ is the number of such states. The coefficient k is defined so that the units of entropy are consistent with units of energy and temperature for the thermodynamic relationship T = dU/dS.

Information was defined for a string of characters. Given the probability of the string of characters, the information is defined as the negative logarithm of this probability (Section 1.8). The logarithm is taken to be base 2 so that the information is measured in units of bits. We see that the information content is related to selecting a single state out of an ensemble of possibilities.

We can relate the two definitions in a mathematically direct but conceptually significant way. We think about the state of the system as a message containing information. If we want to specify a particular microstate of a thermodynamic system, we must select this microstate from the whole ensemble. The probability of this particular state is given in the microcanonical ensemble by $P = 1/\Omega$. Using this probability in the definition of information, together with Eq. (8.3.1), gives the amount of information as:

$$I(\{x, p\} \mid (U, N, V)) = S(U, N, V) / (k \ln 2) \qquad (8.3.2)$$

This expression should be understood as the amount of information contained in a microstate {x,p}, when the system is in the macrostate specified by U, N, V; it is also the information necessary to describe precisely the microstate. This is the fundamental relationship we are looking for. We review its meaning in terms of the description of a particular idealized physical system.

If we want to describe the microstate of a system, like a gas of particles in a box, classically we must specify all of the positions and momenta of the particles $\{x_i, p_i\}$. If N is the number of particles, then there are 6N coordinates, 3 position and 3 momentum coordinates for each particle. To specify exactly the position of each particle appears to require arbitrary precision in these coordinates. If we had to specify even a single position exactly, it would take an infinite number of binary digits. However, quantum mechanics is inherently granular, thus there is a smallest distance $\Delta x$ within which we do not need to specify one position coordinate of a particle. The particle location is uniquely given once it is within a region $\Delta x$. More correctly, the particle must be located within a region of position and momentum of $\Delta x \Delta p = h$, where h is Planck’s constant. The granularity defines the precision necessary to specify the positions and momenta, and thus also the amount of information (number of bits) needed in order to describe completely the microstate. The definition of the entropy takes this into account, otherwise the counting of possible microstates of the system would be infinite. The complete calculation of the entropy (which also takes into account the indistinguishability of the particles) is given in Question 1.3.2. We now recognize that the calculation of the entropy is precisely a calculation of the information necessary to describe the microstate.

There is another way to think about the relationship of entropy and information. It follows from the recognition that the number of states of a string of I({x,p}|(U,N,V)) bits is the same as the number of states of the system. If we consider a mapping of system states onto strings, the strings enumerate or label the system states. If there are I({x,p}|(U,N,V)) bits in each string, then there is a one-to-one mapping of system states onto the strings, and a string uniquely identifies a system state. We say that a string represents a system microstate. We thus identify the entropy of a physical system as the amount of information necessary to identify a single microstate from a specified macroscopic ensemble. For an ergodic macroscopic system, this definition is a robust one. It does not matter if we consider a typical or an average amount of information.

What happens if the system is nonergodic? There are two kinds of nonergodic systems we will discuss: a magnet with a well-defined magnetization below its ordering phase transition (see Section 1.6), and a glass where there are many frozen coordinates describing the local arrangements of atoms (see Section 1.4). Many of these coordinates do not change during the time of a typical experiment. Should we include the information necessary to specify the frozen variables as part of the entropy? We would like to separate the discussion of the frozen variables from the fast ones that are in equilibrium. We use the entropy S to refer to the fast ensemble: the enumeration of the kinetically accessible states of the system. The same function of the frozen variables we will call C.
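The identification of entropy with the information needed to single out one microstate can be illustrated on a toy ensemble. The sketch below is only an illustration under stated assumptions: the "system" is a hypothetical set of 10 two-state spins with exactly 4 pointing up (standing in for a macrostate with fixed U, N, V), and the function names are invented for this example. It counts the microstates, evaluates S = k ln Ω and I = S/(k ln 2) as in Eqs. (8.3.1) and (8.3.2), and builds an explicit one-to-one labeling of microstates by binary strings.

```python
import math
from itertools import combinations

K_B = 1.380649e-23  # Boltzmann constant, J/K


def enumerate_microstates(n_spins, n_up):
    """List every arrangement of n_spins two-state spins with exactly n_up pointing up."""
    states = []
    for up_sites in combinations(range(n_spins), n_up):
        up = set(up_sites)
        states.append(tuple(1 if i in up else 0 for i in range(n_spins)))
    return states


def entropy(omega):
    """S = k ln(Omega), as in Eq. (8.3.1)."""
    return K_B * math.log(omega)


def information_bits(omega):
    """I = S / (k ln 2), as in Eq. (8.3.2); this equals log2(Omega)."""
    return entropy(omega) / (K_B * math.log(2))


def label_microstates(states):
    """Assign each microstate a distinct fixed-length binary string."""
    width = max(1, math.ceil(math.log2(len(states))))
    return {format(index, "0{}b".format(width)): state
            for index, state in enumerate(states)}


# Toy macrostate (an assumption for illustration): 10 spins, 4 of them up.
states = enumerate_microstates(10, 4)
omega = len(states)
bits = information_bits(omega)
labels = label_microstates(states)
print(omega, round(bits, 3), len(next(iter(labels))))
```

For this toy ensemble the information is a little under 8 bits, and strings of 8 bits are enough to label every microstate. Whole-bit strings round the information up, which is negligible for a macroscopic system.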

For the magnet, the amount of information contained in frozen variables is small. For the Ising model of a magnet (Section 1.6), below the magnetization transition only a single binary variable is necessary to specify if the system magnetization is UP or DOWN. We treat the magnet by giving the information about the magnetization explicitly as part of the ensemble description. The amount of information is insignificant compared to the information in the microstate of a system, and therefore is generally ignored.

In contrast, for a glass, the amount of information that is included in the frozen variables is large. How does this information relate to the thermodynamic treatment of the system? The conventional thermodynamic theory of phase transitions does not consider the existence of frozen information. It is designed for systems like the magnet, where this information is insignificant, and thus it does not apply to the glass transition. A different theory is necessary which includes the change from an ergodic to a nonergodic system, or a change from information in fast variables to information in frozen variables.

Is there any relationship between the frozen information and the entropy? If they are related at all, there are two intuitive possibilities. One is that we must specify the frozen variables as part of the ensemble, and the amount of information necessary to describe the fast variables is just as large as if there were no frozen variables. The other is that the frozen variables balance against the fast variables so that when there is more frozen information there is less information in the fast variables. In order to determine which is correct, we will need to consider an experiment that measures both. As long as an experiment is being performed in which the frozen variables never change, then the amount of information in the frozen variables is fixed. Thermodynamic experiments only depend on entropy differences. We will need to consider an experiment that changes the frozen variables, for example, heating up a glass until it becomes a liquid or cooling it from a liquid to a glass. In such an experiment the frozen information must be accounted for. The difficulty with a glass is that we do not have an independent way to determine the amount of frozen information. Fortunately, there is another system where we do.

There is an intermediate example between a magnet and a glass that is of considerable interest. The structure of ice has a glasslike frozen disorder of its hydrogen atoms below approximately 100 K. The simplest way to think about this disorder is that it arises from a choice of orientations of the water molecule around the position of the oxygen atom. This means that there is a macroscopic amount of information necessary to specify the static structure of ice. The amount of information associated with this disorder can be calculated directly using a model for the structure of ice that takes into account the correlations between molecular orientations that are needed to form a self-consistent hydrogen structure within the oxygen lattice. A first estimate is based on an average of 3/2 orientations per molecule, or C = Nk ln(3/2) = 0.806 cal/(mole K). A review of better calculations is given in a book by Fletcher. The best is C = 0.8145 ± 0.0002 cal/(mole K). The other calculation we need is the amount of entropy in steam. This can be obtained using a slight modification of the ideal gas calculation that takes into account the rotational and internal vibrational motion of the water molecule.
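The first estimate for the frozen disorder of ice, C = Nk ln(3/2), is easy to check numerically. The following is a minimal sketch, not the calculation from Fletcher's review: it assumes one mole of molecules (so Nk is the gas constant R), uses the thermochemical calorie (4.184 J), and the function names are made up for this example. It also converts both the first estimate and the quoted best value to bits per molecule, C/(Nk ln 2).

```python
import math

R_JOULE = 8.314462618             # gas constant, J/(mol K)
JOULE_PER_CAL = 4.184             # thermochemical calorie (assumed convention)
R_CAL = R_JOULE / JOULE_PER_CAL   # gas constant, cal/(mol K)


def frozen_entropy_per_mole(orientations_per_molecule):
    """C = N k ln(w) for one mole, in cal/(mol K), with w choices per molecule."""
    return R_CAL * math.log(orientations_per_molecule)


def bits_per_molecule(entropy_cal_per_mol_k):
    """Convert a molar entropy to bits per molecule: C / (N k ln 2)."""
    return entropy_cal_per_mol_k / (R_CAL * math.log(2))


first_estimate = frozen_entropy_per_mole(1.5)   # 3/2 orientations per molecule
best_value = 0.8145                             # cal/(mol K), value quoted in the text

print(round(first_estimate, 3))                     # first estimate, cal/(mol K)
print(round(bits_per_molecule(first_estimate), 3))  # bits per molecule
print(round(bits_per_molecule(best_value), 3))      # bits per molecule
```

Under these assumptions the first estimate reproduces the 0.806 cal/(mole K) quoted above, and either value corresponds to a little under 0.6 bits of frozen information per water molecule.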

The key experiment is to measure the change in the entropy of the system as a function of temperature as it is heated from ice all the way to steam. We find the entropy using the standard thermodynamic relationship (Section 1.3)

$$\delta q = T\,dS \qquad (8.3.3)$$

where $\delta q$ is the heat added to the system. At close to a temperature of zero degrees Kelvin (T = 0 K) the entropy is zero because all motion stops, and there is only one possible state of the system. Thus we would expect

$$S(T) = \int_0^T \delta q / T \qquad (8.3.4)$$

that is, the total amount of entropy added to the system as it is heated up should be the same as the entropy of the gas. However, experimentally there is a difference of 0.82 ± 0.05 cal/(mole K) between the two. This is the amount of entropy in the gas that was not added to the system as it was heated. The coincidence of two numbers, the amount of entropy missing and the calculation of the information in the frozen structure of the hydrogen atoms, suggests that the missing entropy was present in the original state of the ice.

$$S(T) = C(T = 0) + \int_0^T \delta q / T \qquad (8.3.5)$$

This in turn implies that the information in the frozen degrees of freedom was transferred (but conserved) to the fast degrees of freedom. Eq. (8.3.5) is not consistent with the standard thermodynamic relationship in Eq. (8.3.3). Instead it should be modified to read:

$$\delta q = T\,d(S + C) \qquad (8.3.6)$$

This should be understood as implying that adding heat to a system increases the information either of the fast or frozen variables. Adding heat (e.g., to ice) increases the temperature of the system, so that fewer variables are frozen. In this case C decreases and S increases more than would be given by the conventional relationship of Eq. (8.3.3). When heat is not added to a system, we see that there can be processes that change the number of fast degrees of freedom and the number of static degrees of freedom while leaving their sum the same. We will consider this further in later sections.

Eq. (8.3.6) is important enough to present it again from a different perspective. The discussion will help demonstrate its validity by using a theoretical argument (Fig. 8.3.1). Rather than considering it from the point of view of heating ice till it becomes steam, we consider what happens either to ice or to a glass when we cool it down through the transition where degrees of freedom become frozen. In a theoretical description we start, above the freezing-in transition, with an ensemble of systems. As we cool the system we remove heat, and this is reflected in a decrease in the number of possible states of the system. We think of this as a shrinking of the number of elements of the ensemble. However, as we go through the freezing-in transition, the


ensemble breaks up into disjoint pieces that can not make transitions to each other. Any particular material must be in one of the disjoint pieces. Thus for a particular material we must track only part of the original ensemble. For an incremental decrease in temperature due to an incremental removal of heat, the information needed to identify (describe) a particular microstate is the sum of the information necessary to describe which of the disjoint parts of the ensemble the system is in, plus the information needed to specify which of the microstates the system is in once its ensemble fragment has been specified. This is the meaning of Eq. (8.3.6). The information to specify the ensemble fragment was transferred from the entropy S to the ensemble information C. The reduction of the entropy, S, is not reflected in the amount of heat that is removed.

Figure 8.3.1 Schematic illustration of the effect on motion in phase space of cooling through a glass transition. Above the glass transition (T1, T2 and T3) the system is ergodic: it explores the entire phase space. Cooling the system causes the phase space to shrink smoothly. The entropy, the logarithm of the volume of phase space, decreases. Below the glass transition, T4, the system is no longer ergodic and the phase space breaks up into pieces. A particular system explores only one of the pieces. The total amount of information necessary to specify a particular microstate (e.g., indicated by the *) is the sum of C/k ln(2), the information necessary to specify which piece, and S/k ln(2), the information necessary to specify the particular state within the piece.

We are now in a position to give a first definition of complexity. In order to describe a system and its behavior over time, we must describe the ensemble it is in. This information is given by C/k ln(2). If we insist on describing the microstate of the system, we must add the information contained in the fast degrees of freedom S/k ln(2). The question is whether we should insist on describing the microstate. Typically, the

whole point of describing an ensemble is that we don’t need to specify the particular microstate. We will return to address this question in greater detail later. However, for now it is reasonable to consider describing the system to be specifying just the ensemble. This implies that the information in the frozen variables C/k ln(2) is the complexity. For a thermodynamic system in the microcanonical ensemble, the complexity would be given by the (small) number of bits in the specification of the three variables (U,N,V) and the number of bits necessary to specify the type of element (atom, molecule) that is present. The actual amount of information seems not to be precisely defined. For example, we have not identified the number of bits to be used in specifying (U,N,V). As we have seen in the discussion of algorithmic complexity, this is to be expected, since the conventions of how the information is specified are crucial when there is only a small amount.

We have learned from this discussion that for a nonergodic system, the complexity (the frozen ensemble information) is bounded by the sum over the number of fast and static degrees of freedom (C + S > C). For material systems, we know in principle how to measure this. As in the case of ice, we heat up the system to the vapor phase where the entropy can be calculated, then subtract the entropy added during the heating process. This gives us the value of C + S at the temperature from which the heating began. If we know that C >> S, then the result is the complexity itself. In order for this technique to work at all, the complexity must be large enough so that experimental accuracy can enable its measurement. Estimates we will give later imply that complexities of biological organisms are too small to be measured in this way.

The concept of frozen degrees of freedom immediately raises the question of the time scale in which the experiment is performed. Degrees of freedom that are frozen on one time scale are not on sufficiently longer ones. If our time scale of observation would be arbitrarily long, we would always describe systems in equilibrium. The entropy would then be large and the complexity would be negligible. On the other hand, if our time scale of observation was extremely short so that microscopic motions were detected, then our complexity would be large and the entropy would be negligible. This motivates the introduction of the complexity profile in Section 8.3.3.

Question 8.3.1 Calculate the information necessary to specify the microstate of a mole of an ideal gas at T = 0°C and P = 1 atm. Use the mass of a helium or neon atom for the mass of the ideal gas particle. This requires a careful investigation of units. A table of fundamental physical constants is given on the following page.

Solution 8.3.1 The entropy of an ideal gas is found in Section 1.3 to be:

$$S = kN\left[\ln\left(V / (N \lambda(T)^3)\right) + 5/2\right] \qquad (8.3.7)$$

$$\lambda(T) = \left(h^2 / 2\pi m k T\right)^{1/2} \qquad (8.3.8)$$

The information content of a microstate is given by Eq. (8.3.2).
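The numerical part of this calculation can be sketched as follows. This is an illustrative evaluation of Eqs. (8.3.7), (8.3.8) and (8.3.2) and not the worked solution from the text: it assumes SI values of the physical constants, standard atomic masses for helium and neon, and the ideal gas law PV = NkT to obtain V/N. The function names are invented for the example.

```python
import math

H = 6.62607015e-34        # Planck constant, J s
K_B = 1.380649e-23        # Boltzmann constant, J/K
N_A = 6.02214076e23       # Avogadro constant, 1/mol
AMU = 1.66053906660e-27   # atomic mass unit, kg


def thermal_wavelength(mass_kg, temperature_k):
    """lambda(T) = (h^2 / (2 pi m k T))^(1/2), Eq. (8.3.8)."""
    return math.sqrt(H ** 2 / (2.0 * math.pi * mass_kg * K_B * temperature_k))


def bits_per_particle(mass_amu, temperature_k, pressure_pa):
    """S / (N k ln 2) from Eq. (8.3.7), with V/N taken from PV = NkT (assumed)."""
    volume_per_particle = K_B * temperature_k / pressure_pa
    lam = thermal_wavelength(mass_amu * AMU, temperature_k)
    entropy_over_nk = math.log(volume_per_particle / lam ** 3) + 2.5
    return entropy_over_nk / math.log(2)


T = 273.15      # 0 degrees C in kelvin
P = 101325.0    # 1 atm in pascal

helium_bits = bits_per_particle(4.002602, T, P)
neon_bits = bits_per_particle(20.1797, T, P)

print(round(helium_bits, 1), round(neon_bits, 1))   # bits per atom
print(helium_bits * N_A, neon_bits * N_A)           # bits per mole
```

With these assumptions the result is on the order of 20 to 25 bits per atom, or roughly 10^25 bits for a mole of gas. The heavier neon atom has a shorter thermal wavelength and therefore requires a few more bits per atom than helium.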

8.3.2 Algorithmic complexity of physical systems
The complexity of a system is designed to measure the amount of information necessary to describe it, or its behavior. In this section we address the key word “necessary.” This word suggests that we are after the minimum amount of information. The minimum amount of information depends on our capabilities of inference from a smaller amount of information. As discussed in Section 8.2, logical inference and computation lead to the definition of algorithmic complexity. Since we have established a connection between the complexity of physical systems and representations in terms of character strings, we can apply these results directly to physical systems.

A physical system in equilibrium is represented by an ensemble. At any particular time, it is in a single microstate. The specification of this microstate can be compressed by encoding in certain rare cases. However, on average the compression cannot lead to an amount of information significantly different from the entropy (divided by k ln(2)) of the system. This conclusion follows because the microcanonical (or canonical) ensemble can be concisely described. Thus, for an ensemble that can be described simply, the algorithmic complexity is no different than the Shannon information.

For a nonergodic system like a glass, the microstate description has been separated into two parts. It is no longer true that the ensemble of dynamically accessible states of a particular system is concisely describable. The information in the frozen degrees of freedom is precisely the information necessary to specify the ensemble of dynamically accessible states. The total information, (C + S)/k ln(2), represents the selection of a microstate from a simple ensemble (microcanonical or canonical). Since the total information cannot be compressed, neither can either of the two parts of the information: the frozen degrees of freedom that we have identified with the complexity, or the additional information necessary to specify a particular microstate. Thus the algorithmic complexity is the same as the information for either part.

We can now, finally, explain the experimental observation that an adiabatic process does not change the entropy of a system (Section 1.3). Our argument that an adiabatic process does not change the entropy is based on considering the information necessary to describe an adiabatic process, such as slowly moving a piston to expand the space available to a gas. If a new microstate of the system can be described by the original microstate plus the process of adiabatic change, then the amount of information in the microstate has not been changed, and the adiabatic process does not change the microstate algorithmic complexity, which is the entropy of the system. Like other aspects of statistical mechanics (Section 1.3), this should not be understood as a proof but rather as an explanation of the relationship of the thermodynamic observation to the microscopic properties.

Using this explanation, we can identify the nature of an adiabatic process as one that is described microscopically by a small amount of information. This becomes clearer when we compare adiabatic and irreversible processes. The algorithmic description of an adiabatic process requires only a few pieces of information, e.g., the size of a force applied over a specified distance. An irreversible process could achieve a similar expansion, but would not be thermodynamically the same. Take, for example,
The complexity of a system is designed to measure the amount of information necessary to describe it. explain the experimental observation that an adiabatic process does not change the entropy of a system (Section 1. If a new microstate of the system can be described by the original microstate plus the process of adiabatic change.the microstate description has been separated into two parts. We can now. At any particular time.” This word suggests that we are after the minimum amount of information.3. or its behavior.

the removal of a partition that separates the gas from a second, initially empty, chamber. The irreversible process of expansion of the gas results in a final state which has a higher entropy (see Question 1.3.4). The removal of a partition in itself does not appear to require a lot of information to describe. One moment after the partition is removed, the entropy of the system is the same as before. To understand how the entropy increases, we must consider the nature of irreversible dynamics.

A key ingredient in our understanding of physical systems is that the time evolution of an isolated system can be obtained from the simple laws of mechanics (classical or quantum). This means that the dynamics of an isolated system conserves the amount of information as well as the energy. Such dynamics are called conservative. If we consider an ensemble of systems starting in a particular region of phase space, the phase space position evolves in time, but the volume of the phase space that is occupied (the entropy) does not change. This conservation of phase space can be understood from our discussion of algorithmic complexity: since the deterministic dynamics of a system can be computed, the algorithmic complexity of the system is conserved. Where does the additional entropy come from for the final equilibrium state after the expansion?

There are two parts to the process of proceeding to a true equilibrium state. In the first part the distinction between the nonequilibrium and equilibrium state is obscured. At first there is macroscopically observable information: the particles are in one half of the chamber. This information is converted to microscopic correlations between atomic positions and momenta. The conversion occurs when the gas expands to fill the chamber, and various currents that follow this expansion become smaller and smaller in extent. The microscopic correlations cannot be observed on a macroscopic scale, and for standard observations the system is indistinguishable from an equilibrium state. The transfer of information from macroscopic to microscopic scale is related to issues of chaos in the dynamics of physical systems, which will be discussed later.

The second part to the process is an actual increase in the entropy of the system. The additional entropy must come from outside the system. In macroscopic physical processes, we are not generally concerned with isolating the system from information transfer, only with isolating the system from energy transfer. Thus we can surmise that the expansion of the gas is followed by an information transfer that enables the entropy to increase to its equilibrium value without changing the energy of the system.

Many of the issues related to describing this nonequilibrium process will not be addressed here. We will, however, begin to address the topic of the scale of observation at which correlations appear using the complexity profile in the following section.

8.3.3 Complexity profile
General approach In this section we discuss the relationship of microscopic and macroscopic complexity. Our objective is to develop a consistent language for discussing complexity as a function of length scale. In the following section we will discuss the complexity as a function of time scale, which generalizes the discussion of frozen and fast degrees of freedom in Section 8.3.1.


When we describe a system, we are not generally interested in a microscopic description of the positions and velocities of all of the particles. For a thermodynamic system there are only a few macroscopic parameters that we use to describe the system. This is indeed the reason we use entropy as a summary of the many hidden parameters of the system that we are not interested in. The microscopic parameters change too fast and over too small distances to matter for our macroscopic measurements/experience. The same is true more generally about systems that are not in equilibrium: a macroscopic description does not require specifying the position of each atom. This implies that we must develop an understanding of complexity that is not tied to the microscopic description, but is relevant to observations at a particular length and time scale. This point lies at the root of a conceptual problem in thinking about the complexity of systems. A gas in equilibrium has a large entropy which is its microscopic complexity. This is counter to our understanding of complex systems. Systems in equilibrium are intuitively simpler than nonequilibrium systems such as a human being. In Section 8.3.1 we started to address this problem by identifying the complexity of a nonergodic system as the information necessary to specify the frozen degrees of freedom. We now discuss a more systematic approach to dealing with macroscopic observations. In order to consider the macroscopic complexity, we have to define what we mean by macroscopic in a formal sense. The concept of macroscopic must be understood in relation to a particular observer. While we often consider experimental results to be independent of the observer, there are various ways in which the observer is essential to the observation. In this context, in which we are concerned with the meaning of macroscopic, considering the observer is essential. How do we characterize the difference between a microscopic and a macroscopic observer? 
The most crucial difference is that a microscopic observer is able to distinguish between all inherently distinguishable states of the system, while a macroscopic observer cannot. For a macroscopic observer, many microscopically distinct states appear the same. This is related to our understanding of complexity, because the macroscopic observer need only specify which of the macroscopically distinct states the system is in. The microscopic observer must specify which of the microscopically distinct states the system is in. Thus the macroscopic complexity must always be smaller than the microscopic complexity of a system. Instead of considering a unique macroscopic observer, we will consider a sequence of observers with a progressively poorer ability to distinguish microstates. Using these observers, we will define the complexity profile.

Ideal gas These ideas can be directly applied to the ideal gas. We generally think about a macroscopic observer as having an inability to distinguish fine-scale distances. Thus we expect that the usual uncertainty in particle position $\Delta x$ will increase for a macroscopic observer. However, we learn from quantum mechanics that a unique microstate of the system is defined using an uncertainty in both position and momentum, $\Delta x \Delta p = h$. Thus for the macroscopic observer to confuse distinct microstates, the product $\Delta x \Delta p$ must be larger than its minimum value: an observation of the system provides measurements of the position and momentum of each particle, whose uncertainty has a product greater than $h$. We can label our observers by this uncertainty, which we call $\tilde{h}$.


If we retrace our steps to the calculation of the entropy of an ideal gas (Question 1.3.2), we can recognize that essentially the same calculation applies to the complexity with the uncertainty $\tilde{h}$. An observer with the uncertainty $\tilde{h}$ will determine the complexity of the ideal gas according to Eq. (8.3.7) and Eq. (8.3.8), with $h$ replaced by $\tilde{h}$. Thus we define the complexity profile for the ideal gas in equilibrium as:

$$C(\tilde{h}) = S - 3kN \ln(\tilde{h}/h), \qquad \tilde{h} > h \tag{8.3.17}$$
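As a rough numerical illustration of Eq. (8.3.17), the following sketch evaluates the profile in units of $k$ per particle, so that $C/(kN) = s - 3\ln(\tilde{h}/h)$ with $s = S/(kN)$. The function names and the value $s = 12$ are assumptions made for this example only; they are not taken from Question 8.3.1.

```python
import math


def complexity_per_particle(s, h_ratio):
    """Return C/(kN) = s - 3 ln(h_ratio), following Eq. (8.3.17).

    s       -- entropy per particle in units of k, S/(kN)
    h_ratio -- observer uncertainty relative to Planck's constant, h~/h (>= 1)
    """
    if h_ratio < 1.0:
        raise ValueError("h_ratio = h~/h must be at least 1")
    return s - 3.0 * math.log(h_ratio)


def zero_crossing(s):
    """Return the value of h~/h at which Eq. (8.3.17) reaches zero."""
    return math.exp(s / 3.0)


if __name__ == "__main__":
    s = 12.0  # hypothetical entropy per particle, for illustration only
    for ratio in (1.0, 10.0, 100.0):
        print(ratio, complexity_per_particle(s, ratio))
    print("zero crossing at h~/h =", zero_crossing(s))
```

Because the coefficient of the logarithm is three per particle, the zero crossing sits at $\tilde{h}/h = e^{s/3}$, which is a modest number for any reasonable entropy per particle.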

This equation describes a complexity that decreases as the ability of the observer to distinguish states decreases. This is as we expected. Despite the weak logarithmic dependence on $\tilde{h}$, $C(\tilde{h})$ decreases rapidly because the coefficient of the logarithm is so large. By the time $\tilde{h}$ is about 100 times $h$ the complexity profile has become negative for the ideal gases described in Question 8.3.1.

What does a negative complexity mean? It actually means that we have not done the calculation quite right. The counting of states we did for the ideal gas assumed that the particles were well separated from each other. If they begin to overlap then we must count the possible states differently. This overlap is significant precisely when Eq. (8.3.17) becomes negative. If the particles really overlapped then quantum statistics becomes important; the gas is said to be degenerate and satisfies either Fermi-Dirac or Bose-Einstein statistics. In our case the overlap arises only because the observer cannot distinguish different particle positions. In this case, the counting of states is appropriate to a classical ideal gas, as we now explain.

To calculate the complexity as a function of $\tilde{h}$ for an equilibrium state whose entropy is $S$, we start by calculating the number of microstates that the observer cannot distinguish. The logarithm of this number of microstates, which we call $S(\tilde{h})/k\ln(2)$, is the amount of information necessary to specify a microstate, if the macrostate is known. Thus we have that:

$$C(\tilde{h}) = S - S(\tilde{h}) \tag{8.3.18}$$

To count the number of microstates that the observer cannot distinguish, we note that the possible microstates of a particular particle are grouped together by the observer into bins (regions or cells of position and momentum) of size $(\Delta x \Delta p)^d = \tilde{h}^d$, where $d = 3$ is the dimensionality of space. The observer determines only that a particle is within a certain region. In the classical ideal gas each particle moves independently, so more than one particle may occupy the same microstate. However, this is unlikely. As $\tilde{h}$ increases it becomes increasingly likely that there is more than one particle in a region. If the number of particles in a certain region is $n_i$, then the number of distinct microstates of the bin that the observer does not distinguish is:

$$\frac{g^{n_i}}{n_i!} \tag{8.3.19}$$

where $g = (\tilde{h}/h)^d$ is the number of microstates within a region. This is the product of the number of states each particle may be in, corrected for particle indistinguishability. The number of microstates of the whole system that appear to the observer to be the same is the product of such terms for each region:

$$\prod_i \frac{g^{n_i}}{n_i!} \tag{8.3.20}$$

From this we can determine the complexity of the state determined by the observer as:

$$C(\tilde{h}) = S - S(\tilde{h}) = S - k \ln\Big(\prod_i \frac{g^{n_i}}{n_i!}\Big) \tag{8.3.21}$$
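The counting in Eq. (8.3.21) can be evaluated directly for a given list of bin occupancies. The sketch below is one possible way to do this, working in units of $k$ and using the log-gamma function for $\ln(n_i!)$; the function names and the sample occupancies are assumptions chosen for illustration.

```python
import math


def unresolved_entropy_over_k(occupancies, g):
    """Return S(h~)/k = ln prod_i (g^{n_i} / n_i!) for the given bin occupancies.

    occupancies -- iterable of non-negative integers n_i, one per bin
    g           -- number of microstates per bin, g = (h~/h)^d
    """
    # lgamma(n + 1) equals ln(n!), which avoids overflow for large n.
    return sum(n * math.log(g) - math.lgamma(n + 1) for n in occupancies)


def complexity_over_k(entropy_over_k, occupancies, g):
    """Return C(h~)/k = S/k - S(h~)/k, following Eq. (8.3.21)."""
    return entropy_over_k - unresolved_entropy_over_k(occupancies, g)


if __name__ == "__main__":
    # Three particles, each alone in its bin, observed with h~/h = 2 in d = 3.
    print(unresolved_entropy_over_k([1, 0, 1, 1], g=2.0 ** 3))
    # The same value from 3 N ln(h~/h), the reduction term in Eq. (8.3.17).
    print(3 * 3 * math.log(2.0))
```

When every occupied bin holds a single particle, the factorials equal one and the reduction is $N \ln g = 3N\ln(\tilde{h}/h)$, which is the term subtracted in Eq. (8.3.17).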

If we consider this expression when $g = 1$ (a microscopic observer) then $n_i$ is almost always either zero or one and each term in the product is one (a more exact treatment requires treating the statistics of a degenerate gas). Then $C(\tilde{h})$ is $S$, which means that the microstate complexity is just the entropy. For $g > 1$ but not too large, $n_i$ will still be either zero or one, and we recover Eq. (8.3.17). On the other hand, using this expression it is possible to show that for a large value of $g$, when the values of $n_i$ are significantly larger than one, the complexity goes to zero. We can understand this by recognizing that as $g$ increases, the number of particles in each bin increases and becomes closer to the average number of particles in a bin according to the macroscopic probability distribution. This is the equilibrium macrostate. By our conventions we are measuring the amount of information necessary for the observer to specify its observation in relation to the equilibrium state. Therefore, when the average number of particles in a bin becomes close enough to this distribution, there is no information that must be given. To write this explicitly, when $n_i$ is much larger than one we apply Stirling's approximation to the factorial in Eq. (8.3.21) to obtain:

$$C(\tilde{h}) = S - k \sum_i n_i \left( \ln(g/n_i) + 1 \right) = S + k g \sum_i P_i \ln(P_i) - kN \tag{8.3.22}$$

where $P_i = n_i/g$ is the probability a particle is in a particular state according to the observer. It is shown in Question 8.3.2 that $C(\tilde{h})$ is zero when $P_i$ is the equilibrium probability for finding a particle in region $i$ (note that $i$ stands for both position and momentum $(x,p)$). There are additional smaller terms in Stirling's approximation to the factorial that we have neglected. These terms are generally ignored in calculations of the entropy because they are not proportional to the number of particles. They are, however, relevant to calculations of the complexity:

$$C(\tilde{h}) = S - k \sum_i n_i \left( \ln(g/n_i) + 1 \right) + k \sum_i \ln\left(\sqrt{2\pi n_i}\right) \tag{8.3.23}$$
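To see how much the extra logarithmic term matters, the per-bin quantity $\ln(g^{n}/n!)$ can be compared with its leading Stirling form, as used in Eq. (8.3.22), and with the corrected form, as used in Eq. (8.3.23). The following is a minimal sketch; the function names and the sample values of $n$ and $g$ are assumptions for illustration.

```python
import math


def log_term_exact(n, g):
    """Exact ln(g^n / n!) for one bin, using lgamma(n + 1) = ln(n!)."""
    return n * math.log(g) - math.lgamma(n + 1)


def log_term_leading(n, g):
    """Leading Stirling form n (ln(g/n) + 1), as in Eq. (8.3.22)."""
    return n * (math.log(g / n) + 1.0)


def log_term_corrected(n, g):
    """Stirling form including the -ln(sqrt(2 pi n)) term, as in Eq. (8.3.23)."""
    return log_term_leading(n, g) - 0.5 * math.log(2.0 * math.pi * n)


if __name__ == "__main__":
    n, g = 50, 200.0  # hypothetical occupancy and states per bin
    print("exact    ", log_term_exact(n, g))
    print("leading  ", log_term_leading(n, g))
    print("corrected", log_term_corrected(n, g))
```

The leading form misses the exact value by roughly $\ln\sqrt{2\pi n}$ per bin, while the corrected form agrees to within a remainder of order $1/(12n)$.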

The additional terms are related to fluctuations in the density. This will become apparent when we analyze nonuniform systems below. We will discuss additional examples of the complexity profile below. First we simplify the complexity profile for observers that measure only the positions and not the momenta of particles.


Question 8.3.2 Show that Eq. (8.3.22) is zero when $P_i$ is the equilibrium probability of locating a particle in a particular state identified by momentum $p$ and position $x$. For simplicity assume that all $g$ states in the cell have essentially the same position and momentum.

Solution 8.3.2 We calculate an expression for $P_i \to P(x,p)$ using Boltzmann probability for a single particle (since all are independent):

$$P(x,p) = N Z^{-1} e^{-p^2/2mkT} \tag{8.3.24}$$

where $Z$ is the one particle partition function given by:

$$Z = \sum_{x,p} e^{-p^2/2mkT} = \int \frac{d^3x\, d^3p}{h^3}\, e^{-p^2/2mkT} = \frac{V}{\lambda^3} \tag{8.3.25}$$

with $\lambda$ the thermal wavelength of Eq. (8.3.8). We evaluate the expression:

$$-k \sum_i g\, P(x,p) \ln(P(x,p)) + kN \tag{8.3.26}$$

which, by Eq. (8.3.22), we want to show is the same as the entropy. Since all $g$ states in cell $i$ have essentially the same position and momentum, this is equal to:

$$-k \sum_{x,p} P(x,p) \ln(P(x,p)) + kN = k \sum_{x,p} \left[ \ln(V/N\lambda^3) + p^2/2mkT \right] \frac{N\lambda^3}{V}\, e^{-p^2/2mkT} + kN \tag{8.3.27}$$

which is most readily evaluated by recognizing it, with $\beta = 1/kT$, as:

$$kN + kN Z^{-1} \left[ \ln(V/N\lambda^3) - \beta \frac{d}{d\beta} \right] Z = kN \left[ \ln(V/N\lambda^3) + 5/2 \right] \tag{8.3.28}$$

where the derivative acts only on $Z \propto \beta^{-3/2}$, so that $-\beta\, dZ/d\beta = (3/2) Z$. The result is $S$ as given in Eq. (8.3.7).

Position without momentum The use of the scale parameter $\Delta x \Delta p$ in the above discussion should trouble us, because we do not generally consider the momentum uncertainty on the macroscopic scale. The resolution of this problem arises because we have assumed that the system has a known energy or temperature. If we know the temperature then we know the thermal velocity or momentum:

$$\Delta p \approx \sqrt{mkT} \tag{8.3.29}$$

It does not make sense to have a momentum uncertainty of a particle that is much greater than this. Using $\Delta x \Delta p = h$ this means there is also a natural uncertainty in position which is the thermal wavelength given by Eq. (8.3.8). This is the maximal quantum position uncertainty, unless the observer can distinguish the thermal motion of individual particles. We can now think about a sequence of observers who do not distinguish the momentum of particles (they have a larger uncertainty than the thermal momentum) but have increasing uncertainty in position given by $L = \Delta x$, or $g = (L/\lambda)^d$. For such observers the equilibrium momentum probability distribution

is to be assumed. In this case the number of particles in a cell $n_i$ contributes a term to the entropy that is equal to the entropy of a gas with this many particles in the volume $L^d$. This gives a total entropy of:

$$S(L) = k \sum_i n_i \left( \ln(g/n_i) + 5/2 \right) \tag{8.3.30}$$

and the complexity is:

$$C(L) = S - k \sum_i n_i \left( \ln\big(L^d/(n_i \lambda^3)\big) + 5/2 \right) \tag{8.3.31}$$

which differs in form from Eq. (8.3.22) only in the constant.

While we generally do not think about measuring momentum, we do measure velocity. This follows from the content of the previous paragraph. We consider observers that measure particle positions at different times and from this they may infer the velocity and indirectly the momentum. Since the observer measures $n_i$, the determination of velocity depends on the observer's ability to distinguish moving spatial density variations. Thus we consider the measurement of $n(x,t)$, where $x$ has macroscopic meaning as a granular coordinate that has discrete values separated by $L$. We emphasize, however, that this description of a space- and time-dependent density assumes that the local momentum distribution of the system is consistent with an equilibrium ensemble. The more fundamental description is given by the distribution of particle positions and momenta, $n_i = n(x,p)$. Thus, for example, we can also describe a rotating disk that has no macroscopic changes in density over time, but the rotation is still macroscopic. We can also describe fluid flow in an incompressible fluid. In this section we continue to restrict ourselves to the description of observations at a particular time. The time dependence of observations will be considered in Section 8.5.

Thus far we have considered systems that are in generic states selected from the equilibrium ensemble. Equilibrium systems are uniform on all but very microscopic scales, unless we are exactly at a phase transition. Thus, most of the complexity disappears on a scale that is far smaller than typical macroscopic observations. This is not necessarily true about nonequilibrium systems. Systems that are in states that are far from equilibrium can have nonuniform densities of particles. A macroscopic observer will see these macroscopic variations. We will consider a couple of different examples of nonequilibrium states to illustrate some properties of the complexity profile. Before we do this we need to consider the effect of algorithmic compression on the complexity profile.

Algorithmic complexity and error To discuss macroscopic complexity more completely, we turn to algorithmic complexity as a function of scale. The complexity of a system, particularly a nonequilibrium system, should be defined in terms of the algorithmic complexity of its description. This means that patterns that are present in the positions (or momenta) of its particles can be used to simplify the description. Using this discussion we can reformulate our understanding of the complexity profile. We defined the profile using observers with progressively poorer ability to distinguish microstates. The fraction of the ensemble occupied by these states defined

the complexity. Using an algorithmic perspective we say, equivalently, that the observer cannot distinguish the true state from a state that has a smaller algorithmic complexity. We illustrate this by an example. Let us label the single particle states using an index that enumerates them. We can then imagine a checkerboard (in six dimensions of position and momentum) where odd indexed states are black and even ones are white. An observer with a value of $g = 2$ cannot distinguish which of two states each particle occupies in the real microstate. The observer cannot tell if a particle is in a black or a white state. Thus, no matter what the real state is, there is a simpler state where only odd (or only even) indexed states of the particles are occupied, which cannot be distinguished from the real system by the observer. The algorithmic complexity of this state with particles in odd indexed states is essentially the complexity that we determined above, $C(g = 2)$: it is the information necessary to specify this state out of all the states that have particles only in odd indexed states. Thus, in every case, we can specify the complexity of the system for the observer as the complexity of the simplest state that is consistent with the observations. By Occam's razor, this is the state that the observer will use to describe the system.

We note that this is also equivalent to defining the complexity profile as the length of the description as the error allowed in the description increases. We can define the complexity profile as a function of the number of errors that are made. This is better than using a particular length scale, which implies a different error for particles of different mass as indicated by Eq. (8.3.8). The total error as a function of $g$ for the ideal gas is

$$\frac{1}{2} \log\Big( \prod_i \Delta x_i \Delta p_i / h \Big) = \frac{1}{2} N \log(g) \tag{8.3.32}$$

where $N$ is the number of particles in the system. The factor of 1/2 arises because the average error is half of the maximum error that could occur. This approach is helpful since it suggests how to generalize the complexity profile for systems that have different types of particles. For conceptual simplicity, we will continue to write the complexity profile as a function of $g$ or of length scale.

Nonequilibrium states Our next objective is to consider nonequilibrium states. When we have a nonequilibrium state, the microstate of the system is simpler than an equilibrium state to begin with. As we mentioned at the end of Section 8.3.2, there are nonequilibrium states that cannot be distinguished from equilibrium states on a macroscopic scale. These nonequilibrium states have microscopic correlations. Thus, the microscopic complexity is lower than the equilibrium entropy, while the macroscopic complexity is the same as in equilibrium:

$$C(g) < C_0(g) = S_0 \quad (g = 1), \qquad C(g) = C_0(g) \quad (g \gg 1) \tag{8.3.33}$$

where we use the subscript 0 to indicate quantities of the equilibrium state. Using the indexing of single particle states we just introduced, we take a microstate where all particles are in odd indexed states. The microstate complexity is the same as that of an equilibrium state at $g = 2$, which is less than the entropy of the equilibrium system:

$$C(g = 1) = C_0(g = 2) < C_0(g = 1)$$

However, the complexity of this system for scales of observation $g \ge 2$ is the same as that of an equilibrium system: macroscopic observers do not distinguish them.

This scenario, where the complexity of a nonequilibrium state starts smaller but then quickly becomes equal to the equilibrium state complexity, does not always hold. It is true that the microscopic complexity must be less than or equal to the entropy of an equilibrium system, and that all systems have the same complexity when $L$ is the size of the system. However, what we will show is that the complexity of a nonequilibrium system can be higher than that of the equilibrium system at large scales that are smaller than the size of the system. This is apparent in the case, for example, of a nonuniform density at large scales.

To illustrate what happens for such a nonequilibrium state, we consider a system that has nonuniformity that is characteristic of a particular length scale $L_0$, which is significantly larger than the microscopic scale but smaller than the size of the system. This means that $n_i$ is smooth on finer scales, and there is no particular relationship between what is going on in one region of length scale $L_0$ and another. The values of $n_i$ will be taken from a Gaussian distribution around the equilibrium value $n_0$ with a standard deviation of $\sigma$. We assume that $\sigma$ is larger than the natural density fluctuations, which have a standard deviation of $\sigma_0 = \sqrt{n_0}$. For convenience we also assume that $\sigma$ is much smaller than $n_0$.

We can calculate both the complexity $C(L)$, and the apparent entropy $S(L)$ for this system. We start by calculating them at the scale $L_0$. $C(L_0)$ is the amount of information necessary to specify the density values. This is the product of the number of cells $V/L_0^d$ times the information in a number selected from a Gaussian distribution of width $\sigma$. From Question 8.3.3 this is:

$$C(L_0) = k \frac{V}{L_0^d} \left( \frac{1}{2}\big(1 + \ln(2\pi)\big) + \ln(\sigma) \right) \tag{8.3.34}$$

The number of microstates consistent with this macrostate at $L_0$ is given by the sum of ideal gas entropies in each region:

$$S(L_0) = -k \sum_i n_i \ln(n_i/g) + (5/2) kN \tag{8.3.35}$$

Since $\sigma$ is less than $n_0$, this can be evaluated by expanding to second order in $\delta n_i = n_i - n_0$:

$$S(L_0) = S_0 - k \sum_i \frac{(\delta n_i)^2}{2 n_0} = S_0 - \frac{k V \sigma^2}{2 L_0^d n_0} \tag{8.3.36}$$

where $S_0$ is the entropy of the equilibrium system, and we used $\langle \delta n_i^2 \rangle = \sigma^2$. We note that when $\sigma = \sigma_0$ the logarithmic terms in the complexity reduce to the extra terms

found in Eq. (8.3.23). Thus, these terms are the information needed to describe the equilibrium fluctuations in the density.

We can understand the behavior of the complexity profile of this system. By construction, the minimum amount of information needed to specify the microstate is $C(\lambda) = S(L_0) + C(L_0)$. This is the sum over the entropy of equilibrium gases with densities $n_i$ in volumes $L_0^d$, plus $C(L_0)$. Since $S(L_0)$ is linear in the number of particles, while $C(L_0)$ is logarithmic in $\sigma$ and therefore logarithmic in the number of particles, we conclude that $C(L_0)$ is much smaller than $S(L_0)$. For $L > \lambda$ the complexity profile $C(L)$ decreases like that of an equilibrium ideal gas. The term $S(L_0)$ is eliminated at a microscopic length scale larger than $\lambda$ but much smaller than $L_0$. However, $C(L_0)$ remains. Due to this term the complexity crosses that of an equilibrium gas to become larger. For length scales up to $L_0$ the complexity is essentially constant and equal to Eq. (8.3.34). Above $L_0$ it decreases to zero as $L$ continues to increase by virtue of the effect of combining the different $n_i$ into fewer regions. Combining the regions results in a Gaussian distribution with a standard deviation that decreases as the square root of the number of terms, $\sigma \to \sigma (L_0/L)^{d/2}$. Thus, the complexity and entropy profiles for $L > L_0$ are:

$$C(L) = k \frac{V}{L^d} \left( \frac{1}{2}\big(1 + \ln(2\pi)\big) + \ln\big(\sigma (L_0/L)^{d/2}\big) \right), \qquad S(L) = S_0 - \frac{k V \sigma^2}{2 (L L_0)^{d/2} n_0} \tag{8.3.37}$$

This expression continues to be valid until there is only one region left, and the complexity goes to zero. The precise way the complexity goes to zero is not described by Eq. (8.3.37), since the Gaussian distribution does not apply in this limit.

There are several comments that we can make that are relevant to understanding complexity profiles in general. First we see that in order for the macroscopic complexity to be higher than that in equilibrium, the entropy at the same scale must be reduced, $S(L) < S_0$. This is necessary because the sum $S(L) + C(L)$, the total information necessary to specify a microstate, cannot be greater than $S_0$. However, we also note that the reduction in $S(L)$ is much larger than the increase in $C(L)$. The ratio between the two is given by:

$$\frac{\Delta S(L)}{\Delta C(L)} = -\frac{\sigma^2 L^{d/2}}{2 n_0 L_0^{d/2}} \, \frac{1}{\ln(\sigma/\sigma_0)} \tag{8.3.38}$$

For $\sigma > \sigma_0 = \sqrt{n_0}$ this is greater than one in magnitude. We can understand this result in two ways. First, a complex macroscopic system must be far from equilibrium, and therefore must have a much smaller entropy than an equilibrium system. Second, a macroscopic observer makes many errors in determining the microstate, and therefore if the microstate is similar to an equilibrium state, the observer cannot distinguish the two and the macroscopic properties must also be similar to an equilibrium state. For every bit of information that distinguishes the macrostate, there must be many bits of difference in the microstate.
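The profiles of the nonuniform gas for $L > L_0$ and the ratio between the entropy reduction and the complexity increase can be checked numerically. The sketch below works in units of $k$ and assumes the forms $C(L) = k (V/L^d)\big(\tfrac{1}{2}(1+\ln 2\pi) + \ln(\sigma (L_0/L)^{d/2})\big)$ and $S_0 - S(L) = kV\sigma^2 / \big(2 (L L_0)^{d/2} n_0\big)$ from Eq. (8.3.37); the function names and all parameter values are assumptions chosen only for illustration.

```python
import math


def complexity_over_k(L, L0, V, sigma, d=3):
    """C(L)/k for L >= L0, using sigma -> sigma (L0/L)^(d/2) as in Eq. (8.3.37)."""
    if L < L0:
        raise ValueError("this form applies only for L >= L0")
    sigma_at_L = sigma * (L0 / L) ** (d / 2)
    return (V / L ** d) * (0.5 * (1.0 + math.log(2.0 * math.pi)) + math.log(sigma_at_L))


def entropy_reduction_over_k(L, L0, V, sigma, n0, d=3):
    """(S0 - S(L))/k for L >= L0, from the entropy profile in Eq. (8.3.37)."""
    return V * sigma ** 2 / (2.0 * (L * L0) ** (d / 2) * n0)


def ratio_formula(L, L0, sigma, n0, d=3):
    """Closed-form ratio of entropy change to complexity change, Eq. (8.3.38)."""
    sigma0 = math.sqrt(n0)  # equilibrium density fluctuation
    return -(sigma ** 2 / (2.0 * n0)) * (L / L0) ** (d / 2) / math.log(sigma / sigma0)


if __name__ == "__main__":
    # Hypothetical parameters with sigma0 < sigma << n0.
    L, L0, V, n0, sigma = 20.0, 10.0, 1.0e6, 1.0e4, 500.0
    delta_c = complexity_over_k(L, L0, V, sigma) - complexity_over_k(L, L0, V, math.sqrt(n0))
    delta_s = -entropy_reduction_over_k(L, L0, V, sigma, n0)
    print("ratio from profiles:", delta_s / delta_c)
    print("ratio from formula: ", ratio_formula(L, L0, sigma, n0))
```

Here the complexity increase is measured relative to a gas with only equilibrium fluctuations, $\sigma = \sigma_0$, observed at the same scale, so the constant terms cancel and only $\ln(\sigma/\sigma_0)$ remains in the denominator.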

In calculating the complexity of the system at a particular scale, we assumed that the observer was in error in obtaining the position and momentum of each particle. However, we assumed that the number of particles within each bin was determined exactly. Thus the complexity we calculated is the information necessary to specify the number of particles precise to the single particle. This is why even the equilibrium density fluctuations were described. An alternative, more reasonable, approach assumes that particle counting is also subject to error. For simplicity we can assume that the error is a fraction of the number of particles counted. For macroscopic systems this fraction is much larger than the equilibrium fluctuations, which therefore need not be described. This approach also modifies the form of the complexity profile of the nonuniform gas in Eq. (8.3.37). The error in measurement increases as $m_0(L) \propto L^d$ with the scale of observation. Letting $m_0(L)$ be the error in a measurement of particle number, we write:

$$C(L) = k \frac{V}{L^d} \left( \frac{1}{2}\big(1 + \ln(2\pi)\big) + \ln\frac{\sigma L_0^{d/2}}{m_0(L)\, L^{d/2}} \right) \approx k \frac{V}{L^d} \ln\frac{\sigma L_0^{3d/2}}{m_0(L_0)\, L^{3d/2}} \tag{8.3.39}$$

The consequence of this modification is that the complexity decreases somewhat more rapidly as the scale of observation increases. The expression for the entropy in Eq. (8.3.37) is unchanged.

Question 8.3.3 What is the information in a number (character) selected from a Gaussian distribution of standard deviation $\sigma$?

Solution 8.3.3 Starting from a Gaussian distribution (Eq. 1.2.39),

$$P(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-x^2/2\sigma^2} \tag{8.3.40}$$

we calculate the information (Eq. 8.2.2):

$$I = -\int dx\, P(x) \log(P(x)) = \int dx\, P(x) \left[ \log(\sigma\sqrt{2\pi}) + \log(e)\, x^2/2\sigma^2 \right] = \log(\sigma\sqrt{2\pi}) + \log(e)/2 \tag{8.3.41}$$

where the logarithm is base 2, so that $\log(e) = 1/\ln(2)$, and the second term in the integral can be evaluated using $\langle x^2 \rangle = \sigma^2$. Multiplying by $k\ln(2)$ gives $k\big(\tfrac{1}{2}(1 + \ln(2\pi)) + \ln(\sigma)\big)$, the form used in Eq. (8.3.34). We note that this result is to be interpreted as the information in a discrete distribution of integral values of $x$, like a random walk, that in the limit of large $\sigma$ gives a Gaussian distribution. The units that are used to measure $\sigma$ define the precision to which the values of $x$ are to be described. It thus makes sense that the information to specify an integer of typical magnitude $\sigma$ is essentially $\log(\sigma)$.

8.3.4 Time dependence: chaos and the complexity profile
General approach In describing a system, we are interested in macroscopic observations over time, $n(x,t)$. As with the uncertainty in position, a macroscopic observer is not able to distinguish the time of observation within less than a certain time interval $T = \Delta t$. To define what this means, we say that the system is represented by an ensemble with probability $P_{L,T}(n(x,t))$, or more generally $P_{L,T}(n(x,p,t))$. The different microstates that occur during the time interval $T$ are all part of this ensemble. This may appear different than the definition we used for the spatial uncertainty. However, the definitions can be restated in a way that makes them appear equivalent. In this restatement we recognize that the observer performs measurements that are, in effect, averages over various possible microscopic measurements. The average measurements over space and time represent the system (or system ensemble) that is to be described by the observer. This representation will be discussed further in Section 8.3.6. The use of an ensemble is convenient because the observer may only measure one quantity, but we can consider various quantities that can be measured using the same degree of precision. The ensemble represents all possible measurements with this degree of precision. For example, the observer can measure correlations between particle positions that are fixed over time. If we averaged the density $n(x,t)$ over time, these correlations could disappear because of the movement of the whole system. However, if we average over the ensemble, they do not. We define the complexity profile $C(L,T)$ as the amount of information necessary to specify the ensemble $P_{L,T}(n(x,t))$. A description at a finer scale contains all of the information necessary to describe the coarser scale. Thus, $C(L,T)$ is a monotonic decreasing function of its arguments. A direct analysis is discussed in Question 8.3.4. We start, however, by considering the effect on $C(L,T)$ of prediction and the lack of predictability in chaotic dynamics.

Predictability and chaos As discussed earlier, a key ingredient in our understanding of physical systems is that the time evolution of an isolated system (or a system whose interactions with its environment are specified) can be obtained from the simple laws of mechanics starting from a complete microscopic description of the position and momenta of the particles. Thus, if we use a small enough $L$ and $T$, so that each particle can be distinguished, we only need to specify $P_{L,T}(n(x,t))$ over a short period of time (or the simultaneous values of position and momentum) in order to predict the behavior over all subsequent times. The laws of mechanics are also reversible. We describe the past as well as the future from the description of a system at a particular time. This must mean that information is not lost over time. Systems that do not lose information over time are called conservative systems.

However, when we increase the spatial scale of observation, $L$, then the information loss (the complexity reduction) also limits the predictability of a system. We are not guaranteed that by knowing $P_{L,T}(n(x,t))$ at a scale $L$ we can predict the system behavior. This is true even if we are only concerned about predicting the behavior at the scale $L$. We may need additional smaller-scale information to describe the time evolution of the system. This is precisely the origin of the study of chaotic systems discussed in Section 1.1. Chaotic systems take information from smaller scales and bring it to larger scales. Chaotic systems may be contrasted with dissipative systems that take information from larger scales to smaller scales. If we perturb (disturb) a dissipative system, the effect disappears over time. Looking at such a system at a particular time, we cannot tell if it was perturbed at some time far enough in the past. Since the information on a microscopic scale must be conserved, we know that the

information that is lost on the macroscopic scale must be preserved on the microscopic scale. In this sense we can say that information has been transferred from the macroscopic to the microscopic scale. For such systems, we cannot describe the past from present information on a particular length scale.

For complex systems, the flow of information between length scales is bidirectional: even if the total amount of information at a particular scale is preserved, the information may change over time by transfer to or from shorter length scales. Unlike most theories of currents, information currents remain relevant even though they may be equal and opposite. All of the information that affects behavior at a particular length scale, at any time over the duration of the description, should be included in the complexity.

The degree of predictability is manifest when we consider that the complexity of a system $C(L,T)$ at a particular $L$ and $T$ depends also on the duration of the description, that is, on the limits of $t \in [t_1, t_2]$. Like the spatial extent of the system, this temporal extent is part of the system definition. We typically keep these limits constant as we vary $T$ to obtain the complexity profile. However, we can also characterize the dependence of the complexity on the time limits $t_1$, $t_2$ by determining the rate at which information is either gained or lost for a chaotic or stable system.

It is helpful to develop a conceptual image of the flow of information in a system. We begin by considering a conservative, nonchaotic and nondissipative system seen by an observer who is able to distinguish $2^{C(L)/k\ln(2)} = e^{C(L)/k}$ states. $C(L)/k\ln(2)$ is the amount of information necessary to describe the system during a single time interval of length $T$. For a conservative system the amount of information necessary to describe the state at a particular time does not change over time. The dynamics of the system causes the state of the system to change over time among these states. The sequence of states could be described one by one. This would require

$$N_T\,C(L)/k\ln(2) \qquad (8.3.42)$$

bits, where $N_T = (t_2 - t_1)/T$ is the number of time intervals. However, we can also describe the state at a particular time (e.g., the initial conditions) and the dynamics. The amount of information to do this is:

$$\big(C(L) + C_t(L,T)\big)/k\ln(2) \qquad (8.3.43)$$

$C_t(L,T)/k\ln(2)$ is the information needed to describe the dynamics. For a nonchaotic and nondissipative system we can show that this information is quite small. We know from the previous section that the macrostate of the system of complexity $C(L)$ is consistent with a microstate which has the same complexity. The microstate has a dynamics that is simple, since it follows the dynamics of standard physical law. The dynamics of the simple microstate also describes the dynamics of the macrostate, which must therefore also be simple. Therefore Eq. (8.3.43) is smaller than Eq. (8.3.42) and the complexity is $C(L,T) = C(L) + C_t(L,T) \approx C(L)$. This holds for a system following conservative, nonchaotic and nondissipative dynamics.

For a system that is chaotic or dissipative, the picture must be modified to accommodate the flow of information between scales. From the previous paragraph we conclude that all of the interesting (complex) dynamics of a system is provided by information that comes from finer scales. The observer does not see this information before it appears in the state of the system, i.e., in the dynamics. If we allow ourselves to see the finer-scale information we can track the flow of information that the observer does not see. In a conventional chaotic system, the flow of information can be characterized by its Lyapunov exponents. For a system that is described by a single real-valued parameter, $x(t)$, the Lyapunov exponent is defined as an average over:

$$h = \ln\big((x'(t) - x(t))/(x'(t-1) - x(t-1))\big) \qquad (8.3.44)$$

where unprimed and primed coordinates indicate two different trajectories. We can readily see how this affects the information needed by an observer to describe the dynamics. Consider an observer at a particular scale, $L$. The observer sees the system in state $x(t-1)$ at time $t-1$, but he determines $x(t-1)$ only within a bin of width $L$. Using the dynamics of the system that is assumed to be known, the observer can determine the state of the system at the next time. This extrapolation is not precise, so the observer needs additional information to specify the next location. The amount of information needed is the logarithm of the number of bins that one bin expands into during one time step. This is precisely $h/\ln(2)$ bits of information. Thus, the complexity of the dynamics for the observer is given by:

$$C(L,T) = C(L) + C_t(L,T) + N_T\,k\,h \qquad (8.3.45)$$

where we have used the same notation as in Eq. (8.3.43).

A physical system that has many dimensions, like the microscopic ideal gas, will have one Lyapunov exponent for each of $6N$ dimensions of position and momentum. If the dynamics is conservative then the sum over all the Lyapunov exponents is zero.

$$\sum_i h_i = \log\Big(\prod_i \Delta x_i(t)\,\Delta p_i(t) \Big/ \prod_i \Delta x_i(t-T)\,\Delta p_i(t-T)\Big) = 0 \qquad (8.3.46)$$

where $\Delta x_i(t) = x'_i(t) - x_i(t)$ and $\Delta p_i(t) = p'_i(t) - p_i(t)$. This follows directly from conservation of volumes of phase space in conservative dynamics. However, while the sum over all exponents is zero, some of the exponents may be positive and some negative. These correspond to chaotic and dissipative modes of the dynamics. We can imagine the flow of information as consisting of two streams, one going to higher scales and one to lower scales. The complexity of the system is given by:

$$C(L,T) = C(L) + C_t(L,T) + N_T\,k \sum_{i:\,h_i>0} h_i \qquad (8.3.47)$$

As indicated, the sum is only over positive values.

Two cautionary remarks about the application of Lyapunov exponents to complex physical systems are necessary. Unlike many standard models of chaos, a complex system does not have the same number of degrees of freedom at every scale. The number of independent bits of information describing the system above a particular scale is given by the complexity profile, $C(L)$. Thus, the flow of information between scales should be thought of as due to a number of closed loops that extend from a particular lowest scale up to a particular highest scale. As the scale increases, the complexity
decreases. Thus, so does the maximum number of Lyapunov exponents. This means that the sum over Lyapunov exponents is itself a function of scale. More generally, we must also be concerned that $C(L)$ can be time dependent, as it is in many irreversible processes.

The second remark is that over time the cycling of information between scales may bring the same information back more than once. Eq. (8.3.47) does not distinguish this, and therefore may include multiple counting of the same information. We should understand this expression as an upper bound on the complexity.

Time scale dependence Once we have chaotic behavior, we can consider various descriptions of the time dependence of the behavior seen by a particular observer. All of the models we considered in Chapter 1 are applicable. The state of the system may be selected at random from a particular distribution (ensemble) of states at successive time intervals. This is a special case of the more general Markov chain model that is described by a set of transition probabilities. Long-range correlations that are not easily described by a Markov chain may also be important in the dynamics. In order to discuss the complexity profile as a function of $T$, we consider a Markov chain model. From the analysis in Question 8.3.4 we learn that the loss of complexity with time scale occurs as a result of cycles in the dynamics. These cycles need not be deterministic; they may be stochastic: cycles that do not repeat indefinitely but rather can occur one or more times through the probabilistic selection of successive states. Thus, a high complexity for large $T$ arises when there is a large space of states with low chance of repetition in the dynamics. The highest complexity would arise from a deterministic dynamics with cycles that are longer than $T$. This might seem to contradict our previous conclusion, where the deterministic dynamics was found to be simple. However, a complex deterministic dynamics can arise if the successive states are specified by information from a smaller scale.

Question 8.3.4 Consider the information in a Markov chain of $N_T$ states at intervals $T_0$ given by the transition matrix $P(s'|s)$. Assume the complexity of specifying the transition matrix (the complexity of the dynamics), $C_t = C(P(s'|s))$, is itself small. (See Question 8.3.5 for the case of a complex deterministic dynamics.)

a. Show that the more deterministic the chain is, the less information it contains.

b. Show that for an observer at a longer time scale consisting of two time steps ($T = 2T_0$) the information is reduced. Hint: Use convexity of information as described in Question 1.8.8, $f(\langle x\rangle) > \langle f(x)\rangle$, for the function $f(x) = -x\log(x)$.

c. Show that the complexity does not decrease for a system that does not allow 2-cycles.

Solution 8.3.4 When the complexity of the dynamics is small, then the complexity of the Markov chain is given by:

$$C = C(s) + C_t + N_T\,k\ln(2)\,I(s'|s) \qquad (8.3.48)$$
where the terms correspond to the information in the initial state of the system, the information in the dynamics and the incremental information per update needed to specify the next state. The relationship between this and Eq. (8.3.47) should be apparent. This expression does not hold if $C_t$ is large, because if it is larger than $N_T C(s)$, then the chain is more concisely described by specifying each of the states of the system (see Question 8.3.5).

The proof of (a) follows from realizing that the more deterministic the system is, the smaller is $I(s'|s)$. This may be used to define how deterministic the dynamics is.

To analyze the complexity of the Markov chain for an observer at time scale $2T_0$, we need to combine successive system states into an unordered pair, the ensemble of states seen by the observer. We use the notation $\{s', s\}$ for a pair of states. Thus, we are considering a new Markov chain of transitions between unordered pairs. To analyze this we need two probabilities: the probability of a pair and the transition probability from one pair to the next. The latter is the new transition matrix. The probability of a particular pair is:

$$P(\{s_1,s_2\}) = \begin{cases} P(s_1|s_2)P(s_2) + P(s_2|s_1)P(s_1) & s_2 \ne s_1 \\ P(s_1|s_1)P(s_1) & s_2 = s_1 \end{cases} \qquad (8.3.49)$$

where $P(s)$ is the probability of a particular state of the system and the two terms in the upper line correspond to the probability of starting from $s_1$ to make the pair, and starting from $s_2$ to make the pair. The transition matrix for pairs is given by

$$P(\{s'_1,s'_2\}\,|\,\{s_1,s_2\}) = \Big[\big(P(s'_1|s'_2)P(s'_2|s_1) + P(s'_2|s'_1)P(s'_1|s_1)\big)P(s_1|s_2)P(s_2) + \big(P(s'_1|s'_2)P(s'_2|s_2) + P(s'_2|s'_1)P(s'_1|s_2)\big)P(s_2|s_1)P(s_1)\Big] \Big/ P(\{s_1,s_2\}) \qquad (8.3.50)$$

which is valid only for $s_1 \ne s_2$ and for $s'_1 \ne s'_2$. Other cases are treated like Eq. (8.3.49). Eq. (8.3.50) includes all four possible ways of generating the sequence of the two pairs. The normalization is needed because the transition matrix is the probability of $\{s'_1, s'_2\}$ occurring, assuming the pair $\{s_1, s_2\}$ has already occurred.

To show (b) we must prove that the process of combining the states into pairs reduces the information necessary to describe the chain. This is apparent since the observer loses the information about the state order within each pair. To show it from the equations, we note from Eq. (8.3.49) that the probability of a particular pair is larger than or equal to the probability of each of the two possible ordered pairs. Since the probabilities are larger, the information is smaller. Thus the information contained in the first pair is smaller for $T = 2$ than for $T = 1$. We must show the same result for each successive pair. The transition probability can be seen to be an average over two terms in the round parenthesis. By convexity, the information in the average is less than the average information of each term. Each of the terms is a sum over the probabilities of two possible orderings, and is therefore larger than or equal to the probability of either ordering. Thus, the information needed to specify any pair in the chain is smaller than the corresponding information in the chain of states.
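Parts (a) and (b) of the solution can be checked numerically on small chains. The following is a minimal sketch, not part of the original solution: it assumes the chain is in its stationary distribution, stores the transition matrix as `P[s][t]` meaning $P(t|s)$, and compares only the information in a single pair (Eq. (8.3.49)), not the full pair-to-pair transition matrix of Eq. (8.3.50). The function names and the three example matrices are hypothetical choices made for illustration.

```python
import math

def stationary(P, iters=5000):
    """Stationary distribution of a row-stochastic matrix P[s][t] = P(t|s),
    found by repeated application of the chain starting from uniform."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iters):
        pi = [sum(pi[s] * P[s][t] for s in range(n)) for t in range(n)]
    return pi

def entropy_bits(probs):
    """Shannon entropy in bits; zero-probability entries contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0.0)

def info_per_step(P):
    """I(s'|s): average information in bits needed to specify the next state."""
    pi = stationary(P)
    return sum(pi[s] * entropy_bits(P[s]) for s in range(len(P)))

def ordered_pair_entropy(P):
    """Information in two successive states when their order is observed."""
    pi = stationary(P)
    n = len(P)
    return entropy_bits([pi[s] * P[s][t] for s in range(n) for t in range(n)])

def unordered_pair_entropy(P):
    """Information in an unordered pair, using the probabilities of Eq. (8.3.49)."""
    pi = stationary(P)
    n = len(P)
    probs = []
    for s1 in range(n):
        probs.append(pi[s1] * P[s1][s1])  # lower line: s2 == s1
        for s2 in range(s1 + 1, n):
            # upper line: sum over the two possible orderings of the pair
            probs.append(pi[s2] * P[s2][s1] + pi[s1] * P[s1][s2])
    return entropy_bits(probs)

# Hypothetical example chains (assumptions for illustration only).
coin = [[0.5, 0.5], [0.5, 0.5]]            # next state is a fair coin flip
with_2_cycles = [[0.2, 0.8], [0.8, 0.2]]   # more deterministic, allows 0 -> 1 -> 0
one_way = [[0.5, 0.5, 0.0],                # transitions only 0 -> 1 -> 2 -> 0
           [0.0, 0.5, 0.5],
           [0.5, 0.0, 0.5]]

print("I(s'|s):", info_per_step(coin), info_per_step(with_2_cycles))
for name, P in [("with_2_cycles", with_2_cycles), ("one_way", one_way)]:
    print(name, ordered_pair_entropy(P), unordered_pair_entropy(P))
```

In this sketch the more deterministic chain needs less information per step, pairing reduces the information for the chain that allows 2-cycles, and the one-directional chain keeps the same information because the order within each pair can be inferred.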

Finally, to prove (c) we note that the less the order of states is lost when we combine states into pairs, the more complexity is retained. If transitions in the dynamics can only occur in one direction, then we can infer the order and information is not lost. Thus, for $T = 2$ the complexity is retained if the dynamics is not reversible, that is, if there are no 2-cycles. From the equations we see that if only one of $P(s_1|s_2)$ and $P(s_2|s_1)$ can be nonzero, and similarly for $P(s'_1|s'_2)$ and $P(s'_2|s'_1)$, then only one term survives in Eq. (8.3.49) and Eq. (8.3.50) and no averaging is performed. For arbitrary $T$ the complexity is the same as at $T = 1$ if the dynamics does not allow loops of size less than or equal to $T$.

Question 8.3.5 Calculate the maximum information that might in principle be necessary to specify completely a deterministic dynamics of a system whose complexity at any time is $C(L)$. Contrast this with the maximum complexity of describing $N_T$ steps of this system.

Solution 8.3.5 The number of possible states of the system is $2^{C(L)/k\ln(2)}$. Each of these must be assigned a successor by the dynamics. The maximum possible information to specify the dynamics arises if there is no algorithm that can specify the successor, so that each successor must be identified out of all possible states. This would require $2^{C(L)/k\ln(2)}\,C(L)/k\ln(2)$ bits. The maximum complexity of $N_T$ steps is just $N_T C(L)$, as long as this is smaller than the previous result, which is generally a reasonable assumption.

A simple example of chaotic behavior that is relevant to complex systems is that of a mobile system (an animal or human being) where the motion is internally directed. A description of the system behavior, even at a length scale larger than the system itself, must describe this motion. However, the motion is determined by information contained on a smaller length scale just prior to its occurrence. This satisfies the formal requirements for chaotic behavior regardless of the specifics of the motion involved. Stated differently, the large-scale motion would be changed by modifications of the internal state of the system. This is consistent with the sensitivity of chaotic motion to smaller scale changes.

Another example of information transfer between different scales is related to adaptability, which requires that information about the external environment be represented in the organism. This generally involves the transfer of information between a larger scale and a smaller scale, specifically, between observed phenomena and their representation in the synapses of the nervous system.

When we describe a system at a particular moment of time, the complexity of the system at its own scale or larger is zero, or a constant if we include the description of the equilibrium system. However, when we consider the description of a system over
time, then the complexity is larger due to the system motion. Increasing the scale of observation continues to result in a progressive decrease in complexity. At a scale that is larger than the system itself, it is the motion of the system as measured by its location at successive time intervals that is to be described. As the scale becomes larger, smaller scale motions are not observed, and a simpler description of motion is possible. The observer only notes changes in position that are larger than the scale of observation.

A natural question that can be asked in this context is whether the motion of the system is due to external influences or due to the system itself. For example, a particle moving in a fluid may be displaced by the motion of the fluid. This should be considered different from a mobile bacterium. Similarly, a basketball in a game moves through its trajectory not because of its own volition, but rather because of the volition of the players. How do we distinguish this from a system that moves due to its own actions? More generally, we must ask how we must deal with the environmental influences for a system that is not isolated. This question will be dealt with in Section 8.3.6 on behavioral complexity. Before we address this question, in the next section we discuss several aspects of the complexity profile, including the relationship of the complexity of the whole to the complexity of its parts.

8.3.5 Properties of complexity profiles of systems and components

General properties We can readily understand some of the properties that we would expect to find in complexity profiles of systems that are difficult to calculate directly. Fig. 8.3.2 illustrates the complexity profile for a few systems. The paragraphs that follow describe some of their features.

For any system, the complexity at the smallest values of $L$, $T$ is the microscopic complexity: the amount of information necessary to describe a particular microstate. For an equilibrium state this is the same as the thermodynamic entropy, which is the entropy of a system observed on an arbitrarily long time scale. This is not true in general because short-range correlations decrease the microstate complexity, but do not affect the apparent macroscopic entropy. From our discussion of nonergodic systems in Section 8.3.1 we might also conclude that at any scale $L$, $T$ the sum of the complexity $C(L,T)$ and the entropy $S(L,T)$ of the system (the fast degrees of freedom) should add up to the microscopic complexity or macroscopic entropy

$$C(0,0) \approx S(\infty,\infty) \approx C(L,T) + S(L,T) \qquad (8.3.51)$$

We have thus also defined the entropy profile $S(L,T)$ as the amount of information necessary to determine an arbitrary microstate consistent with the observed macrostate. However, this relation is valid only under special circumstances: when the macroscopic state is selected at random from the ensemble of macrostates, and the microstate is selected at random from the possible microstates. A glass may satisfy this requirement; however, other complex systems need not.

For a typical system in equilibrium, as $L$, $T$ is increased the system rapidly becomes homogeneous in space and time. Specifically, the density of the system is
uniform in space and time, aside from unobservable small fluctuations, once the scale of observation is larger than either the correlation length or the correlation time of the system. Indeed, this might be taken to be the definition of the correlation length and time: the scale at which the microscopic information becomes irrelevant to the properties of the system. Beyond the correlation length, the average behavior characteristic of the macroscopic scale is all that remains, and the complexity profile is constant at all length and time scales less than the size of the system.

We can contrast the complexity profile of a thermodynamic system with what we expect from various complex systems. For a glass, the complexity profile is quite different in time and in space. A typical glass is uniform if $L$ is larger than a microscopic correlation length. Thus, the complexity profile of the glass is similar to an equilibrium system as a function of $L$. However, it is different as a function of $T$. The frozen degrees of freedom that make it a nonergodic system at typical time scales of observation guarantee this. At typical values of $T$ the temporal ensemble of the system includes the states that are reached by vibrational modes of the system, but not the atomic rearrangements characteristic of fluid motion. Thus, the atomic vibrations cannot be observed except at microscopic values of $T$. However, a significant part of the microscopic description remains necessary at longer time scales. Correspondingly, a plateau in the complexity profile extends up to characteristic time scales of human observation. At a temperature-dependent and much longer time scale, the complexity profile declines to its thermodynamic limit. This time scale, the relaxation time, is accessible near the glass transition temperature. For lower temperatures it is not. Because the glass is uniform in space, the plateau should be relatively flat and end abruptly. This is because spatial uniformity indicates that the relaxation time is essentially a local property with a narrow distribution. A more extended spatial coupling would give rise to a grading of the plateau and a broadening of the time scale at which the plateau disappears.

Figure 8.3.2 Schematic plots of the complexity profile $C(L,T)$ of four different systems. $C(L,T)$ is the amount of information necessary to describe the system ensemble as a function of the length scale, $L$, and time scale, $T$, of observation. Top panel shows the time scale dependence, bottom panel shows the length scale dependence. (1) An equilibrium system has a complexity profile that is sharply peaked at $T = 0$ and $L = 0$. Once the length or time scale is beyond the correlation length or correlation time respectively, the complexity is just the macroscopic complexity associated with thermodynamic quantities ($U$, $N$, $V$), which vanishes on any reasonable scale. (2) For a glass the complexity profile as a function of time scale $C(0,T)$ decays rapidly at first due to averaging over atomic vibrations; it then reaches a plateau that represents the frozen degrees of freedom. At much longer time scales the complexity profile decays to its thermodynamic limit. Unlike $C(0,T)$, $C(L,0)$ of a glass decays like a thermodynamic system because it is homogeneous in space. (3) A magnet at a second-order phase transition has a complexity profile that follows power-law behavior in both length and time scale. Stochastic fractals capture this kind of behavior. (4) A complex biological organism has a complexity profile that should follow similar behavior to that of a fractal. However it has plateau-like regions that correspond to crossing the scale of internal components, such as molecules and cells.
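The qualitative shapes of the time-scale profiles in panels (1) to (3) can be mimicked with simple functions. The sketch below is purely schematic: the exponential and power-law forms, the function names, and every parameter value (`c_micro`, `c_frozen`, `c_macro`, `tau`, `tau_vib`, `tau_relax`, `exponent`) are assumptions chosen for illustration and are not taken from the text or fitted to any system.

```python
import math

def equilibrium_profile(T, c_micro=1.0, c_macro=0.01, tau=1.0):
    """Assumed form: fast decay from the microscopic complexity to the small
    thermodynamic value once T exceeds the correlation time tau."""
    return c_macro + (c_micro - c_macro) * math.exp(-T / tau)

def glass_profile(T, c_micro=1.0, c_frozen=0.4, c_macro=0.01,
                  tau_vib=1.0, tau_relax=1e6):
    """Assumed form: vibrations average out quickly (tau_vib), leaving a plateau
    of frozen degrees of freedom that only decays at the relaxation time."""
    vibrations = (c_micro - c_frozen) * math.exp(-T / tau_vib)
    frozen = (c_frozen - c_macro) * math.exp(-T / tau_relax)
    return c_macro + frozen + vibrations

def critical_profile(T, c_micro=1.0, t0=1.0, exponent=0.5):
    """Assumed form: power-law decay with no characteristic time scale."""
    return c_micro * (1.0 + T / t0) ** (-exponent)

# Tabulate the three schematic profiles over many decades of time scale.
for T in [0.0, 1.0, 1e3, 1e6, 1e9]:
    print(T, equilibrium_profile(T), glass_profile(T), critical_profile(T))
```

With these assumed values, the glass profile sits on its plateau over the decades between `tau_vib` and `tau_relax`, where the equilibrium profile has already reached its thermodynamic limit, and the two agree again at very long time scales.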

More generally, for a complex system we expect that many parameters will be required to describe its properties at all length and time scales, at least up to some fraction of the spatial and temporal scale of the system itself. Starting from the microscopic complexity, the complexity profile should not be expected to fall smoothly. In biological organisms, we can expect that as we increase the scale of observation, there will be particular length scales at which details will be lost. Plateaus in the profile are related to the existence of well-defined levels of description. For example, an identifiable level of cellular behavior would correspond to a plateau, because over a range of length scales larger than the cell, a full accounting of cellular properties, but not of the internal behavior of the cell, must be given. There are many cells that have a characteristic size and are immobile. However, because different cell populations have different sizes and some cells are mobile, the sharpness of the transition should be smoothed. We can at least qualitatively identify several different plateaus. At the shortest time scale the atomic vibrations will be averaged out to end the first plateau. Larger atomic motions or molecular behavior will be averaged out on a second, larger scale. The internal cellular behavior will then be averaged out. Finally, the internal behavior of tissues and organs will be averaged out on a still longer length and time scale. It is the degrees of freedom that remain relevant on the longest length scale that are key to the complexity of the system. These degrees of freedom manifest the concept of emergent collective behavior. Ultimately, they must be traceable back to the microscopic degrees of freedom. Describing the connection between the microscopic parameters and macroscopically relevant parameters has occupied our attention in much of this book.

Mathematical models that best capture the complexity profile of a complex system are fractals (see Section 1.10). There are two quite distinct kinds of mathematical fractals, deterministic and stochastic fractals. Mathematical fractals with no granularity (no smallest length scale) have infinite complexity. However, if we define a smallest length scale, corresponding to the atomic length scale of a physical system, and we define a longest length scale that is the size of the system, then we can plot the spatial complexity profile of a fractal-like system. The deterministic fractals are specified by an algorithm with only a few parameters, and thus their algorithmic complexity is small. Examples are the Cantor set or the Sierpinski gasket. The algorithm describes how to create finer and finer scale detail. The only difficulty in specifying the fractal is specifying the number of levels to which the algorithm should be iterated. This information (the number of iterations) requires a parameter whose length grows logarithmically with the ratio of the size of the system to the smallest length scale. Thus, a deterministic fractal has a complexity profile that decreases logarithmically with observation length scale $L$, but is very small on all length scales.

Stochastic fractals are qualitatively different. In such fractals, there are random choices made at every scale of the structure. Stochastic fractals can be based upon the Cantor set or Sierpinski gasket, by including random choices in the algorithm. They may also be systems representing the spatial structure of various stochastic processes. Such a system requires information to describe its structure on every length scale. A stochastic fractal is a member of an ensemble, and its algorithmic as well as ensemble complexity will scale as a power law of the scale of observation $L$. As $L$ increases, the
amount of information is reduced, but there is no length scale smaller than the size of the system at which it is completely lost. Time series that have fractal behavior (that have power-law correlations) would also display a power-law dependence of their complexity profile as a function of $T$. The simplest physical model that demonstrates such fractal properties in space and time is an Ising model at its second-order transition point. At this transition there are fluctuations on all spatial and temporal scales that have power-law behavior in both. Observers with larger values of $L$ can see the behavior of the correlations only on the longer length scales. A renormalization treatment, discussed in Section 1.10, can give the value of the complexity profile. These examples illustrate how microscopic information may become irrelevant on larger length scales, while leaving collective information that remains relevant at the longer scales.

The complexity profile enables us to consider again the definition of a complex system. As we stated, it seems intuitive that a complex system is complex on many scales. This strengthens the identification of the fractal model of space and time as a central model for the understanding of complex systems. We have also gained an understanding of the difference between deterministic and stochastic fractal systems. We see that the glass is complex in its temporal behavior, but not in its spatial behavior, and therefore is only a partial example of a complex system.

If we want to identify a unique complexity of a system, there is a natural space and time scale at which to define it. For the spatial scale, $L_s$, we consider a significant fraction of the system: one-tenth of its size. For the temporal scale, $T_s$, we consider the relaxation (autocorrelation) time of the behavior on this same length scale. This is essentially the maximal complexity for this length scale, which would be the same as setting $T = 0$. However, we could also take a natural time scale of $T_s = L_s/v_s$ where $v_s$ is a characteristic velocity of the system. This form makes the increase in time scale for larger length scales (systems) apparent. Leaving out the time scale, since it is dependent on the space scale, we can write the complexity of a system $s$ as

$$C_s = C_s(L_s) = C_s(L_s, L_s/v_s) \approx C_s(L_s, 0) \qquad (8.3.52)$$

In Section 1.10 we discussed generally the scaling of quantities as a function of the precision to which we describe the system. One of the central questions in the field of complex systems is understanding how complexity scales. This scaling is concretized by the complexity profile. One of the objectives is to understand the ultimate limits to complexity. Given a particular length or time scale, we ask what is the maximum possible complexity at that scale. One could say that this complexity is limited by the thermodynamic entropy; however, there are further limitations. These limitations are established by the nature of physical law that establishes the dynamics and interactions of the components. Thus it is unlikely that atoms can be attached to each other in such a way that the behavior of each atom is relevant to the spatiotemporal behavior of an organism at the length and time scale relevant to a human being. The details of behavior must be lost as we observe on longer length and time scales; this results in a loss of complexity. The complexity scaling of complex organisms should follow a line like that given in Fig. 8.3.2. The highest complexity of an organism results
from the retention of the greatest significance of details. This is in contrast to thermodynamic systems, where all of the degrees of freedom average out on a very short length and time scale. At this time we do not know what limits can be placed on the rate of decrease of complexity with scale.

Components and systems As we discussed in Chapter 2, a complex system is formed out of a hierarchy of interdependent subsystems. Thus, relevant to various questions about the complexity profile is an understanding of the complexity that may arise when we bring together complex systems to form a larger complex system. In general it is not clear that bringing together many complex systems must give rise to a collective complex system. This was discussed in Chapter 6, where one example was a flock of animals. Here we can provide additional meaning to this statement using the complexity profile. We will discuss the relationship of the complexity of components to the complexity of the system they are part of. To be definite, we can consider a flock of sheep. The example is chosen to expand our view toward more general application of these ideas. The general statements we make apply to any system formed out of subsystems.

Let us assume that we know the complexity of a sheep, $C_{\text{sheep}}(L_{\text{sheep}})$, the amount of information necessary to describe the relevant behaviors of eating, walking, reproducing, flocking, etc., at a length scale of about one-tenth the size of the sheep. For our current purposes this might be a lot of information contained in a large number of books, or a little information contained in a single paragraph of text. Later, in Section 8.4, we will obtain an estimate of the complexity as, of order, one book or $10^7$ bits.

We now consider a flock of $N$ sheep and construct a description of this flock. We begin by taking information that describes each of the sheep. Combining these descriptions, we have a description of the flock. This information is, however, highly redundant. Much of the information that describes one sheep can also be used to describe other sheep. Of course there are differences in size and in behavior. However, having described one sheep in detail we can describe the differences, or we can describe general characteristics of sheep and then specialize them for each of the individual sheep. Using this strategy, a description of the flock will be shorter than the sum of the lengths of the descriptions of each of the sheep. Still, this is not what we really want. The description of the flock behavior has to be on its own length scale $L_{\text{flock}}$, which is much larger than $L_{\text{sheep}}$. So we shift our observation of behavior to this longer length scale and find that most of the details of the individual sheep behavior have become irrelevant to the description of the flock. We describe the flock behavior in terms of sheep density, grazing activity, migration, reproductive rates, etc. Thus we write that:

$$C_{\text{flock}} = C_{\text{flock}}(L_{\text{flock}}) \ll C_{\text{flock}}(L_{\text{sheep}}) \ll N\,C_{\text{sheep}}(L_{\text{sheep}}) = N\,C_{\text{sheep}} \qquad (8.3.53)$$

where $N$ is the number of sheep in the flock. Among other conclusions, we see that the complexity of a flock may actually be smaller than the complexity of one sheep. More generally, the relationship between the complexity of the collective complex system and the complexity of component systems is crucially dependent on the existence of coherence and correlations in the behavior of the components that can arise either from common origins for the behavior or from interactions between the

components. We first describe this qualitatively by considering the two inequalities in Eq. (8.3.53). The second inequality arises because different sheep have the same behavior. In this case their behavior is coherent. The first inequality arises because we change the scale of observation and so lose the behavior of an individual sheep. There is a trade-off between these two inequalities. If the behaviors of the sheep are independent, then their behavior cannot be observed on the longer scale. Specifically, the movement of one sheep to the right is canceled by another sheep that starts at its right and moves to the left. Thus, only correlated motions of many sheep can be observed on a longer scale. On the other hand, if their behaviors are correlated, then the complexity of describing all of them is much smaller than the sum of the separate complexities. Thus, having a large collective complexity requires a balance between dependence and independence of the behavior of the components.

We can discuss this more quantitatively by considering the example of the nonuniform ideal gas. The loss of information for uncorrelated quantities due to combining them together is described by Eq. (8.3.37). To construct a model where the quantities are correlated, we consider placing the same densities in a region of scale $L_1 > L_0$. This is the same model as the previous one, but now on a length scale of $L_1$. The new value of $\sigma$ is $\sigma_1 = \sigma (L_1/L_0)^d$. This increase of the standard deviation causes an increase in the value of the complexity for all scales greater than $L_1$. However, for $L < L_1$ the complexity is just the complexity at $L_1$, since there is no structure below this scale. A comparative plot is given in Fig. 8.3.3.

Figure 8.3.3 Plot of the complexity of a nonuniform gas (Eq. (8.3.37)), for two cases. The first (1) has a correlation in its nonuniformity at a scale $L_0$ and the second (2) at a scale $L_1 > L_0$. The magnitude of the local deviations in the density are the same in the two cases. The second case has a lower complexity at smaller scales but a higher complexity at the larger scales. Because the complexity decreases rapidly with scale, to show the effects on a linear scale $L_1$ was taken to be only $\sqrt[3]{10}\,L_0$, and the horizontal axis is in units of $L^3$ measured in units of $L_0^3$. Eq. (8.3.39) would give similar results but the complexity would decay still more rapidly. (Vertical axis: C(L); horizontal axis: $L^d$; curves labeled (1) and (2).)

We can come closer to considering the behavior of a collection of animals by considering a model for their motion. We start with a scale $L_0$ just larger than the animal, so that we do not describe its internal structure; we describe only its location at successive intervals of time. The characteristic time over which a sheep moves a distance $L_0$ is $T_0$. We will use a model for sheep motion that can illustrate the effect of coherence of many sheep, as well as the effect of coherent motion of an individual sheep over time. To do this we assume that an individual sheep moves in a straight line for a distance $qL_0$ in a time $qT_0$ before choosing a new direction to move in at random. For simplicity we can assume that the direction chosen is one of the four compass directions, though this is not necessary for the analysis. We will use this model to calculate the complexity profile of an individual sheep. Our treatment only describes the leading behavior of the complexity profile and not various corrections.

For $L = L_0$ and $T = T_0$, the complexity of describing the motion is exactly 2 bits for every q steps to determine which of the four possible directions the sheep will move next. Because the movement is in a straight line, and the changes in direction are at well-defined intervals, we can reconstruct the motion from the measurements of any observer with $L < qL_0$ and $T < qT_0$. Thus the complexity is:

$$C(L,T) = 2N_T/q, \qquad L < qL_0,\ T < qT_0 \tag{8.3.54}$$

Once the scale of observation is greater than $qL_0$, the observer does not see every change in direction. The sheep is moving in a random walk where each step has a length $qL_0$ and takes a time $qT_0$, but the observer does not see each step. The distance traveled is proportional to the square root of the time, and so the sheep moves a distance L once in every $(L/\sigma_0)^2$ steps, where $\sigma_0 = qL_0$ is the standard deviation of the random walk in each dimension. Every time the sheep travels a distance L we need 2 bits to describe its motion, and thus we have a complexity:

$$C(L,T) = 2\frac{N_T}{q}\frac{\sigma_0^2}{L^2} = 2N_T\frac{qL_0^2}{L^2}, \qquad L > qL_0,\ T < qT_0 \tag{8.3.55}$$

We note that at $L = qL_0$ Eq. (8.3.54) and Eq. (8.3.55) are equal.

To obtain the complexity profile for long time scales $T > qT_0$, but short length scales $L < qL_0$, we use a simplified "blob" picture to combine the successive positions of the sheep into an ensemble of positions. For T only a few times $qT_0$ we can expect that the ensemble would enable us to reconstruct the motion, so that the complexity is the same as Eq. (8.3.54). However, eventually the ensemble of positions will overlap and form a blob. At this point the movement of the sheep will be described by the movement of the blob, which itself undergoes a random walk. The standard deviation of this random walk is proportional to the square root of the number of steps:

$$\sigma = \sigma_0\sqrt{T/qT_0}$$

Since this is larger than L, the amount of information is essentially that of selecting a value from a Gaussian distribution of this standard deviation:

$$C(L,T) = 2\frac{N_T}{q}\min\left(1, \frac{qT_0}{T}\left(1+\log\frac{\sigma}{L}\right)\right) = 2\frac{N_T}{q}\min\left(1, \frac{qT_0}{T}\left(1+\log\left(\frac{L_0}{L}\sqrt{\frac{qT}{T_0}}\right)\right)\right), \qquad L < \sigma,\ T > qT_0 \tag{8.3.56}$$

There are a few points to be made about this expression. First, we use the minimum of two values to select the crossover point between the behavior in Eq. (8.3.54) and the blob behavior. As we mentioned above, the blob behavior only occurs for T significantly greater than $qT_0$. The simplest way to identify the crossover point is when the new estimate of the complexity becomes lower than our previous value. The second point is that we have chosen to adjust the constant term added to the logarithm so that when $L = \sigma$ the complexity matches that given by Eq. (8.3.55), which describes the behavior when L becomes large. Thus the limit on Eq. (8.3.55) should be generalized to $L > \sigma$. This minor adjustment enables the complexity to be continuous despite our rough approximations, and does not change any of the conclusions.

We can see from our results (Fig. 8.3.4) how varying q affects the complexity. Increasing q decreases the complexity at the scale of a sheep, $C(L,T) \propto 1/q$ in Eq. (8.3.54). However, it increases the complexity at longer scales, $C(L,T) \propto q$ in Eq. (8.3.55). This is a straightforward consequence of increasing the coherence of the motion over time. We also see that the complexity at long times decays inversely proportional to the time but is relatively insensitive to q. The value of q primarily affects the crossover point to the long time behavior.

We now use two different assumptions to calculate the complexity of the flock. If the movement of all of the sheep is coherent, then the complexity of the flock for length scales greater than the size of the flock is the same as the complexity of a sheep for the same length scales. This is apparent because describing the movement of a single sheep is the same as describing the entire flock. We now see the significance of increasing q. Increasing q increases the flock complexity until $qL_0$ reaches $L_1$, where $L_1$ is the size of the flock. Thus we can increase the complexity of the whole at the cost of reducing the complexity of the components.

If the movements of the sheep are independent of each other, then the flock displacements (the displacements of its center of mass) are of characteristic size $\sigma/\sqrt{N}$ (see Eq. (5.2.21)). We might be concerned that the flock will disperse. However, as in our discussions of polymers in Section 5.2, interactions that would keep the sheep together need not affect the motion of their center of mass. We could also introduce into our model a circular reflecting boundary (a moving pen) around the flock, with its center at the center of mass. Since the motion of the sheep with this boundary does not require additional information over that without it, the complexity is the same. In either case, the complexity of flock motion ($L > L_1$) is obtained as:

$$C(L,T) = 2N_T\frac{qL_0^2}{NL^2}, \qquad L > \sigma \tag{8.3.57}$$
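As a rough numerical illustration of Eqs. (8.3.54) to (8.3.57), the model profile can be evaluated with a short Python sketch. The function names, the argument `n_steps` (standing in for $N_T$), the default units $L_0 = T_0 = 1$, and the use of base-2 logarithms are assumptions made for this example and are not taken from the text.

```python
import math


def sheep_complexity(L, T, q, n_steps, L0=1.0, T0=1.0):
    """Leading-order complexity (bits) of one sheep's motion at scales L, T."""
    sigma0 = q * L0  # standard deviation of one straight-line step
    if T < q * T0:
        if L < q * L0:
            return 2.0 * n_steps / q  # Eq. (8.3.54)
        return 2.0 * (n_steps / q) * (sigma0 / L) ** 2  # Eq. (8.3.55)
    sigma = sigma0 * math.sqrt(T / (q * T0))  # width of the "blob"
    if L >= sigma:
        return 2.0 * (n_steps / q) * (sigma0 / L) ** 2  # Eq. (8.3.55), L > sigma
    # Assumption: the logarithm is taken base 2 so the result is in bits.
    blob = (q * T0 / T) * (1.0 + math.log2(sigma / L))
    return 2.0 * (n_steps / q) * min(1.0, blob)  # Eq. (8.3.56)


def independent_flock_complexity(L, q, n_steps, n_sheep, L0=1.0):
    """Eq. (8.3.57): flock of independently moving sheep, observed at large L."""
    return 2.0 * n_steps * q * L0 ** 2 / (n_sheep * L ** 2)


if __name__ == "__main__":
    for q in (50, 100):
        for T in (1, 500):
            row = [round(sheep_complexity(L, T, q, n_steps=100), 3)
                   for L in (10, 50, 100, 200)]
            print(f"q={q}, T={T}: {row}")
```

With these conventions the two short-time branches agree at $L = qL_0$, and the blob branch joins Eq. (8.3.55) continuously at $L = \sigma$, which is the purpose of the constant added to the logarithm.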

Eq. (8.3.57) is valid for all L if $\sigma$ is less than $L_1$. If we choose T to be very large, Eq. (8.3.56) applies, with $\sigma$ replaced by $\sigma/\sqrt{N}$. We see that when the motions of the sheep are independent, the flock complexity is much lower than before: it decreases inversely with the number of sheep when $L > \sigma$. Even in this case, however, increasing q increases the flock complexity. Thus coherence in the behavior of a single sheep in time, or coherence between different sheep, increases the complexity of the flock. However, the maximum complexity of the flock is just that of an individual sheep, and this arises only for coherent behavior when all movements are visible on the scale of the flock. Any movements of an individual sheep that are smaller than the scale of the flock disappear on the scale of the flock. Thus even for coherent motion, in general the flock complexity is smaller than the complexity of a sheep.

Figure 8.3.4 The complexity profile is plotted for a model of the movement of sheep as part of a flock. Increasing the distance a sheep moves in a straight line (coherence of motion in time), q, decreases the complexity at small length scales and increases the complexity at large length scales. Solid lines and dashed lines show the complexity profile as a function of length scale for a time scale T = 1 and T = 500 respectively. (Vertical axis: C(L); horizontal axis: L; curves for q = 50 and q = 100, each at T = 1 and T = 500.)

This example illustrates the effect of coherent behavior. However, we see that even with coherent motion the complexity of a flock at its scale cannot be larger than the complexity of the sheep at its own scale. This is a problem for us, because our study of complex systems is focused upon systems whose complexity is larger than their components. Without this possibility, there would be no complex systems. To obtain a higher complexity of the whole we must modify this model. We must assume
more generally that the motion of a sheep is describable using a set of patterns of behavior. Coherent motion of sheep still leads to a similar (or lower) complexity. To increase the complexity, the motion of the flock must have more complex patterns of motion. In order to achieve such patterns, the motions of the individual sheep must be neither independent nor coherent: they must be correlated motions that combine patterns of sheep motion into the more complex patterns of flock motion. This is possible only if there are interactions between them, which have not been included here. It should now be clear that the objective of learning how the complexity of a system is related to the complexity of its components is central to our study of complex systems.

Question 8.3.6 Throughout much of this book our working definition of complex systems or complex organisms as articulated in Section 1.3 and developed further in Chapter 2 was that a complex system has a behavior that is dependent on all of its parts. In particular, that it is impossible to take part of a complex organism away without affecting the behavior of the whole and behavior of the part. How is this definition related to the definition of complexity articulated in this section?

Solution 8.3.6 Our quantitative concept of complexity is a measure of the information necessary to describe the system behavior on its own length scale. If the system behavior is complex, then it must require many parameters to describe. These parameters are related to the description of the system on a smaller length scale, where the parts of the system are manifest because we can distinguish the description of one part from another. To do this we limit $P_{L,T}(n(x,t))$ to the domain of the part. The behavior of a system is thus related to the behavior of the parts. The more these are relevant to the system behavior, the greater is the system complexity. The information that describes the system behavior must be relevant on every smaller length scale. Thus, we have a direct relationship between the definition of a complex system in terms of parts and the definition in terms of information. Ultimately, the information necessary to describe the system behavior is determined by the microscopic description of atomic positions and motions. The more complex a system is, the more its behavior depends on smaller scale components.

Question 8.3.7 When we defined interdependence we did not consider the dependence of an animal on air as a relevant example. Explain.

Solution 8.3.7 We can now recognize that the use of information as a characterization of behavior enables us to distinguish various forms of dependency. In particular, we see that the dependence of an animal on air is simple, since the necessary properties of air are simple to describe. Thus, the degree of interdependence of two systems should be measured as the amount of information necessary to replace one in the description of the other.
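The measure proposed in Solution 8.3.7 can be illustrated crudely by letting a general-purpose compressor stand in for the shortest description, which cannot be computed directly. The sketch below is only a proxy under that assumption: `description_length`, `information_to_replace`, and the strings `air` and `organ` are hypothetical names and stand-ins introduced for this example, and zlib output sizes are upper bounds rather than true complexities.

```python
import hashlib
import zlib


def description_length(text: str) -> int:
    """Bytes zlib needs for `text`: a crude upper bound on its description length."""
    return len(zlib.compress(text.encode("utf-8"), 9))


def information_to_replace(part: str, rest: str) -> int:
    """Approximate extra bytes needed to describe `part` once `rest` is described."""
    return description_length(rest + part) - description_length(rest)


# Hypothetical stand-ins: a featureless environment and a detailed component.
air = "uniform gas " * 64
organ = "".join(hashlib.sha256(str(i).encode()).hexdigest() for i in range(8))

if __name__ == "__main__":
    print("air:", description_length(air), "bytes")
    print("organ:", description_length(organ), "bytes")
    print("second copy of organ:", information_to_replace(organ, organ), "bytes")
```

In this toy setting the repetitive "air" string needs far fewer bytes than the detailed "organ" string, and a second copy of an already described part adds very little, which is the same redundancy that makes a flock description shorter than the sum of the sheep descriptions.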

8.3.6 Behavioral complexity

Our ability to describe a system arises from measurements or observations of its behavior. The use of system descriptions to define system complexity does not directly take this into account. The complexity profile brought us closer by acknowledging the observer in the space and time scale of the description. By acknowledging the scale of observation, we obtained a mechanism for distinguishing complex systems from equilibrium systems, and a systematic method for characterizing the complexity of a system. There is another approach to reaching the complexity profile that incorporates the observer and system relationship in a more satisfactory manner. It also enables us to consider directly the interaction of the system with its environment, which was not included previously. To introduce the new approach, we return to the underpinning of descriptive complexity and present the concept of behavioral complexity.

In Shannon's approach to the study of information in communication systems, there were two quantities of fundamental interest. The first was the information content of an individual message, and the second was the average information provided by a particular source. The discussion of algorithmic complexity was based on a consideration of the information provided by a particular message, specifically, how much it could be compressed. This carried over into our discussion of physical systems when we introduced the microscopic complexity of a system as the information contained in a particular microscopic realization of the system. When all messages, or all system states, have the same probability, then the information in the particular message is the same as the average information, and we can write:

$$I(\{x,p\}|(U,N,V)) = -\log P(\{x,p\}) = -\sum_{\{x,p\}} P(\{x,p\})\log(P(\{x,p\})) \tag{8.3.58}$$

The expression on the right, however, has a different purpose. It is a quantity that characterizes the ensemble rather than the individual microstate. It is a characterization of the source rather than of any particular message.

We can pursue this line of reasoning by considering more carefully how we might characterize the source of the information, rather than the messages. One way to characterize the source is to determine the average amount of information in a message. However, if we want to describe the source to someone, the most essential information is to give a description of the kinds of messages that will be received: the ensemble of possible messages. Thus to characterize the source we need a description of the probability of each kind of message. How much information do we need to describe these probabilities? We call this the behavioral complexity of the source.

A few examples in the context of a source of messages will serve to illustrate this concept. Any description of a source must assume a language that is to be used. We assume that the language consists of a list of characters or messages that can be received from the source, along with their probabilities. A delimiter (:) is used to separate the messages from their probability. For convenience, we will write probabilities in decimal notation. A second delimiter (,) is used to separate different members of the list. A source that gives zeros and ones at random with equal probability would be described by {1:0.5,0:0.5}. It is convenient to include the length of a message in our
description of the source. Thus we might describe a source with length N = 1000 character messages, each character zero and one with equal probability, as: {1000(1:0.5,0:0.5)}. The message complexity of this source would be given by N, the length of a message. However, the behavioral complexity is given by (in this language): two decimal digits, two characters (1, 0), the number representing N (requiring log(N) characters) and several delimiters. We could also specify an ASCII language source by a table of this kind that would consist of 256 elements and the probabilities of their occurrence in some database. We see that the behavioral complexity is quite distinct from the complexity of the messages provided by a source. In particular in the above example it can be larger, if N = 1, or it can be much smaller, if N is large.

This definition of the behavioral complexity of a source runs into a minor problem, because the probabilities are real numbers and would generally require arbitrary numbers of digits to describe. To overcome this problem, there must be a convention assumed about the limit of precision that is desired in describing the source. In principle, this precision is related to the number of messages that might be received. This convention could be part of the language, or could be defined by the specification itself. The description of the source can also be compressed using the principles of algorithmic complexity.

As we found above, the behavioral complexity can be much smaller than the information complexity of a particular message: if the source provides many random digits, the complexity of the message is high but the complexity of the source is low because we can characterize it simply as a source of random numbers. However, the complexity of an arbitrary source of messages of a particular length is much larger than the complexity of the messages it sends. If a particular message requires N bits of information, then the number of possible messages is $2^N$. Listing all of the possible messages requires $N 2^N$ bits, and specifying each probability with Q bits would give us a total of $(N + Q)2^N$ bits to describe the source. This could be reduced if the messages are placed in an agreed-upon order; then the number of bits is $Q 2^N$. This is still exponentially larger than the information in a particular message. Thus, if the probability of each message must be independently specified, the behavioral complexity of a source is much larger than the information content of a particular message.

We are interested in the behavioral complexity when our objective is to use the messages that we receive to understand the source, rather than to make use of the information itself. Behavioral complexity becomes particularly useful when it is smaller than the complexity of a message, because it enables us to anticipate or predict the behavior of the source.

We now apply these thoughts about the source as the system of interest, rather than the message as the system of interest, to a discussion of the properties of physical systems. To make the connection between source and system, we consider an observer of a physical system who performs a number of measurements. We might imagine the measurements to consist of subjecting the system to light at various frequencies and measuring their scattering and reflection (looking at the system), observations of animals in the wild or in captivity, or physical probes of the system. We consider each measurement to be a message from the system to the observer. We must,
however, take note that any measurement consists of two parts: the conditions or environment in which the observation was performed, and the behavior of the system under these conditions. We write any observation as a pair (e,a) (Fig. 8.3.5), where e represents the environment and a represents a measurement of system properties (action) under the circumstances of the environment e. The observer, after performing a number of measurements, writes a description of the observations. This description characterizes the system. It captures the properties of the list of measurements, rather than of one particular measurement. It may or may not explicitly contain the information of each measurement. Alternatively, it may assign probabilities to a particular measurement. We would like to define the behavioral complexity as the amount of information contained in the observer's description. However, we must be careful how we do this because of the presence of the environmental description e.

In order to clarify this point, and to make contact between behavioral complexity and our previous discussion of descriptive complexity, we first consider the physical system of interest to be essentially isolated. Then the environmental description is irrelevant, and an observation consists only of the system measurement a. The list of measurements is the set {a}. In this case it is relatively easy to see that the behavioral complexity of a physical system is its descriptive complexity: the set of all measurements characterizes completely the state of the system. A particular message is a measurement of the system properties, which in principle might be detailed enough to determine the instantaneous positions and momenta of all of the particles. If the entire set of measurements is performed at a single instant, and has arbitrary precision, then the behavioral complexity is the microstate complexity of the system. The result of any measurement can be obtained from a description of the microstate, and the set of possible measurements determines the microstate.

For a set of measurements performed over time on an equilibrium system, the behavioral complexity is the ensemble complexity: the number of parameters necessary to specify its ensemble. The list of measurements is determined by the ensemble of states the system might have. As in Section 8.3.1, we conclude that the complexity of an equilibrium system is the complexity of describing its ensemble, specifying (U,N,V) and other parameters like magnetization that result from the breaking of ergodicity. For a glass, the ensemble information is the information in the frozen coordinates previously defined as the complexity. More generally, for a set of measurements performed over an interval of time T (or at one instant but with time determination error T) and with spatial position determination errors given by L, we recover the complexity profile.

We now return to consider a system that is not isolated but subject to an environmental influence so that an observation consists of the pair (e,a). The complexity of describing such messages also contains the complexity of the environment e. Does this mean that our system description must include its environment and that the complexity of the system is dependent on the complexity of the environment? Complex systems or simple systems interact and respond to the environment in which they are found. Since the system response a is dependent on the environment e, there is no doubt that the complexity of a is dependent on the complexity of e. Three
examples illustrate how the environmental influence is important. The tail of a dog has a particular motion that can be described. However, we may want to attribute much of this complexity to the rest of the dog rather than to the tail. Similarly, the motion of a particle suspended in a liquid follows Brownian motion, the description of which might be better attributed to the liquid than to the particle. Clearer yet is the example of the behavior of a basketball during a basketball game. These examples generalize to the consideration of any system, because measuring the properties of a system in an environment may cause us to be measuring the influence of the environment, rather than the system. The point is that the complexity of a system should not include the complexity of the influence upon it, but just the complexity of its response. This response is a property of the system and is determined by a complete microscopic description. Conversely, a full description of behavior subject to all possible environments would require complete microscopic information. However, within a range of environments and with a desired degree of precision (spatial and temporal scale) it is possible to provide less information and still describe the behavior.

Figure 8.3.5 The observation of system behavior involves measurements both of the system's environment, e, and the system's actions, a, in response to this environment. Thus we should characterize a system as a function, a = f(e), where the function f describes its actions in response to its environment. It is generally simpler to describe a model for the system structure, which is also a model of f, rather than a list of all of its environment-action (e,a) pairs. (Diagram labels: System's Environment, e; System; message (action), a; Observer.)

The observer must describe the system behavior as a response to a particular environment, rather than just the behavior itself. Thus, we do not characterize the system by a list of actions {a} but rather by the list of pairs {(e,a)} where our concern is to describe f, the functional mapping a = f(e) from the environment e to the response a. Once we realize this, we can again affirm that a full microscopic description of the physical system is enough to give all system responses, and the complexity can be characterized. We consider the ensemble of messages (measurements) to have possible times of observation over a range of times given by T and errors in position determination L. Describing the ensemble of responses gives us the behavioral complexity profile $C_b(L,T)$. When the influence of the environment is not important, $C(L,T)$ and $C_b(L,T)$ are the same. When the environment matters, it is also important to characterize the information that is relevant about the environment. This is related to the problem of

because predicting the system behavior in the future requires information about the environment. in principle. We note that because the system affects the environment.T) Cb (L. This also implies that the behavior of physical systems under different environments cannot be independent. or input/output relationships to describe artificial systems.47) to include a term that describes the rate of information transfer from the environment to the system: C(L. Behavioral complexity suggests that we should consider the system behavior as represented by a function a = f (e).3.59)
where Ce(L)/k ln(2) is the information about the environment necessary to predict the state of the system at the next time step. This use of behavior/response rather than a description to characterize a system is related to the use of response functions in physics. There is a difficulty with this approach in that the complexity of functions is generically much larger than that of the system itself.”
Q
.3.T )+ N T k
i:hi >0
∑hi
(8.8).2. Because the system itself is finite.3. From the discussion in Section 8. We note that these conclusions must also apply to human beings as complex systems that respond to their environment (see Question 8. The input to the function is a description of the environment. Because the environmental influence leads to an exponentially large complexity.the amount of information about the universe that is relevant to the system behavior in any interval of time must also be finite.T) = Cb (L.756
Hu ma n Ci vi liza t io n I
prediction. (8.47).8 Discuss the following statements with respect to human beings as complex systems: “The most compact description of the system behavior will give its structure rather than its response to all inputs. which then affects the system. (8. the output is the response or action.3.the descriptive complexity is the information necessary to predict the behavior of the system over the time interval t2 − t1.59) as written may count information more than once. Thus.3 we know that the description of a function would require an amount of information given by Cf = Ca 2Ce. and Cb(L) is the behavioral complexity at one time interval. it is clear that often the most compact description of the system behavior will give its structure rather than its response to all inputs. It is more directly relevant to the system behavior in response to environmental influences. We noted this point also with respect to the Lyaponov exponents after Eq. and thus is essential for direct comparison with experimental results.this expression as written is an upper bound on the complexity.(8. The response function can (in principle) be completely derived from the microscopic description of a system.T ) = Cb (L) +C t (L. uestion 8. where Ce is the environmental complexity. Eq. and Ca is the complexity of the action. the response can be derived from the structure. We can characterize the environmental influence by generalizing Eq. Then. As we have defined it.” and “This implies that the behavior of physical systems under different environments cannot be independent.3.T)+ N T C e (L.3.

Solution 8.3.8 The first statement is relevant to the discussion of behaviorism as an approach to psychology (see Section 3.2.8). It says that the idea of describing human behavior by cataloging reactions to environmental stimuli is ultimately an inefficient approach. It is more effective to use such measurements to construct a model for the internal functioning of the individual and use this model to describe the measured responses. The model description is much more concise than the description of all possible responses. Moreover, from the second statement we know that the model can describe the responses to circumstances that have not been measured. This also means that the use of such models may be effective in predicting the behavior of an individual. Specifically, that reactions of a human being are not independent of past reactions to other circumstances. A model that incorporates the previous behaviors may have some ability to predict the behavior to new circumstances. This is part of what we do when we interact with other individuals: we construct models that represent their behavior and then anticipate how they will react to new circumstances.

The coupling between the reaction of a human being under one circumstance to the reaction under a different circumstance is also relevant to our understanding of human limitations. Optimizing the response through adaptation to a set of environments according to some goal is a process that is limited in its effectiveness due to the coupling between responses to different circumstances. An individual who is effective in some circumstances may have qualities that lead to ineffective behavior under other circumstances. We will discuss this in Chapter 9 in the context of considering the specialization of human beings in society. This point is also applicable more generally to living organisms and their ability to consume resources and avoid predators as discussed in Chapter 6. Increasing complexity enables an organism to be more effective, but the effectiveness under a variety of circumstances is limited by the interdependence of responses. This is relevant to the observation that living organisms generally consume limited types of resources and live in particular ecological niches.

8.3.7 The observer and recognition

The explicit existence of an observer in the definition of behavioral complexity enables us to further consider the role of the observer in the definition of complexity. What assumptions have we made about the properties of the observer? One of the assumptions that we have made is that the observer is more complex than the system. What happens if the complexity of the system is greater than the complexity of the observer? The complexity of an observer is the number of bits that may be used to describe the observer. If the observer is described by fewer bits than are needed to describe the system, then the observer will be unable to contain the description of the system that is being observed. In this case, the observer will construct a description of the system that is simpler than the system actually is. There are several possible ways that the observer may simplify the description of the system. One is to reject the

observation of all but a few kinds of messages. The other is to artificially limit the length of messages described. A third is to treat complex variability of the source as random—described by simple probabilities. These simplifications are often done in our modeling of physical systems.

An inherent problem in discussing behavioral complexity using environmental influence is that it is never possible to guarantee that the behavior of a system has been fully characterized. For example, a rock can be described as “just sitting there,” if we want to describe the complexity of its motion under different environments. Of course the nature of the environment could be changed so that other behaviors will be realized. We may, for example, discover that the rock is actually a camouflaged animal. This is an inherent problem in behavioral complexity: it is never possible to characterize with certainty the complexity of a system under circumstances that have not been measured. All such conclusions are extrapolations. Performing such extrapolations is an essential part of the use of the description of a system. This is a general problem that applies to quantitative scientific modeling as well as the use of experience in general.

Finally, we describe the relevance of recognition to complexity. The first comment is related to the recognition of sets of numbers introduced briefly in Section 8.2.3. We introduced there the concept of recognition complexity of a set that relies upon a recognizer (a special kind of TM called a predicate that gives a single bit output) that can identify the system under discussion. Specifically, when presented with the system it says, “This is it,” and when presented with any other system it says, “This is not it.” We define the complexity of a system (or set of systems) as the complexity of the simplest recognizer of the system (or set of systems). There are some interesting features of this definition. First we realize that this definition is well suited to describing classes of systems. A description or model of a class of systems must identify common attributes rather than specific behaviors. A second interesting feature is that the complexity of the recognizer depends on the possible universe of systems that it can be presented with. For example, the complexity of recognizing cows depends on whether we allow ourselves to present the recognizer with all domestic animals, all known biological organisms on earth, all potentially viable biological organisms, or all possible systems. Naturally, this is an important issue in the field of pattern recognition, where the complexity of designing a system to recognize a particular pattern is strongly dependent on the universe of possibilities within which the pattern must be recognized. We will return to this point later when we consider the properties of human language in Section 8.4.1.

A different form of complexity related to recognition may be abstracted from the Turing test of artificial intelligence. This test suggests that we will achieve an artificial representation of intelligence when it becomes impossible to determine whether we are interacting with an artificial or actual human being. We can assume that Turing had in mind only a limited type of interaction between the observer “we” and the systems being observed—either the real or artificial representation of a human being. This test, which relies upon an observer to recognize the system, can serve as the basis for an additional definition of complexity. We determine the minimal possible

complexity of a model (simulated representation) of the system which would be recognized by a particular observer under particular circumstances as the system. The complexity of this model we call the substitution complexity. The sensitivity of this definition to the nature of the observer and the conditions of the observation is manifest. In some ways this definition, however, is implicit in all of our earlier definitions. In all cases, the complexity measures the length of a representation of the system. Ultimately we must determine whether a particular representation of the system is faithful. The “we” in the previous sentence is some observer that must recognize the system behavior in the constructed representation.

We conclude this section by reviewing some of the main concepts that were introduced. We noted the sensitivity of complexity to the spatial and temporal scale relevant to the description or response. The complexity profile formally takes this into account. If necessary, we can define the unique complexity of a system to be its complexity profile evaluated at its own scale. A more complete characterization of the system uses the entire complexity profile. We found that the mathematical models most closely associated with complexity—chaos and fractals—were both relevant. The former described the influence of microscopic information over time. The latter described the gradual rather than rapid loss of information with spatial and temporal scale. We also reconciled the notion of information as a measure of system complexity with the notion of complex systems as composed out of interdependent parts.

Our next objective is to concretize this discussion further by estimating the complexity of particular systems.

8.4 Complexity Estimation

There are various difficulties associated with obtaining specific values for the complexity of a particular system. There are both fundamental and practical problems. Fundamental problems such as the difficulty in determining whether a representation is maximally compressed are important. However, before this is an issue we must first obtain a representation.

One approach to obtaining the complexity of a system is to construct a representation. The explicit representation should then be used to make a simulation to show that the system behavior is reproduced. If it is, then we know that the length of the representation is an upper bound on the complexity of the system. We can hope, however, that it will not be necessary to obtain explicit representations in order to estimate complexities. The objective of this section is to discuss various methods for estimating the complexity of systems with which we are familiar. These approaches make use of representations that we cannot simulate; however, they do have recognizable relationships to the system.

Measuring complexity is an experimental problem. The only reason that we are able to discuss the complexity of various systems is that we have already made many measurements of the properties of various systems. We can make use of the existing information to construct estimates of their complexity. A specific estimation method is not necessarily useful for all systems.
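The remark that the length of a faithful representation is an upper bound on complexity can be illustrated with a general-purpose compressor. The following is only a minimal sketch: the function name `upper_bound_bits` and the two sample inputs are hypothetical, and zlib stands in for "some representation we happen to be able to construct", not for a maximally compressed one.

```python
import os
import zlib


def upper_bound_bits(data: bytes) -> int:
    """Return the size in bits of a zlib-compressed copy of data.

    Any lossless encoding that decodes back to the original is a valid
    representation, so its length is an upper bound on the complexity of
    the data (up to the fixed size of the decoder). It is not the
    complexity itself, because a shorter representation may exist.
    """
    return 8 * len(zlib.compress(data, 9))


# Hypothetical sample inputs, chosen only for illustration.
regular = b"frog " * 2000        # highly repetitive, 10,000 bytes
irregular = os.urandom(10_000)   # no exploitable regularity, 10,000 bytes

for name, data in (("regular", regular), ("irregular", irregular)):
    # The representation is faithful only if decoding reproduces the input.
    assert zlib.decompress(zlib.compress(data, 9)) == data
    print(f"{name}: raw {8 * len(data)} bits, "
          f"upper bound {upper_bound_bits(data)} bits")
```

The repetitive input yields a bound far below its raw length, while the irregular one does not compress at all. In neither case does the sketch show that the bound is tight, which is the fundamental problem of deciding whether a representation is maximally compressed.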

Our objective in this section is limited to obtaining “ballpark” estimates of the complexity of systems. This means that our errors will be in the exponent rather than in the number itself. We would be very happy to have an estimate of complexity such as 10^(3±1) or 10^(7±2). When appropriate, we keep track of half-decades using factors of three, such as in 3 × 10^4. These rough estimates will give us a first impression of the degree of complexity of many of the systems we would like to understand. It would tell us how difficult (very roughly) they are to describe.

While we will discuss the complexity of various systems, our focus will be on determining the complexity of a human being. We start, however, by noting that the complexity of a human being can be bounded by the physical entropy of the collection of atoms from which he or she is formed. This is roughly the entropy of a similar weight of water, about 10^31 bits. This is the value of S/(k ln 2). As usual, we have assumed that there is nothing associated with a human being except the material of which he or she is formed, and that this material is described by known physical law. This entropy is an upper bound to the information necessary to specify the complete human being. The meaning of this number is that if we take away the person and we replace all of the atoms according to a specification of 10^31 bits of information, we have replaced microscopically each atom where it was. According to our understanding of physical law, there can be no discernible difference. We will discuss the implications for artificial intelligence in Section 8.4.4, where we consider whether a computer could simulate the dynamics of atoms in order to simulate the behavior of the human being.

The entropy of a human being is much larger than the complexity estimate we are after, because we are interested in the complexity at a relevant spatial and temporal scale. In general we consider the complexity of a system at the natural scale defined in Section 8.3.5, one-tenth the size of the system itself, and the relaxation time of the behavior on this same length scale. We could also define the complexity by the observer. For example, the maximum visual sensitivity of a human being is about 1/100 of a second and 0.1 mm. For either case, observing only at this spatial and temporal scale decreases dramatically the relevance of the microscopic description. The reduction in information is hard to estimate directly. However, since most of the information in the entropy is needed to describe the position of molecules of water undergoing vibrations, we can guess that the complexity is significantly smaller than the entropy.

To estimate the relevant complexity, we must consider other techniques. We will discuss three methods—(1) use of intuition and human language descriptions, (2) use of a natural representation tied to the system existence, where the principal example is the genome of living organisms, and (3) use of component counting. Each of these methods has flaws that will limit our confidence in the resulting estimates. However, since we are trying to find rough estimates, we can still take advantage of them. Consistency of different methods will give us some confidence in our estimates of complexity. Our final estimate, 10^(10±2) bits, will be obtained by combining the results of different estimation techniques in the following sections. The implications of obtaining an estimate of human complexity will be discussed in Section 8.4.4.
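The half-decade convention used for these ballpark estimates (values quoted as 10^n or 3 × 10^n) amounts to rounding on a logarithmic scale. The sketch below shows one way to do that rounding; the helper name `half_decade` and the sample inputs are assumptions made for illustration and do not come from the text.

```python
import math


def half_decade(value: float) -> str:
    """Round a positive value to the nearest half-decade on a log scale.

    Half-decades are written as 10^n or 3 x 10^n, because 3 is close to
    sqrt(10) = 3.16..., the midpoint of a decade on a logarithmic scale.
    """
    if value <= 0:
        raise ValueError("value must be positive")
    steps = round(2 * math.log10(value))   # whole number of half-decades
    exponent, half = divmod(steps, 2)
    return f"3 x 10^{exponent}" if half else f"10^{exponent}"


# Hypothetical inputs: 3000 characters per page at 1 bit per character,
# for 10 pages and for 30 pages.
print(half_decade(10 * 3000))   # 3 x 10^4
print(half_decade(30 * 3000))   # 10^5
```

Rounding this coarsely is consistent with the stated goal that errors sit in the exponent: two estimates that land in the same half-decade are treated as equal.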

8.4.1 Human intuition—language and complexity

The first method for estimation of complexity—the use of human intuition and language—is the least controlled/scientific method of obtaining an estimate of the complexity of a system. This approach, in its most basic form, is precisely what was asked in Question 8.2.1. We ask someone what they believe the complexity of the system is. It is assumed that the person we ask is somewhat knowledgeable about the system and also about the problem of describing systems. Even though it appears highly arbitrary, we should not dismiss this approach too readily because human beings are designed to understand complex systems. It could be argued that much of our development is directed toward enabling us to construct predictive models of various parts of the environment in which we live. The complexity of a system is directly related to the amount of study we need in order to master or predict the behavior of a system. It is not accidental that this is the fundamental objective of science—behavior prediction.

We are quite used to using the word “complexity” in a qualitative manner and even in a comparative fashion—this is more complex or less complex than something else. What is missing is the quantitative definition. In order for someone to give a quantitative estimate of the complexity of a system, it is necessary to provide a definition of complexity that can be readily understood. One useful and intuitive definition of complexity is the amount of information necessary to describe the behavior of a system. The information can be quantified in terms of representations people are familiar with—the amount of text/the number of pages/the number of books. This can be sufficient to cause a person to build a rough mental model of the system description, which is much more sophisticated than many explicit representations that might be constructed.

There is an inherent limitation in this approach mentioned more generally above—a human being cannot directly estimate the complexity of an organism of similar or greater complexity than a human being. In particular, we cannot use this approach directly to estimate the complexity of human beings. Thus we will focus on simpler animals first. For example, we could ask the question in the following way: How much text is necessary to describe the behavior of a frog? We might emphasize for clarification that we are not interested in comparative frogology, or molecular frogology. We are just interested in a description of the behavior of a frog.

To gain additional confidence in this approach, we may go to the library and find descriptions that are provided in books. Superficially, we find that there are entire books devoted to a particular type of insect (mosquito, ant, butterfly), as there are books devoted to the tiger or the ape. However, there is a qualitative difference between these books. The books on insects are devoted to comparative descriptions, where various types of, e.g., mosquitoes, from around the world, their physiology, and/or their evolutionary history are described. Tens to hundreds of types are compared in a single book. Exceptional behaviors or examples are highlighted. The amount of text devoted to the behavior of a particular type of mosquito could be readily contained in less than a single chapter. On the other hand, a book devoted to tigers may describe only behavior (e.g., not physiology), and one devoted to apes

would describe only a particular individual in a manner that is limited to only part of its behaviors.

Does the difference in texts describing insects and tigers reflect the social priorities of human beings? This appears to be difficult to support. The mosquito is much more relevant to the well-being of human beings than the tiger. Mosquitoes are easier to study in captivity and are more readily available in the wild. There are films that enable us to observe the mosquito behavior at its own scale rather than at our usual larger scale. Despite such films, however, there is no book-length description of the behavior of a mosquito. This is true despite the importance of knowledge of its behavior to prevention of various diseases.

Even if there is some degree of subjectivity to the complexity estimates obtained from the lengths of descriptions found in books, the use of existing books is a reasonable first attempt to obtain complexity estimates from the information that has been compiled by human beings. We can also argue that when there is greater experience with complexity and complexity estimation, our ability to use intuition or existing texts will improve and become important tools in complexity estimation.

Before applying this methodology, we should understand more carefully the basic relationship of language to complexity. We have already discussed in Section 1.8 the information in a string of English characters. A first estimate of 4.8 bits per character could be based upon the existence of 26 letters and 1 space. In Question 1.8.12, the best estimate obtained was 3.3 bits per character using a Markov chain model that included correlations between adjacent characters. To obtain an even better estimate, we need to have a model that includes longer-range correlations between characters. The most reliable estimates have been obtained by asking people to guess the next character in an English text. It is assumed that people have a highly sophisticated model for the structure of English and that the individual has no specific knowledge of the text. The guesses were used to establish bounds on the information content. We can summarize these bounds as 0.9±0.3 bits/character. For our present discussion, the difference between high and low bounds (a factor of 2) is not significant. For convenience we will use 1 bit/character for our conversion factor. For larger quantities of text, this corresponds to values given in Table 8.4.1.

Amount of text           Information in text    Text with figures
1 char                   1 bit
1 page = 3000 char       3 × 10^3 bit           10^4
1 chapter = 30 pages     10^5 bit               3 × 10^5
1 book = 10 chapters     10^6 bit               3 × 10^6

Table 8.4.1 Information estimates for straight English text and illustrated text.

Our estimate of information in text has assumed a strictly narrative English text. We should also be concerned about figures that accompany descriptive materials. Does the conventional wisdom of “a picture is worth a thousand words” make sense? We can consider this both from the point of view of direct compression of the picture, and the

possibility of replacing the figure by descriptive text. A thousand words corresponds to 5 × 10^3 characters or bits, about two pages of text. Descriptive figures such as graphs or diagrams often consist of a few lines that can be concisely described using a formula and would have a smaller complexity. Photographs are formed of highly correlated graphical information that can be compressed. In a black and white photograph 5 × 10^3 bits would correspond to a 70 × 70 grid of completely independent pixels. If we recall that we are not interested in small details, this seems reasonable as an upper bound. Moreover, the text that accompanies a figure generally describes its essential content. Thus when we ask the key question—whether two pages of text would be sufficient to describe a typical figure and replace its function in the text—this seems a somewhat generous but not entirely unreasonable value. A figure typically occupies half of a page that would be otherwise occupied by text. Thus, for a highly illustrated book, on average containing one figure and one-half page of text on each page, our estimate of the information content of the book would increase from 10^6 bits by a factor of 2.5 to roughly 3 × 10^6 bits. If there is one picture on every two pages, the information content of the book would be doubled rather than tripled. While it is not really essential for our level of precision, it seems reasonable to adopt the convention that estimates using descriptions of behavioral complexity include figures. We will do so by increasing the previous values by a factor of 3 (Table 8.4.1). This will not change any of the conclusions.

There is another aspect of the relationship of language to complexity. A language uses individual words (like “frog”) to represent complex phenomena or systems (like the physical system we call a frog). The complexity of the word “frog” is not the same as the complexity of the frog. Why is this possible? According to our discussion of algorithmic complexity, the smallest possible representation of a complex system has a length in bits which is equal to the system complexity. Here we have an example of a system—frog—whose representation “frog” is manifestly smaller than its complexity.

The resolution of this puzzle is through the concept of recognition complexity discussed in Section 8.3.7. A word is a member of an ensemble of words, and the systems that are described by these words are an ensemble of systems. It is only necessary that the ensemble of words be matched to the ensemble of systems described by the words, not the whole ensemble of possible systems. Thus, the complexity of a word is not related to the complexity of the system, but rather to the complexity of specifying the system—the logarithm of the number of systems that are part of the shared experience of the individuals who are communicating. This is the central point of recognition complexity. For a human being with experience and memory of only a limited number of the set of all complex systems, to describe a system one must identify it only in comparison with the systems in memory, not with those possible in principle. Another way to think about this is to consider a human being as analogous to a special UTM with a set of short representations that the UTM can expand to a specific limited subset of possible long descriptions. For example, having memorized a play by Shakespeare, it is only necessary to invoke the name to retrieve the whole play. This is, indeed, the essence of naming—a name is a short reference to a complex system. All words are names of more complex entities.
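The text-length conversions used earlier in this subsection (1 bit per character, 3000 characters per page, a figure valued at about two pages of text) can be checked with a few lines of arithmetic. This is only a sketch of that bookkeeping; the constant and variable names are assumptions introduced here, and the numbers are the conventions quoted in the text.

```python
import math

# Conventions quoted in the text.
BITS_PER_CHAR = 1          # rounded from about 0.9 bits/character
CHARS_PER_PAGE = 3000
PAGES_PER_CHAPTER = 30
CHAPTERS_PER_BOOK = 10
CHARS_PER_WORD = 5         # implied by "a thousand words = 5 x 10^3 characters"
FIGURE_IN_PAGES = 2.0      # a figure is valued at about two pages of text

# Naive per-character bound from 26 letters plus one space.
naive_bits_per_char = math.log2(27)

page_bits = CHARS_PER_PAGE * BITS_PER_CHAR
chapter_bits = page_bits * PAGES_PER_CHAPTER
book_bits = chapter_bits * CHAPTERS_PER_BOOK
figure_bits = 1000 * CHARS_PER_WORD * BITS_PER_CHAR

# Highly illustrated book: each page holds half a page of text and one figure.
dense_factor = 0.5 + FIGURE_IN_PAGES
# One figure every two pages: 1.5 pages of text plus one figure per two pages.
sparse_factor = (1.5 + FIGURE_IN_PAGES) / 2

print(f"naive bound: {naive_bits_per_char:.2f} bits/char")
print(f"page {page_bits}, chapter {chapter_bits}, book {book_bits} bits")
print(f"figure: {figure_bits} bits")
print(f"illustrated factors: {dense_factor} (dense), {sparse_factor} (sparse)")
```

Under these conventions the plain-text book comes to 9 × 10^5 bits, which the table rounds to 10^6, and the two illustration densities give factors of 2.5 and 1.75, matching the "tripled" and "doubled" figures at half-decade precision.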

In this way, language provides a systematic mechanism for compression of information. This implies that we should not use the length of a word to estimate the complexity of a system that it refers to. Does this also invalidate the use of human language to obtain complexity estimates? On one hand, when we are asked to describe the behavior of a frog, we assume that we must describe it without reference to the name itself. “It behaves like a frog” is not a sufficient description. There is a presumption that a description of behavior is made to someone without specific knowledge. An estimate of the complexity of a frog would be much higher than the complexity of the word “frog.” On the other hand, the words that would be used to describe a frog also refer to complex entities or actions. Consistency in different estimates of the amount of text necessary to describe a frog might arise from the use of a common language and experience. We could expand the description further by requiring that a person explain not only the behavior of the frog, but also the meaning of each of the words used to describe the behavior of the frog. At this point, however, it is more constructive to keep in mind the subtle relationship between language and complexity as part of our uncertainty, and take the given estimates at face value. Ultimately, the complexity of a system is defined by the condition that all possible (in principle) behaviors of the same complexity could be described using the same length of text. We accept the possibility that language-based estimates of complexity of biological organisms may be systematically too small because they are common and familiar. We may nevertheless have relative complexities estimated correctly. Finally, we can argue that when we estimate the complexity of systems that approach the complexity of a human being, the estimation problem becomes less severe. This follows because of our discussion of universality of complexity given in Section 8.2.2. Specifically, that the more complex a system is, the less relevant specific knowledge is, and the more universal are estimates of complexity. Nevertheless, ultimately we will conclude that the inherent compression in use of language for describing familiar complex systems is the greatest contributor to uncertainty in complexity estimates.

There is another approach to the use of human intuition and language in estimating complexity. This is by reference to computer languages. For someone familiar with computer simulation, we can ask for the length of the computer program that can simulate the behavior of the system—more specifically, the length of the program that can simulate a frog. Computer languages are generally not very high in information content, because there are a few commands and variables that are used throughout the program. Thus we might estimate the complexity of a program not by characters, but by program lines at several bits per program line. Consistent with the definition of algorithmic complexity, the estimate of system complexity should also include the complexity of the compiler and of the computer operating system and hardware. Compilers and operating systems are much more complex than many programs by themselves. We can bypass this problem by considering instead the size of the execution module—after application of the compiler.

There are other problems with the use of natural or artificial language descriptions, including:

1. Overestimation due to a lack of knowledge of possible representations. This problem is related to the difficulty of determining the compressibility of information. The assumption of a particular length of text presumes a kind of representation. This choice of representation may not be the most compact. This may be due to the form of the representation—specifically English text. Alternatively, the assumption may be in the conceptual (semantic) framework. An example is the complexity of the motion of the planets in the Ptolemaic (earth-centered) representation compared to the Copernican (sun-centered) representation. Ptolemy would give a larger complexity estimate than Copernicus because the Ptolemaic system requires a much longer description—which is the reason the Copernican system is accepted as “true” today.

2. Underestimation due to lack of knowledge of the full behavior of the system. If an individual is familiar with the behavior of a system only under limited circumstances, the presumption that this limited knowledge is complete will lead to a complexity estimate that is too low. Alternatively, lack of knowledge may also result in too high estimates if the individual extrapolates the missing knowledge from more complex systems.

3. Difficulty with counting. Large numbers are generally difficult for people to imagine or estimate. This is the advantage of identifying numbers with length of text, which is generally a more familiar quantity.

With all of these limitations in mind, what are some of the estimates that we have obtained? Table 8.4.2 was constructed using various books. The lengths of linguistic descriptions of the behavior of biological organisms range from several pages to several books. Insects and fish are at pages, frogs at a chapter, most mammals at approximately a book, and monkeys and apes at several books. These numbers span the range of complexity estimates.

Animal                   Text length                  Complexity (bits)
Fish                     a few pages                  3 × 10^4
Grasshopper, Mosquito    a few pages to a chapter     10^5
Ant (one, not colony)    a few pages to a chapter     10^5
Frog                     a chapter or two             3 × 10^5
Rabbit                   a short book                 10^6
Tiger                    a book                       3 × 10^6
Ape                      a few books                  10^7

Table 8.4.2 Estimates of the approximate length of text descriptions of animal behavior

We have concluded that it is not possible to use this approach to obtain an estimate of human complexity. However, this is not quite true. We can apply this method by taking the highest complexity estimate of other systems and using this as a close lower bound to the complexity of the human being. By close lower bound we mean that the actual complexity should not be tremendously greater. According to our

experience, the complexity estimates of animals tend to extend up to roughly a single book. Primates may be estimated somewhat higher, with a range of one to tens of books. This suggests that human complexity is somewhat larger than this latter number—approximately 10^8 bits, or about 30 books. We will see how this compares to other estimates in the following sections.

There are several other approaches to estimating human complexity based upon language. The existence of book-length biographies implies a poor estimate of human complexity of 10^6 bits. We can also estimate the complexity of a human being by the typical amount of information that a person can learn. Specifically, it seems to make sense to base an estimate on the length of a college education, which uses approximately 30 textbooks. This is in direct agreement with the previous estimate of 10^8 bits. It might be argued that this estimate is too low because we have not included other parts of the education (elementary and high school and postgraduate education) or other kinds of education/information that are not academic. It might also be argued that this is too high because students do not actually know the entire content of 30 textbooks. One reason this number appears reasonable is that if the complexity of a human being were much greater than this, there would be individuals who would endure tens or hundreds of college educations in different subjects. The estimate of roughly 30 textbooks is also consistent with the general upper limit on the number of books an individual can write in a lifetime. The most prolific author in modern times is Isaac Asimov, with about 500 books. Thus from such text-based self-consistent evidence we might assume that the estimate of 10^8 bits is not wrong by more than one to two orders of magnitude. We now turn to estimation methods that are not based on text.

8.4.2 Genetic code

Biological organisms present us with a convenient and explicit representation for their formation by development—the genome. It is generally assumed that most of the information needed to describe the physiology of the organism is contained in genetic information. For simplicity we might think of DNA as a kind of program that is interpreted by decoding machinery during development and operation. In this regard the genome is much like a Turing machine tape (see Section 1.9), even though the mechanism for transcription is quite different from the conventional Turing machine. Some other perspectives are given in Section 7.1. Regardless of how we ultimately view the developmental process and cellular function, it appears natural to associate with the genome the information that is necessary to specify physiological design and function. It is not difficult to determine an upper bound to the amount of information that is contained in a DNA sequence. Taken at face value, this provides us with an estimate of the complexity of an organism. We must then inquire as to the approximations that are being made. We first discuss the approach in somewhat greater detail.

Considering the DNA as an alphabet of four characters provided by the four nucleotides or bases represented by A (adenine), T (thymine), C (cytosine), G (guanine), a first estimate of the information contained in a DNA sequence would be N log2(4) = 2N bits. N is the length of the DNA chain. Since DNA is formed of two com-

3 Estimates of complexity based upon genome length. A significant percentage of DNA is “non-coding. but it does suggest some of the difficulties in determining the information content even when there is a clear first numerical value to start from.a n d
.” This DNA is not transcribed for protein structures.4. Specifically. Since 30%–50% of human DNA is estimated to be coding. Aside from the increasing trend from bacteria to fungi to animals/plants.this is essentially as good an estimate as we can obtain from this methodology at present. The coding for each amino acid is given by a triple of bases. which is somewhat larger than that obtained from language-based estimates in the previous section. Since there are many more triples (43 = 64) than amino acids (twenty) some of the sequences have no amino acid counterp a rt .there is no apparent trend that would suggest that genome length is correlated with our expectations about complexity. a single number is given for the information contained in the genome. We now proceed to discuss limitations in this approach. It may be relevant to the structural properties of DNA. coli) Fungi Plants Insects Fish (bony) Frog and Toad Mammals Man
Genome length (base pairs) 10 –10 107–108 108–1011 108–7x109 5x108–5x109 109–1010 2x109–3x109 3x109
6 7
Complexity (bits) 107 108 3x108–3x1011 109 3x109 1010 1010 1010
Table 8. While this estimate neglects many corrections.Complexity estimation
767
Organism Bacteria (E.3. the estimate is nearly 10 10 bits. Direct forms of compression: as presently understood . Therefore as a rough estimate. b.Specific numbers are given in Table 8. its length is measured in base pairs. Nevertheless. What is more remarkable is that there is no systematic trend of increasing genome length that parallels our expectations of increasing organism complexity based on estimates of the last section.4. The list of approximations given below is not meant to be exhaustive. there are a number of assumptions that we are making about the organism that give a larger uncertainty than some of the corrections that we can apply. a. Except for plants. it is likely that information in most of the base pairs that are non-coding is not essential for organism behavior. We see that for a human being. they can be replaced by many other possible base pair sequences without effect. where there is a particularly wide range of genome lengths. 
plementary nucleotide chains in a double helix. Genome lengths and ranges are representative. because the accuracy does not justify more specific numbers.D NA is primarily utilized through transcription to a sequence of amino acids. It may also contain other useful information not directly relevant to protein sequence. this correction would r educe the estimated complexity by a factor of two to three.

there is more than one sequence that maps onto the same amino acid. This redundancy means that there is less information in the DNA sequence. Taking this into account by assigning a triple of bases to one of twenty characters that represent amino acids would give a new estimate of (N/3)log(20) = 1.4N, with the logarithm taken base 2. To improve the estimate further, we would include the relative probability of the different amino acids, and correlations between them.

c. General compression: more generally, we can ask how compressed the DNA encoding of information is. We can rely upon a basic optimization of function in biology. This might suggest that some degree of compression is performed in order to reduce the complexity of transmission of the information from generation to generation. However, this is not a proof, and one could also argue in favor of redundancy in order to avoid susceptibility to small changes. Moreover, there are likely to be inherent limitations on the compressibility of the information due to the possible transcription mechanisms that serve instead of decompression algorithms. For example, if a molecule that is to be represented has a long chain of the same amino acid, e.g., asp-asp-asp-asp-asp-asp-asp-asp-asp-asp-asp-asp-asp-asp-asp-asp-asp-asp, it would be interesting if this could be represented using a chemical equivalent of (18)asp. This requires a transcription mechanism that repeats segments: a DNA loop. There are organisms that are known to have highly repetitive sequences (e.g., 10^7 repetitions) forming a significant fraction of their genome. Much of this may be non-coding DNA.

Other forms of compression may also be relevant. For example, we can ask if there are protein components/subchains that can be used in more than one protein. This is relevant to the general redundancy of protein design. There is evidence that the genome does use this property for compression by overlapping the regions that code for several different proteins. A particular region of DNA may have several coding regions that can be combined in different ways to obtain a number of different proteins. Transcription may start from distinct initial points. Presumably, the information that describes the pattern of transcriptions is represented in the noncoding segments that are between the coding segments.

Related to the issue of DNA code compression are questions about the complexity of protein primary structure in relation to its own function: specifically, how much information is necessary to describe the function of a protein. This may be much less than the information necessary to specify its primary structure (amino acid sequence). This discussion is approaching issues of the scale at which complexity is measured: at the atomic scale where the specific amino acid is relevant, or at the molecular scale at which the enzymatic function is relevant. We will mention this limitation again in point (d).

d. Scale of representation: the genome codes for macromolecular and cellular function of the biological organism. This is much less than the microscopic entropy, since it does not code the atomic vibrations or molecular diffusion. However, since our concern is for the organism’s macroscopic complexity, the DNA is likely to be coding a far greater complexity than we are interested in for multicellular
organisms. The assumption is that much of the cellular chemical activity is not relevant to a description of the behavior on the scale of the organism. If the DNA were representing the sum of the molecular or cellular scale complexity of each of the cells independently, then the error in estimating the complexity would be quite large. However, the molecular and cellular behavior is generally repeated throughout the organism in different cells. Thus, the DNA is essentially representing the complexity of a single cellular function with the additional complication of representing the variation in this function. To the extent that the complexity of cellular behavior is smaller than that of the complete organism, it may be assumed that the greatest part of the DNA code represents the macroscale behavior. On the other hand, if the organism behavior is comparatively simple, the greater part of the DNA representation would be devoted to describing the cellular behavior.

e. Completeness of representation: we have assumed that DNA is the only source of cellular information. However, during cell division not only the DNA is transferred but also other cellular structures, and it is not clear how much information is necessary to specify their function. It is clear, however, that DNA does not contain all the information. Otherwise it would be possible to transfer DNA from one cell into any other cell and the organism would function through control by the DNA. This is not the case. Nevertheless, it may very well be that the description of all other parts of the cell, including the transcription mechanisms, only involves a small fraction of the information content compared to the DNA (for example, 10^4–10^6 bits compared to 10^7–10^11 bits in DNA). Similar to our point (d), the information in cellular structures is more likely to be irrelevant for organisms whose complexity is high. We could note also that there are two sources of DNA in the eukaryotic cell, nuclear DNA and mitochondrial DNA. The information in the nuclear DNA dominates over the mitochondrial DNA, and we also expect it to dominate over other sources of cellular information. It is possible, however, that the other sources of information approach some fraction (e.g., 10%) of the information in the nuclear DNA, causing a small correction to our estimates.

f. We have implicitly assumed that the development process of a biological organism is deterministic and uniquely determined by the genome. Randomness in the process of development gives rise to additional information in the final structure that is not contained in the genome. Thus, even organisms that have the same DNA are not exactly the same. In humans, identical twins have been studied in order to determine the difference between environmental and genetic influence. Here we are not considering the macroscale environmental influence, but rather the microscale influence. This influence begins with the randomness of molecular vibrations during the developmental process. The additional information gained in this way would have to play a relatively minor functional role if there is significance to the genetic control over physiology. Without considering different scales of structure or behavior, a complete estimate of the complexity of a system must include this information. However, on the macroscale we should
not expect the microscopic randomness to affect the complexity by more than a factor of 2, and more likely the effect is not more than 10% in a typical biological organism.

g. We have also neglected the macroscale environmental influences on behavior. These are usually described by adaptation and learning. For most biological organisms, the environmental influences on behavior are believed to be small compared to genetic influences. Instinctive behaviors dominate. This is not as true about many mammals and even less true about human beings. Therefore, the genetic estimate becomes less reliable as an upper bound for human beings than it is for lower animals. This point will be discussed in greater detail below.

We can see that the assumptions discussed in (a), (b), (c) and (d) would lead to the DNA length being an overly large estimate of the complexity. Assumptions discussed in (e), (f) and (g) imply it is an underestimate.

One of the conceptual difficulties that we are presented with in considering genome length as a complexity estimate is that plants have a much higher DNA length than animals. This is in conflict with the conventional wisdom that animals have a greater complexity of behavior than plants. We might adopt one of two approaches to understanding this result: first, that plants are actually more complex than animals, and second, that the DNA representation in plants does not make use of, or cannot make use of, compression algorithms that are present in animal cells.

If plants are systematically more complex than animals, there must be a general quality of plants that has higher descriptive and behavioral complexity. A candidate for such a property is that plants are generally able to regenerate after injury. This inherently requires more information than the reliance upon a specific time history for development. In essence, there must be some form of actual blueprint for the organism encoded in the genome that takes into account many possible circumstances. From a programming point of view, this is a multiply reentrant program. To enable this feature may very well be more complex, or it may require a more redundant (longer) representation of the same information. It is presumed that the structure of animals has such a high intrinsic complexity that representation of a fully regenerative organism would be impossible. This idea might be checked by considering the genome length of animals that have greater ability to regenerate. If they are substantially longer than similar animals without the ability to regenerate, the explanation would be supported. Indeed, the salamander, which is the only vertebrate with the ability to regenerate limbs, has a genome of 10^11 base pairs. This is much larger than that of other vertebrates, and comparable to that of the largest plant genomes.

A more general reason for the high plant genome complexity that is consistent with regeneration would be that plants have systematically developed a high complexity on smaller (molecular and cellular) rather than larger (organismal) scales. One reason for this would be that plant immobility requires the development of complex molecular and cellular mechanisms to inhibit or survive partial consumption by other organisms. By our discussion of the complexity profile in Section 8.3, a high complexity on small scales would not allow a high complexity on larger scales. This
explanation would also be consistent with our understanding of the relative simplicity of plants on the larger scale.

The second possibility is that there exists a systematic additional redundancy of the genome in plants. This might be the result of particular proteins with chains of repetitive amino acids. A protein formed out of a long chain of the same amino acid might be functionally of importance in plants, and not in animals. This is a potential explanation for the relative lengths of plant genome and animal genome.

One of the most striking features of the genome lengths found for various organisms is their relative uniformity. Widely different types of organisms have similar genome lengths, while similar organisms may have quite different genome lengths. One explanation for this that might be suggested is that genome lengths have increased systematically with evolutionary time. It is hard, however, to see why this would be the case in all but the simplest models of evolution. It makes more sense to infer that there are constraints on the genome lengths that have led it to gravitate toward a value in the range 10^9–10^10. Increases in organism complexity then result from fewer redundancies and better compression, rather than longer genomes. In principle, this could account for the pattern of complexities we have obtained.

Regardless of the ultimate reason for various genome lengths, in each case the complexity estimate from genome length provides an upper bound to the genetic component of organism complexity (cf. points (e), (f) and (g) above). Thus, the human genome length provides us with an estimate of human complexity.

8.4.3 Component counting

The objective of complexity estimation is to determine the behavioral complexity of a system as a whole. However, one of the important clues to the complexity of the system is its composition from elements and their interactions. By counting the number of elements, we can develop an understanding of the complexity of the system. However, as with other estimation methods, it must be understood that there are inherent problems in this approach. We will find that this method gives us a much higher estimate than the other methods. In using this method we are faced with the dilemma that lies at the heart of the ability to understand the nature of complex systems: how does complex behavior arise out of the component behavior and their interactions? The essential question that we face is: Assuming that we have a system formed of N interacting elements that have a complexity C0 (or a known distribution of complexities), how can the complexity C of the whole system be determined? The maximal possible value would be NC0. However, as we discussed in Section 8.3, this is reduced both by correlations between elements and by the change of scale from that of the elements to that of the system. We will discuss these problems in the context of estimating human complexity.

If we are to consider the behavioral complexity of a human being by counting components, we must identify the relevant components to count. If we count the number of atoms, we would be describing the microscopic complexity. On the other hand, we cannot count the number of parts on the scale of the organism (one) because the problem in determining the complexity remains in evaluating C0. Thus
the objective is to select components at an intermediate scale. Of the natural intermediate scales to consider, there are molecules, cells and organs. We will tackle the problem by considering cells and discuss difficulties that arise in this context. The first difficulty is that the complexity of behavior does not arise equally from all cells. It is generally understood that muscle cells and bone cells are largely uniform in structure. They may therefore collectively be described in terms of a few parameters, and their contribution to organism behavior can be summarized simply. In contrast, the behavior of the system on the scale of the organism is generally attributed to the nervous system. Thus, aside from an inconsequential number of additional parameters, we will consider only the cells of the nervous system. If we were considering the behavior on a smaller length scale, then it would be natural to also consider the immune system.

In order to make more progress, we must discuss a specific model for the nervous system and then determine its limitations. We can do this by considering the behavior of a model system we studied in detail in Chapter 2: the attractor neural network model. Each of the neurons is a binary variable. Its behavior is specified by whether it is ON or OFF. The behavior of the network is, however, described by the values of the synapses. The total complexity of the synapses could be quite high if we allowed the synapses to have many digits of precision in their values, but this does not contribute to the complexity of the network behavior. Given our investigation of the storage of patterns in the network, we can argue that the maximal number of independent parameters that may be specified for the operation of the network consists of the neural firing patterns that are stored. This corresponds to c N^2 bits of information, where N is the number of neurons, and c ≈ 0.14 is a number that arose from our analysis of network overload.

There are several problems with applying this formula to biological nervous systems. The first is that the biological network is not fully connected. This means that the storage capacity of the network is smaller, as we discussed in Chapter 2, and should scale with the number of synapses. We could apply a similar formula to the network assuming only the number of synapses Ns that are present for a neuron. This gives a value c Ns N. For the human brain where Ns has been estimated at 10^4 and N ≈ 10^11, this would give a value of 0.1 × 10^4 × 10^11 = 10^14 bits. The problem with this estimate is that in order to specify the behavior of the network, we need to specify not only the imprinted patterns but also which synapses are present and which are absent. Listing the synapses that are present would require a set of number pairs that would specify which neurons each neuron is attached to. This list would require roughly N Ns log(N) = 3 × 10^16 bits, which is larger than the number of bits of information in the storage itself. This estimate may be reduced by a small amount if, as we expect, the synapses of a neuron on average largely connect to neurons that are nearby. We will use 10^16 as the basis for our complexity estimate.

The second major problem with this model is that real neurons are far from binary variables. Indeed, a neuron is a complex system. Each neuron responds to particular neurotransmitters, and the synapse between two specific neurons is different from other synapses. How many parameters would be needed to describe the behavior of an individual neuron, and how relevant are these parameters to the complexity
of the whole system? Naively, we might think that taking into account the complexity of individual neurons gives a much higher complexity than that considered above. However, this is not the case. We assume that the parameters necessary to describe an individual neuron correspond to a complexity C0, and it is necessary to specify the parameters of all of the neurons. Then the complexity of the whole system would include C0N bits for the neurons themselves. This would be greater than 10^16 bits only if the complexity of the individual neurons were larger than 10^5. A reasonable estimate of the complexity of a neuron is roughly 10^3–10^4 bits. With N ≈ 10^11 this would give a value of C0N = 10^14–10^15 bits, which is still small by comparison with 10^16 bits. By these estimates, the complexity of the internal structure of a neuron is not greater than the complexity of its interconnections.

Similarly, we should consider the complexity of a synapse, which multiplies the number of synapses. Synapses are significantly simpler than the neurons. We may estimate their complexity as no more than 10 bits. This would be sufficient to specify the synaptic strength and the type of chemicals involved in transmission. Multiplying this by the total number of synapses (10^15) gives 10^16 bits. This is the same as the information necessary to specify the list of synapses that are present.

Combining our estimates for the information necessary to specify the structure of neurons, the structure of synapses and the list of synapses present, we obtain an estimate for complexity of 10^16 bits. This estimate is significantly larger than the estimate found from the other two approaches. As we mentioned before, there are two fundamental difficulties with this approach that make the estimate too high: correlations among parameters and the scale of description. Many of the parameters enumerated above are likely to be the same, giving rise to the possibility of compression of the description. Both the description of an individual neuron and the description of the synapses between them can be drastically simplified if all of them follow a pattern. For example, the visual system involves processing of a visual field where the different neurons at different locations perform essentially the same operation on the visual information. Even if there are smooth variations in the parameters that describe both the neuron behavior and the synapses between them, we can describe the processing of the visual field in terms of a small number of parameters. Indeed, one would guess (an intuition-based estimate) that processing of the visual field is quite complicated (more than 10^2 bits) but would not exceed 10^3–10^5 bits altogether. Since a substantial fraction of the number of neurons in the brain is devoted to initial visual processing, the use of this reduced description of the visual processing would reduce the estimate of the complexity of the whole system. Nevertheless, the initial visual processing does not involve more than 10% of the number of neurons. Even if we eliminate all of their parameters, the estimate of system complexity would not change. However, the idea behind this construction is that whenever there are many neurons whose behavior can be grouped together into particular functions, then the complexity of the description is reduced. Thus if we can describe neurons as belonging to a particular class of neurons (category or stereotype), then the complexity is reduced. It is known that neurons can be categorized; however, it is not clear how many parameters remain once this categorization has been done.
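
The component-counting figures in this section are products of a few order-of-magnitude inputs, so they are easy to check. The following is a minimal sketch, not part of the original text. The function and constant names are hypothetical, and the base-2 logarithm in the synapse list is an assumption made so that the result is in bits.

```python
import math

# Order-of-magnitude inputs quoted in the text; none are measurements.
N_NEURONS = 1e11            # N, number of neurons
SYNAPSES_PER_NEURON = 1e4   # Ns, synapses per neuron
C_ROUNDED = 0.1             # the text rounds c = 0.14 down to 0.1 in its arithmetic


def pattern_storage_bits(n, ns, c=C_ROUNDED):
    """Stored-pattern estimate c * Ns * N for a network that is not fully connected."""
    return c * ns * n


def synapse_list_bits(n, ns):
    """Bits to list every synapse: N * Ns entries, each naming one of N neurons.

    The base-2 logarithm is an assumption of this sketch.
    """
    return n * ns * math.log2(n)


def synapse_parameter_bits(n, ns, bits_per_synapse=10):
    """Bits for synaptic strength and chemistry at about 10 bits per synapse."""
    return n * ns * bits_per_synapse


def neuron_bits_needed_to_dominate(total_bits, n):
    """Per-neuron complexity C0 at which C0 * N would reach total_bits."""
    return total_bits / n


if __name__ == "__main__":
    n, ns = N_NEURONS, SYNAPSES_PER_NEURON
    print(f"pattern storage   : {pattern_storage_bits(n, ns):.1e} bits")
    print(f"synapse list      : {synapse_list_bits(n, ns):.1e} bits")
    print(f"synapse parameters: {synapse_parameter_bits(n, ns):.1e} bits")
    print(f"C0 needed to reach 1e16 bits: {neuron_bits_needed_to_dominate(1e16, n):.1e}")
```

With these inputs the synapse list and the synapse parameters both land near 10^16 bits, while pattern storage alone is about two orders of magnitude smaller. That is why the connectivity, rather than the stored patterns, sets the scale of the estimate.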

When we think about grouping the neurons together, we might also realize that this discussion is relevant to the consideration of the influence of environment and genetics on behavior. If the number of parameters necessary to describe the network greatly exceeds the number of parameters in the genetic code, which is only 10^10 bits, then many of these parameters must be specified by the environment. We will discuss this again in the next section.

On a more philosophical note, we comment that parameters that describe the nervous system also include the malleable short-term memory. While this may be a small part of the total information, our estimate of behavioral complexity should raise questions such as, How specific do we have to be? Should the content of short-term memory be included? The argument in favor would be that we need to represent the human being in entirety. The argument against would be that what happened in the past five minutes or even the past day is not relevant and we can reset this part of the memory. Eventually we may ask whether the objective is to represent the specific information known by an individual or just his or her “character.”

We have not yet directly addressed the role of substructure (Chapter 2) in the complexity of the nervous system. In comparison with a fully connected network, a network with substructure is more complex because it is necessary to specify the substructure, or more specifically which neurons (or which information) are proximate to which. However, in a system that is subdivided by virtue of having fewer synapses between subdivisions, once we have counted the information that is present in the selection of synapses, as we have done above, the substructure of the system has already been included.

The second problem of estimating complexity based on component counting is that we do not know how to reduce the complexity estimate based upon an increase of the length scale of observation. The estimate we have obtained for the complexity of the nervous system is relevant to a description of its behavior on the scale of a neuron (it does, however, focus on cellular behavior most relevant to the behavior of the organism). In order to overcome this problem, we need a method to assess the dependence of the organism behavior on the cellular behavior. A natural approach might be to evaluate the robustness of the system behavior to changes in the components. Human beings are believed to lose approximately 10^6 neurons every day (even without alcohol), corresponding to the loss of a significant fraction of the neurons over the course of a lifetime. This suggests that individual neurons are not crucial to determining human behavior. It implies that there may be a couple of orders of magnitude between the estimate of neuron complexity and human complexity. However, since the daily loss of neurons corresponds only to a loss of 1 in 10^5 neurons, we could also argue that it would be hard for us to notice the impact of this loss. In any event, our estimate based upon component counting, 10^16, is eight orders of magnitude larger than the estimates obtained from text and six orders of magnitude larger than the genome-based estimate. To account for this difference we would have to argue that 99.999% of neuron parameters are irrelevant to human behavior. This is too great a discrepancy to dismiss based upon such an argument.
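
The gaps between the three estimates, and the size of the neuron-loss argument, can be made concrete with a short calculation. This is a sketch, not part of the original text. The names are hypothetical, N = 10^11 neurons is the figure used earlier in the section, and the 80-year lifetime and 365-day year are assumptions introduced only for illustration.

```python
import math

# Estimates quoted in the text, in bits (orders of magnitude only).
ESTIMATES_BITS = {
    "language": 1e8,
    "genome": 1e10,
    "component counting": 1e16,
}


def orders_of_magnitude(larger, smaller):
    """Number of powers of ten separating two estimates."""
    return math.log10(larger / smaller)


def daily_loss_fraction(lost_per_day=1e6, total_neurons=1e11):
    """Fraction of all neurons lost in one day."""
    return lost_per_day / total_neurons


def lifetime_loss_fraction(years, lost_per_day=1e6, total_neurons=1e11):
    """Fraction of neurons lost over a lifetime of the given length.

    Assumes a constant loss rate and 365 days per year.
    """
    return years * 365 * lost_per_day / total_neurons


if __name__ == "__main__":
    counting = ESTIMATES_BITS["component counting"]
    for name in ("language", "genome"):
        gap = orders_of_magnitude(counting, ESTIMATES_BITS[name])
        print(f"component counting vs {name}: {gap:.0f} orders of magnitude")
    print(f"daily loss fraction: {daily_loss_fraction():.0e}")
    print(f"80-year loss fraction: {lifetime_loss_fraction(80):.2f}")
```

Under these assumptions the daily loss is 1 part in 10^5, while the cumulative loss over 80 years is roughly 30% of the neurons. A loss of that size removes well under one order of magnitude of components, which is far short of the six to eight orders separating the estimates.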

Finally, we can demonstrate that 10^16 is too large an estimate of complexity by considering the counting of time rather than the counting of components. We consider a minimal time interval of describing a human being to be of order 1 second, and we allow for each second 10^3 bits of information. There are of order 10^9 seconds in a lifetime. Thus we conclude that only, at most, 10^12 bits of information are necessary to describe the actions of a human. This estimate assumes that each second is independently described from all other seconds, and no patterns of behavior exist. This would seem to be a very generous estimate. We can contrast this number with an estimate of the total amount of information that might be imprinted upon the synapses. This can be estimated as the total number of neuronal states over the course of a lifetime. For a neuron reaction time of order 10^-2 seconds, 10^11 neurons, and 10^9 seconds in a lifetime, we have 10^22 bits of information. Thus we see that the total amount of information that passes through the nervous system is much larger than the information that is represented there, which is larger than the information that is manifest in terms of behavior. This suggests either that the collective behavior of neurons requires redundant information in the synapses, or that the actions of an individual do not fully represent the possible actions that the individual would take under all circumstances. The latter possibility returns us to the discussion of Eq. (8.3.47) and Eq. (8.3.59), where we commented that the expression is an upper bound, because information may cycle between scales or between system and environment, as discussed in Section 8.3.6. Under these circumstances, the potential complexity of a system under the most diverse set of circumstances is not necessarily the observed complexity. Both of our approaches to component counting (spatial and temporal) may overestimate the complexity due to this problem.

8.4.4 Complexity of human beings, artificial intelligence, and the soul

We begin this section by summarizing the estimates of human complexity from the previous sections, and then turn to some more philosophical considerations of its significance. We have found that the microscopic complexity of a human being is in the vicinity of 10^30 bits. This is much larger than our estimates of the macroscopic complexity: language-based 10^8 bits, genome-based 10^10 bits and component (neuron) counting 10^16 bits. As discussed at the end of the last section, we replace the spatial component-counting estimate with the time-counting upper bound of 10^12 bits. We will discuss the discrepancies between these numbers and conclude with an estimate of 10^(10±2) bits.

We can summarize our understanding of the different estimates. The language-based estimate is likely to be somewhat low because of the inherent compression achieved by language. One way to say this is that a college education, consisting of 30 textbooks, is based upon childhood learning (nonlinguistic and linguistic) that provides meaning to the words, and therefore contains comparable or greater information. The genome-based complexity is likely to be a too-large estimate of the influence of genome on behavior, because genome information is compressible and because much of it must be relevant to molecular and cellular function. The component-counting estimate suggests that the information obtained from experience is much larger than the information due to the genome: specifically, that genetic information cannot specify the parameters of the neural network. This is consistent with our discussion in Section 3.2.11 that suggested that synapses store learned information while the genome determines the overall structure of the network. We must still conclude that most of the network information is not relevant to behavior at the larger scale. It is redundant, and/or does not manifest itself in human behavior because of the limited types of external circumstances that are encountered. Because of this last point, the complexity for describing the response to arbitrary circumstances may be higher than the estimate that we will give, but should still be significantly less than 10^16 bits.

Our estimate of the complexity of a human being is 10^(10±2) bits. The error bars essentially bracket the values we obtained. The main final caveat is that the difficulty in assessing the possibility of information compression may lead to a systematic bias to high complexities. For the following discussion, the actual value is less important than the existence of an estimate.

Consideration of the complexity of a human being is intimately related to fundamental issues in artificial intelligence. The complexity of a human being specifies the amount of information necessary to describe and, given an environment, predict the behavior of a human being. There is no presumption that the prediction would be feasible using present technology. However, in principle, there is an implication of its possibility. Our objective here is to briefly discuss both philosophical and practical implications of this observation.

The notion of reproducing human behavior in a computer (or by other artificial means) has traditionally been a major domain of confrontation between science and religion, and science and popular thought. Some of these conflicts arise because of the supposition by some religious philosophers of a nonmaterial soul that is presumed to animate human beings. Such nonmaterial entities are rejected in the context of science because they are, by definition, not measurable. It may be helpful to discuss some of the alternate approaches to the traditional conflict that bypass the controversy in favor of slightly modified definitions. Specifically, we will consider the possibility of a scientific definition of the concept of a soul. We will see that such a concept is not necessarily in conflict with notions of artificial intelligence. Instead it is closely related to the assumptions of this field.

One way to define the concept of soul is as the information that describes completely a human being. We have just estimated the amount of this information. To understand how this is related to the religious concept of soul, we must realize that the concept of soul serves a purpose. When an individual dies, the existence of a soul represents the independence of the human being from the material of which he or she is formed. If the material of which the human being is made were essential to its function, then there would be no independent functional description. Also, there would be no mechanism by which we could reproduce human behavior without making use of precisely the atoms of which he or she was formed. In this way the description of a soul suggests an abstraction of function from matter which is consistent with abstractions that are familiar in science and modern thought, but might not be consis-

Which human being did Turing have in mind? We can go beyond this objection by recognizing that in order to fool us into thinking that the computer is a human being. we suspect that the real conflict between the approaches resides in a different place. a profession. we may also ask whether the represented human being is someone we already know. except for a very casual conversation. the computer would have to represent a single human being with a name. we anticipate that the features characteristic of human behavior are predominantly sp ecific to each individual rather than common. these atoms may be replaced by other indistinguishable atoms and the same behavior will be found. not an abstract notion of intelligence.A human being is not directly tied to the material of which he is made. Instead there is a functional description that can be implemented in various media. The simplest possible abstraction would be to state (as is claimed by physics) that the specific atoms of which the human being are formed are not necessary to his or her function. It would be quite easy to reproduce the conversation of a mute individual. or someone we do not know. In contrast. Artificial intelligence takes this a large step further by stating that there are other possible media in which the same behavior can be realized. a crucial distinction between the religious view and some of the practical approaches of artificial intelligence.a family history. This difference is related to the notion of a universal artificial intelligence. which is conceptually similar to the model of universal Turing machines. when we met him or her. There is. or even an obsessed individual.the religious view is t ypically focused on the individual identity of an individual human being as manifest in a unique soul.Complexity estimation
777
tent with more primitive notions of matter. The Turing test suggests that in a conversation with a computer we may not be able to distinguish it from a human being. While we bypassed the fundamental controversy between science and religion regarding the presence of an immaterial soul. A key problem with this prescription is that there is no specification of which human being is to be modeled. We have discussed in Chapter 3 that our models of human beings are to be understood as nonuniversal and would indeed be better realized by the concept of representing individual human beings rather than a generic artificial intelligence. of which one possible medium is the biological body that the human being was implemented in. Viewed in this light. However. the statement of the existence of a soul appears to be the same as the claim of artificial intelligence—that a human being can be reproduced in a different form by embodying the function rather than the mechanism of the human being. A primitive concept of matter might insist that the matter of which we are formed is essential to our functioning. and interactions are of varied levels of intimacy. opinions and a personality. prior to the test. Instead. Thus the objective of creating artificial human beings might be better described as that of manifesting the soul of an individual human. Finally. We can illustrate this change in perspective by considering the Turing test for recognizing artificial intelligence. According to this view there is a generic model for intelligence that can be implemented in a computer. Human beings have varied complexity. There are common features to the information p rocessing of different individuals.however. This conflict is in the question of the
.

intrinsic value of a human being and his place in the universe. Both the religious and popular view would like to place an importance on a human being that transcends the value of the matter of which he is formed. Philosophically, the scientific perspective has often been viewed as lowering human worth. This is true whether it is physical scientists that view the material of which man is formed as "just" composed of the same atoms as rocks and water, or whether it is biological scientists that consider the biochemical and cellular structures as the same as, and derived evolutionarily from, animal processes.

The study of complexity presents us with an opportunity in this regard. A quantitative definition of complexity can provide a direct measure of the difference between the behavior of a rock, an animal and a human being. We should recognize that this capability can be a double-edged sword. On the one hand it provides us with a scientific method for distinguishing man from matter, and man from animal, by recognizing that the particular arrangement of atoms in a human being, or the particular implementation of biology, achieves a functionality that is highly complex. At the same time, by placing a number on this complexity it presents us with the finiteness of the human being. For those who would like to view themselves as infinite, a finite complexity may be humbling and difficult to accept. Others who already recognize the inherent limitations of individual human beings, including themselves, may find it comforting to know that this limitation is fundamental.

As is often the case, the value of a number attains meaning through comparison. Specifically, we may consider the complexity of a human being and see it as either high or low. We must have some reference point with respect to which we measure human complexity. One reference point was clear in the preceding discussion: that of animals. We found that our (linguistic) estimates of human complexity placed human beings quantitatively above those of animals, as we might expect. This result is quite reasonable but does not suggest any clear dividing line between animals and man. There is, however, an independent value to which these complexities can be compared.

The idea of biological evolution and the biological continuity of man from animal is based upon the concept of the survival demands of the environment on man. Let us consider for the moment the complexity of the demands of the environment. We can estimate this complexity using relevant literature. Books that discuss survival in the wild are typically quite short, 3 × 10^5 bits. Such a book might describe more than just basic survival (plants to eat and animal hunting) but also various skills of a primitive life such as stone knives, tanning, basket making, and primitive home or boat construction. Alternatively, a book might discuss survival under extreme circumstances rather than survival under more typical circumstances. Even so, the amount of text is not longer than a rather brief book. While there are many individuals who have devoted themselves to living in the wild, there are no encyclopedias of relevant information. This suggests that in comparison with the complexity of a human being, the complexity of survival demands is small. Indeed, this complexity appears to be right at the estimated dividing line between animal (10^6 bits) and man (10^8 bits). For consistency, we use language-based complexity estimates throughout. It is significant that an ape may have a complexity of ten times the complexity of the environmental demands upon it, but a human being has a complexity of a hundred times this demand.

Another way to arrive at this conclusion is to consider primitive man, or primitive tribes that exist today. We might ask about the complexity of their existence and specifically whether the demands of the survival are the same as the complexity of their lives. From books that reflect studies of such peoples we see that the description of their survival techniques is much shorter than the description of their social and cultural activities. A single aspect of their culture might occupy a book, while the survival methods do not occupy even a single one.

We might compare the behavior of primitive man with the behavior of animal predators. In contrast to grazing animals, predators satisfy their survival needs in terms of food using only a small part of the day. One might ask why they did not develop complex cultural activities. One might think, for example, of sleeping lions. While they do have a social life, it does not compare in complexity to that of human beings. The explanation that our discussion provides is that while time would allow cultural activities, complexity does not. Thus, the complexity of such predators is essentially devoted to problems of survival. That of human beings is not.

This conclusion is quite intriguing. Several interesting remarks follow. In this context we can suggest that analyses of animal behavior should not necessarily be assumed to apply to human behavior. In particular, any animal behavior might be justified on the basis of a survival demand. While this approach has also often been applied to human beings (the survival advantages associated with culture, art and science have often been suggested), our analysis suggests that this is not justified, at least not in a direct fashion. Human behavior cannot be driven by survival demands if the survival demands are simpler than the human behavior. Of course, this does not rule out that general aspects or patterns of behavior, or even some specific behaviors, are driven by survival demands.

One of the distinctions between man and animals is the relative dominance of instinctive behavior in animals, as compared to learned behavior in man. It is often suggested that human dependence on learned rather than instinctive behavior is simply a different strategy for survival. However, if the complexity of the demands of survival is smaller than that of a human being, this does not hold. We can argue instead that if the complexity of survival demands is limited, then there is no reason for additional instinctive behaviors. Thus, our results suggest that instinctive behavior is actually a better strategy for overcoming survival demands, because it is prevalent in organisms whose behavior arises in response to survival demands. However, once such demands are met, there is little reason to produce more complex instinctive behaviors, and for this reason human behavior is not instinctively driven.

We now turn to some more practical aspects of the implications of our complexity estimates for the problem of artificial intelligence, or the re-creation of an individual in an artificial form. We may start from the microscopic complexity (roughly the entropy) which corresponds to the information necessary to replace every atom in the human being with another atom of the same kind, or alternatively, to represent the atoms in a computer. We might imagine that the computer could simulate the dynamics of the atoms in order to simulate the behavior of the human being. The
practicality of such an implementation is highly questionable. The problem is not just that the number of bits of storage as well as the speed requirements are beyond modern technology. It must be assumed that any computer representation of this dynamics must ultimately be composed of atoms. If the simulation is not composed out of the atoms themselves, but some controllable representation of the atoms, then the complexity of the machine must be significantly greater than that of a human being. Moreover, unless the system is constructed to respond to its environment in a manner similar to the response of a human being, then the computer must also simulate the environment. Such a task is likely to be formally as well as practically impossible.

One central question then becomes whether it is possible to compress the representation of a human being into a simpler one that can be stored. Our estimate of behavioral complexity, 10^(10±2) bits, suggests that this might be possible. Since a CD-ROM contains 5 × 10^9 bits, we are discussing 2 × 10^(±2) CD-ROMs. At the lower end of this range, 0.02 CD-ROMs is clearly not a problem. Even at the upper end, two hundred CD-ROMs is well within the domain of feasibility. Indeed, even if we chose to represent the information we estimated to be necessary to describe the neural network of a single individual, 10^16 bits or 2 million CD-ROMs, this would be a technologically feasible project. We have made no claims about our ability to obtain the necessary information for one individual. However, once this information is obtained, it should be possible to store it. A computer that can simulate the behavior of this individual represents a more significant problem.

Before we discuss the problem of simulating a human being, we might ask what the additional microscopic complexity present in a human body is good for. Specifically, if only 10^10 bits are relevant to human behavior, what are most of the 10^31 bits doing? One way to think about this question is to ask why nature didn't build a similar machine with on the order of 10^10 atoms, which would be significantly smaller. We might also ask whether we would know if such an organism existed. On our own scale, we might ask why nature doesn't build an organism with a complexity of order 10^30. We have already suggested that there may be inherent limitations to the complexity that can be formed. However, there may also be another use of some of the additional large number of microscopic pieces of information.

One possible use of the additional information can be inferred from our arguments about the difference between TM with and without a random tape. The discussion in Section 1.9.7 suggests that it may be necessary to have a source of randomness to allow human qualities such as creativity. This fits nicely with our discussion of chaos in complex system behavior. The implication is that the microscopic information becomes gradually relevant to the macroscopic behavior as a chaotic process. We can assume that most microscopic information in a human being describes the position and orientation of water molecules. In this picture, random motion of molecules affects cellular behavior, specifically the firing of neurons, that ultimately affects human behavior. This does not mean that all of the microscopic information is relevant. Only a small number of bits can be relevant at any time. However, we recognize that in order to obtain a certain number of random bits, there must be a much larger reservoir of randomness. This is one approach to understanding a possible use of the microscopic information content of a human being. Another
approach would ascribe the additional information to the necessary support structures for the complex behavior, but would not attribute to it an essential role as information.

We have demonstrated time and again that it is possible to build a stronger or faster machine than a human being. This has led some people to believe that we can also build a systematically more capable machine, in the form of a robot. We have already argued that the present notion of computers may not be sufficient if it becomes necessary to include chaotic behavior. We can go beyond this argument by considering the problem we have introduced of the fundamental limits to complexity for a collection of molecules. It may turn out that our quest for the design of a complex machine will be limited by the same fundamental laws that limit the design of human beings. One of the natural improvements for the design of deterministic machines is to consider lower temperatures that enable lower error rates and higher speeds, and possibly the use of superconductors. However, the choice of a higher temperature may be required to enable a higher microscopic complexity, which also limits the macroscopic complexity. The mammalian body temperature may be selected to balance two competing effects. At high temperatures there is a high microscopic complexity. However, breaking the ergodic theorem requires low temperatures so that energy barriers can be effective in stopping movement in phase space. A way to argue this point more generally is that the sensitivity of human ears and eyes is not limited by the biological design, but by fundamental limits of quantum mechanics. It may also be that the behavioral complexity of a human being at its own length and time scale is limited by fundamental law. As with the existence of artificial sensors in other parts of the visual spectrum, we already know that machines with other capabilities can be built. However, this argument suggests that it may not be possible to build a systematically more complex artificial organism.

The previous discussion is not a proof that we cannot build a robot that is more capable than a human being. However, any claims that it is possible should be tempered by the respect that we have gained from studying the effectiveness of biological design. In this regard, it is interesting that some of the modern approaches to artificial intelligence consider the use of nanotechnology, which at least in part will make use of biological molecules and methods.

Finally we can say that the concept of an infinite human being may not be entirely lost. Even the lowly TM whose internal (table) complexity is rather small can, in arbitrarily long time and with an infinite storage, reproduce arbitrarily complex behavior. In this regard we should not consider just the complexity of a human being but also the complexity of a human being in the context of his tools. For example, we can consider the complexity of a human being with paper and pen, the complexity of a human being with a computer, or the complexity of a human being with access to a library. Since human beings make use of external storage that is limited only by the available matter, over time a human being, through collaboration with other human beings/generations extending through time, can reproduce complex behavior limited only by the matter that is available. This brings us back to questions of the behavior of collections of human beings, which we will address in Chapter 9.
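
The storage figures quoted earlier in this section (a behavioral complexity of roughly 10^8 to 10^12 bits, a neural network description of 10^16 bits, and a CD-ROM capacity of 5 × 10^9 bits) can be checked with a few lines of arithmetic. The following is only a minimal sketch: the function name `cdroms_needed` and the constant `CDROM_BITS` are illustrative names chosen here, not anything defined in the text, and the lower and upper bounds are simply the ends of the 10^(10±2) range.

```python
# Illustrative check of the CD-ROM counts quoted in the text.
# Assumption: all names below are hypothetical; the numbers come from the text.

CDROM_BITS = 5e9  # capacity of one CD-ROM in bits, as quoted in the text


def cdroms_needed(bits: float, capacity_bits: float = CDROM_BITS) -> float:
    """Return how many CD-ROMs (possibly a fraction) are needed to hold `bits`."""
    return bits / capacity_bits


# Complexity estimates in bits, keyed by a descriptive label.
estimates = {
    "behavioral complexity, lower end (10^8 bits)": 1e8,
    "behavioral complexity, central value (10^10 bits)": 1e10,
    "behavioral complexity, upper end (10^12 bits)": 1e12,
    "neural network description (10^16 bits)": 1e16,
}

for label, bits in estimates.items():
    print(f"{label}: {cdroms_needed(bits):g} CD-ROMs")
```

Dividing each estimate by the quoted capacity reproduces the figures in the text: about 0.02, 2, 200 and 2 million CD-ROMs respectively.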