Below you'll find the transcript of an interview
with Robin Milner, held in Cambridge on 3 September 2003. The
interview was conducted by Martin Berger. Thanks
to Kohei Honda, Keith Clarke, Carla Benjamin and Tony Stockman for
their help and suggestions.

When and where were you born and into what kind of
family background?

My father was an army officer. I was born in 1934 in Plymouth on the
south coast and because we were an army family we moved about quite
a lot. I remember we ended up living in Scotland during the war, in
Edinburgh. I think we moved to Edinburgh when I was about five. I
lived in Edinburgh for three years, then I went to school in
Wales. This was a private school which was evacuated to Wales because
of the war and eventually it moved back after the war and so I went
back with it to Kent. It was a school where normally people would
pay a lot of money but we didn't have a lot of money. But I had a
kind of scholarship, because I was being trained so that I should get
a scholarship to one of the so-called public schools after that.

So your parents put emphasis on your academic
achievements?

Well, there was something like that. When I was
thirteen, I took a scholarship exam to one of the public schools, Eton College, which most people
know about. I got in, and that was the beginning of what you might
call a fast-track education for me.

Do you have brothers or sisters?

I have one sister, yes, she's older than I am. She
lives in Wales, she is retired now. She has a family, one son. He has
children. She is a widow, unfortunately. She looks after a herb garden
in Wales.

I was always more interested in maths than anything
else, but as you probably know in those days classical education was
essential in that sort of school.

Did you learn Greek or Latin?

I learned Greek from the age of ten and Latin from
the age of eight and I regarded them both as a kind of defective form
of mathematics because they were taught with such accuracy. But as far
as classical culture is concerned we weren't taught as much as we
might have been. In any case, I wasn't particularly interested in
history. But I could write Latin verse, because Latin verse was very
accurate and there is no doubt what syllables should be long, what
syllables should be short: as I say, a defective form of
mathematics.

Of course I was doing real mathematics too, and I
was allowed to do it full time, as it were, from the age of
sixteen. There was a good teacher at my prep-school, that is, when I
was eight to thirteen, whom I learned a lot from, simply because he
enjoyed it so much. There was at least one other at Eton College; he
was in charge of all the rowing as well, but I remember he was
inspiring: he taught me projective geometry. And he found it so
beautiful that we found it beautiful, too. That was a good
experience. That took me up to the age of eighteen. Then I took a
scholarship to Cambridge. I went to King's.

It was around the time of my army service: two of the people in
my group went to Korea, but I didn't. I went to the Suez Canal. At the
time, the British were still in charge.

Must have been warm there.

It's quite warm and very dry. It doesn't rain
because it is so dry. I didn't enjoy the army particularly, but my
father was in the army, so I took it seriously.

Did your parents understand your fascination
with mathematics or did they just look bemused?

Oh no, they didn't understand mathematics or
scholarship in general, but certainly they supported me.

You could not go to your mother or father and
ask "please explain", say, "probability distributions to me"?

Oh no! No one in the family at all. But that was
fine, because I went to a boarding school. I lived there for eight
months of the year. So I had plenty of people to talk to. Because the
education system is very strange in this country -- we have these
private schools -- they are a hothouse, but one of the things that
you learn is to work extremely hard. The standard of how much work you
do, particularly if you are supposed to be doing that sort of thing,
as opposed to just being a sportsman, is very high. You learn that
it's natural to spend all night thinking about a mathematical problem,
rather than going to sleep. That sort of thing came naturally
there.

At Eton
College you didn't feel you stood out as being particularly
studious?

Oh no, because there was a group of about 70
scholars, who would be separate from the rest of the school and were
regarded as collectively mad by the rest of the school. So that was
alright.

Yes, I studied maths for two years. Because of the
exam I had taken, I left out the first year and started in the
second year, which was probably a mistake because I had lost touch
after two years in the army. But I took the degree essentially after
two years, the main degree paper, although I had to be resident at Cambridge for three years. In 1956 I
went on a course on the EDSAC machine here. I
regarded programming as really rather inelegant. You'd write one
instruction after the other, and it was all rather arbitrary. It
seemed to me that programming was not a very beautiful thing. So I
resolved that I would never go near a computer in my life!

This programming course was part of your
university studies?

No, it was a ten-day course in the summer. Of
course I had not realised what a tremendous achievement the digital
computer was. I didn't know about Turing at the time. So I didn't
know about the logical work. It was because of my own ignorance that
I didn't appreciate what was going on. From October '56 I changed to
philosophy. I took History of Philosophy. I also did a certain amount
of analytical philosophy at that time. I got interested in and puzzled
by Russell's theory
of descriptions and names. It's relevant to what we think about these
days. But when I left Cambridge,
although people suggested I might stay, I was tired of academic
things.

It seems a bit surprising that you could have
been in Cambridge in the 1950s and not be aware of Turing's
groundbreaking work.

Yes it does look strange, doesn't it, but this
groundbreaking work was very much in logic. Interestingly enough, the
computing people in Cambridge were not particularly interested in
the theoretical side of it.

They were not mathematicians?

No. The engineering achievement was the main thing
about the EDSAC computer, which was built in '49. That was a great
achievement, but the link between that work and Turing's work wasn't
really made very obvious. Turing had been a fellow at King's College,
but he died, I discovered later, a few months before I arrived. The
man who taught me maths at King's was a friend of Turing's. But I
didn't realise that because we didn't talk about Turing. So I didn't
know about him at all.

Your mathematical education was not really on
the logical side?

No, logic came much later, when I was about 29,
30. '63, '64. In '64 I began to learn about logic, about recursion
theory and all those things, while I was at City.

I went to London and did some part-time jobs. I
became a school teacher for one year in '59.

Primary or high-school?

This was in a secondary school. A so-called
grammar school. Then I left that, in 1960, after one year and went
into Ferranti, where I did programming for three years.

Despite your objections?

Yes. I didn't move into it with great enthusiasm,
but I realised that I should get a job.

It sounds like you were quite unsure at the
time.

I was. Completely unsure. At Ferranti I looked after
a program library.

That was in London?

This was in London, yes. I made sure that all the
new machines they were selling were actually working. I had a first
experience of writing a compiler or part of a compiler. Then I
thought, perhaps I should get an academic job, which I did: I went to
City University in 1963. And there I had a kind of double life: I
taught maths to engineers, which was very routine, but I also began to
get interested in AI. I was particularly interested in CPL, a
programming language inspired by Christopher Strachey, which
eventually led, via BCPL, to C, which everyone knows.

You didn't have a PhD at the time?

No, I never had a PhD.

It was no problem getting teaching positions
without?

No, no problem. You must realise that City
University had until recently been a College of Advanced Technology
and it had just been made into a University. But in any case, at
that time -- and even now -- you don't have to have a PhD to get a job
in this country. We are fairly unusual in that respect, but I was still
in the minority. I never had the experience of a PhD, and in some ways
perhaps what I did was rather more independent than if I had done a
PhD. I was not guided. I simply guided myself.

I got interested in relational algebra, particularly
because of databases. I wondered what the status of this was --
Tarski's relational algebra. Of course this related to Artificial
Intelligence; I wasn't interested in intelligence per se, but I was
interested in automated question-answering, perhaps question-answering
in relation to a database. That was an example of using algebra or
logic -- an algebraic logic really -- to formulate questions and
answers. I didn't get very far with that. What really sparked me off
was getting interested in program verification and what semantics
might mean.

When I went to Swansea in 1968 I took a research
job, I gave up teaching and became a research assistant with David
Cooper who was head of the department in Swansea. He had a small
group there, working on program verification and automatic
theorem-proving and semantics. That was at the time when Dana Scott
produced his famous domain theory. He gave a series of talks then, in
'69, and I went over to Oxford and heard him. That was very
exciting.

So, in some sense, it began to move very fast. The
idea of a machine proving theorems in logic, and the idea of using
logic to understand what a machine was doing ... this double
relationship began to inspire me because it was clearly not very
simple. The relationship wasn't just an inverse relationship "I
bite you and you bite me": the computer says: "I will automate your
logic" and the logic says "I will tell you what your programs
mean". These are two different relationships.

Of course you already have this in classical
mathematics where you formalise logic, but you also give meaning to
and reason about logic using logic. They claim there are these
hierarchies of (meta-)languages, but actually it's the same thing
really.

Yes, I suppose it's the same thing. I wasn't aware
of that of course. In particular, it comes to life with
computing.

Your interest in relational algebra, databases,
program verification, was it triggered by some specific problems you
needed to solve or was it general intellectual curiosity? Were these
questions in the air at the time?

Oh yes, these were things one could do to get into
the academic side of the subject. I wrote an automatic theorem prover
in Swansea for myself and was shattered by the difficulty of
doing anything interesting in that direction -- and I still am. I
greatly admired Robinson's resolution principle, a wonderful
breakthrough; but in fact the amount of stuff you can prove with
fully automatic theorem proving is still very small. So I have
always been more interested in amplifying human intelligence than in
artificial intelligence.

That means I began to be interested in how one could
verify programs. I knew about Floyd's work. Tony Hoare was working
on that sort of thing, too. I got to know him during that time. Of
course I applied my theorem-prover to Floyd assertions. I generated
verification conditions from programs. But then it took ages to prove
them. My theorem-prover couldn't do it. I then thought "shall I prove
a real program"? So I got a chemist, I forget his name, he had
written a Fortran program in connection with his research, and I
thought, let's prove some things about this. I was fascinated by the
fact that the program was quite difficult; he was using Fortran
arrays to encode his own matrices, but he was embedding them all in
one big Fortran array for efficiency reasons. So you had this
interesting representational problem. The problem of verifying was a
problem of simulation or data representation, and I realised how big
a problem that was going to be.

In fact, out of that I got interested in simulation,
which I did a bit of work on. But only simulation between programs. It
wasn't a very mathematically profound piece of work, though Tony
Hoare was very nice about it. But the idea of something concrete
representing something more abstract was obviously important. So, what
the human does is to abstract from concrete representation, no
matter what that representation is. That's the essence of the
relationship between algorithm and program.

What was the next step for you?

During that period I visited the United States with David
Cooper. We went to Carnegie-Mellon, and there was
Zohar Manna who had been working with McCarthy at Stanford. I spoke
to Zohar Manna about what I had been doing with program schemata and
verification. As a result I got a job with McCarthy from 1970 -- from
the beginning of '71, in fact -- at the AI Lab in Stanford.

The interesting thing there was, they were just
looking for somebody who would do something practical. Scott had
recently produced a kind of logic of domains, a logic of the hierarchy
of domains, a type-theoretic hierarchy. This to me seemed to be
something which was practical. That is what LCF came out of. Since I
was so interested in human-assisted -- or I would say
machine-assisted -- proof, this seemed to be a wonderful opportunity
for a program-oriented logic. It really took off for me when I
realised that I could write down the syntax of a programming
language in this logic and I could write the semantics in the logic. I
could also write down the semantic function in the logic.

That was a domain-theoretic semantics?

Yes. I could write the translator from some
trivial high-level language to some trivial assembly code. I could
write down all four sides of the commuting diagram which says: you
could do compilation and then take the semantics of the target code,
or take the semantics of the source language and encode it in the
target semantics, and you get the same answer. One of the first
things we did in LCF was to verify -- in principle verify -- that
commuting diagram. It was quite a big step. But it was much easier
to verify the compiler than it was to verify a single computer
program!
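In symbols -- the notation here is illustrative, not Milner's own --
the diagram can be stated as follows, writing S for the source
semantics, T for the target semantics, compile for the translator and
encode for the embedding of source meanings into target meanings:

    T(compile(p)) = encode(S(p))    for every source program p

Verifying the compiler means proving this equation within the logic.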

In what sense do you understand "verifying a
compiler"? Does it assume that you have the semantics of assembly
language and the semantics of the high-level language?

Yes.

What did you use as semantics of your assembly
language? The state transitions of some microprocessor?

Yes.

So you went to a very considerable amount of
detail?

There was a certain encoding of the semantic objects
of the source language. It wasn't as though we were the first people
to think about it. The advice was "use algebra"! Essentially, we were
using algebra, the representation of one algebra in another, a lot of
the time in Stanford.

But meanwhile I got somehow interested, and I don't
know how, in concurrency. I remember that, without linking it to
verification particularly, I wondered about interacting automata. I
had an idea that automata theory was inadequate, because it hadn't
said what it was for two automata to interact with each other.
Except for the Krohn-Rhodes Representation Theorem, which said
something about feeding the output of one automaton into another. But
there wasn't interaction between the automata.

You were at Stanford around the time the internet was being
developed in southern California. And what about the work on
object-orientation? They were thinking about sending messages between
objects. Did that influence you?

I don't know. Maybe. I did know simulation
languages. I'd done some work, not published work, but I did
understand simulation languages back in '63, '65 because Simula was
invented in '63 or '64 or something like that. We had a lot of
simulation languages then. Simulation intrigued me because it was
about how you represent the real world inside a computer. I
remember being puzzled by how to define the agents in a simulated
process. For example if you have humans passing buckets down the
chain, then you simulate the humans, but shouldn't you simulate the
buckets in exactly the same way? From the bucket's point of view, the
humans are moving past the bucket, from the human's point of view, the
buckets are moving past the humans. So there was some extraordinary
looseness about what could be meant by representing a real-world
process.

If you approach things from the simulation
angle, then many models of computation must be inadequate:
Turing-Machines, Lambda-Calculus, Shannon's theorem ... They all are
extremely informal about what it means to get information from one
entity to another.

You are probably right, but knowing about simulation
languages must have been one of the reasons that I thought automata
ought to interact with one another. Of course I didn't know about
Petri's work, which again began in '63. I didn't know that at all. But
what struck me later was that the great thing about Petri was that he
had actually worried about automata theory and what interaction
between automata might mean. Here is one transition diagram and here
is another transition diagram, but this transition in the left diagram
must coincide with that transition in the right diagram. And that
sharing-of-a-transition is how Petri represented
communication.

The intriguing thing about Petri's work is that he
was talking about how two automata could interact, and he then put the
whole thing into one Petri-net; he didn't do it in a modular way. But
the fact that he used this to represent office systems and real-world
information systems showed that he had set his sights really quite
high.

Wasn't he interested in formal models of physical
information exchange?

I think so, but he also applied his model to
office systems. I don't honestly remember when I became aware of his
work. Certainly, when I worked on CCS much later, I was aware of
it. In the late '70s, I visited him, but by that time I was much more
interested in modular semantics, because the great thing about
denotational semantics was that it was compositional in the most
accurate sense of the word "compositional".

The concurrency theory at the time was more of a
side-issue?

It was, but secretly I realised then that working
in verification and automatic theorem-proving was to me very exciting
but it wasn't getting to the heart of computation theory. What it was
doing was providing assistance to people who had real concerns about
correctness, but it was Dana Scott's work that was getting to the
heart of computation and the meaning of computation. I wanted to do
something which would carry that understanding into the wider
field. I was concerned not only to work with computer-assisted proof,
though that was going to be a very important part of it, but also to ask
what the limits of computation might be. This interaction business began
to seem to me to be breaking the mould, breaking the limits of what
people had been trying to understand in a mathematical way up to that
point.

Was it easy for you to communicate these ideas
to the community at the time? Something like the Church-Turing Thesis
must have been very powerful. I can imagine people saying "everything
is a Turing-Machine. What's the need for other models"? Did you
encounter this attitude or did people react with an open mind to your
suggestions?

Some people did. I went to a conference in '73 in
Pisa. This was a working conference. Italians were also very
interested at that time. So my first paper on processes, which I was
trying to do in a domain-theoretic way, was about that
time. Semanticists were trying to apply semantic ideas of a familiar
kind to concurrency, which was an unfamiliar area for them.

Did you get a temporary or permanent position in
Edinburgh?

I got a permanent position in 1973. That was coming
back from the States, which was a good thing because we wanted our
family to be educated in Europe, not in America.

Before we go to Edinburgh, can you say a little
bit more about your own family? When did you get married?

Yes. I got married in '63. We had our children while
I was living in London, working in City University, between '63 and
'68.

How many children have you got?

I had three, but one of my children died. I have two
now, a son and a daughter, and each of them has a daughter. My
daughter lives in Scotland now. She teaches yoga and previously ran
a cafe which trains handicapped people. Anyway, we go up there quite a
lot, seeing that family. And we also see my son's daughter when we go
to Australia, which we don't do very often.

Your son lives in Australia?

No, he lives here, you see. But he goes to Australia
to see her.

What does your son do?

He's a teacher, in Reading.

You didn't inspire your children to be
Pi-calculists?

I didn't try. I probably avoided trying in the early
days; on the whole I'm quite happy to leave it to people to discover
it themselves if they have any particular inspiration in that
direction.

I spent most of my time in university doing music;
as a student in Cambridge I didn't meet many mathematicians. Most of
my social life was to do with playing chamber music. I'm an oboist, or
I was then. I was wondering if I should be a musician. I was wondering
if I should train properly, late, and be a professional musician.

You played to quite a high standard.

Yes, I was quite good.

What kind of music interests you?

Mostly classical music.

I mean within classical music: 20th century?
Bach? ...

Everything. Everything. I think there are wonderful
pieces of music from every style and there are awful pieces of
music. There are awful pieces of Beethoven and there are wonderful
pieces of Beethoven. I was listening to Dido and Aeneas the other
night; Purcell is brilliant, but not always. Bach is occasionally
boring and some 20th century music is wonderful, some Brahms is awful,
some Brahms is actually wonderful. I think you must recognise what is
really exciting despite -- or independently of -- the idiom.

Why did you not become a musician in the end?

I don't know. I think because I don't compose music,
I wasn't very good at that, although I understand about harmony very
well, but I wasn't very creative. Other things? I realised that I was
perhaps training a bit late. But I never wanted to do mathematics as
a hobby, so I decided to keep music as a hobby! I played in good
amateur orchestras.

You don't play anymore?

No I don't play anymore.

Do you get asked if there's a connection between
your formal work and your musical ability?

My answer is: yes, why not? Certainly mathematicians
are interested in music. Perhaps they are interested in the structure
of music. Whether mathematics has any relationship to what you would
regard as musical phrasing, or tone-colour or passion in music, I
doubt.

Or the other way round: is your mathematical
creativity linked to your musical ability?

I don't know. I think one has a certain idea about
what's elegant. And that's close to what you do, though not exactly the
same thing; but I certainly recognise a lot of inelegant work,
both in computing and in music, including some of my own.

Semantics to me is about aesthetics to a large
degree.

Yes, although there are criteria, stern criteria,
for saying what is correct, what is mathematically viable. For
example, compositionality, in some rigorous sense, you do need
that. But beyond that ...

The Petri-net community says compositionality is
uninteresting.

They have in the past, but they were wrong of
course. They are changing now. But they were right about plenty
of other things, so I'm happy to say they were wrong here, without
being too critical. Petri nets were actually an extraordinarily
important development in understanding concurrency.

I seem to remember that Don Knuth once said his
early students were all musical while his later ones were into
extreme sports. Did you notice similar shifts?

I couldn't make a link with that.

Then you went to Edinburgh to take up a
professorship?

I took up a lectureship.

Why did you want your children to be educated in
Europe?

On the whole, we tended to find that American
children grow up too fast, particularly in California. And we had no
intention of living in America forever ourselves, so if we had stayed
there, it would have been living for ourselves. In any case, how we wanted
to live and how we wanted to bring our children up were connected
strongly, so we wanted to come back. Edinburgh was a good place to go
to. It just happened to have a job at the time.

What was your goal in Edinburgh?

I think I was doing these two things: I went on with
the verification work, but I also wanted to make concurrency the main
intellectual challenge, although the two were very close to each
other. Some first-class people came to work with me on theorem
proving and the LCF stuff: Malcolm Newey, Lockwood Morris, Mike
Gordon, Chris Wadsworth. We did devise a system which itself didn't
immediately get applied a great deal, but Mike Gordon brought it to
Cambridge -- this was the LCF system. He started doing hardware
verification. And then one or two other people began to design
verification systems, or rather systems to perform
computer-assisted proof, on the model of our system, particularly
Constable at Cornell with his Nuprl.

The idea was that you had a metalanguage -- which
we called ML -- in which you were expressing your proof strategies,
and that there was a rigorous discipline in the way that you could
steer your theorem prover so that it could not break the logical
rules. That was the pillar of our design. So much so that we would
refuse to be efficient if that meant breaking the rules. We wouldn't
encode some proof procedure in assembly code or directly in Lisp. It
would all have to go via the execution of the logical rules of
inference. So if you stayed late at night working out a
proof with the computer, you wouldn't have to think afterwards "gosh,
did I cut a few corners?", because the LCF system just wouldn't have
allowed you to do so! Of course, it had some features which would
help you to express your thoughts on how a proof might go.
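The discipline described here is often summarised as the "LCF
architecture". A minimal sketch in modern OCaml -- an illustration of
the idea, not the original code, and with hypothesis tracking elided --
makes theorems an abstract type whose only constructors are inference
rules, so no metalanguage program can forge one:

    (* Theorems are abstract: outside this module, values of type thm
       can only be produced by the exported inference rules. *)
    module Kernel : sig
      type formula = Atom of string | Imp of formula * formula
      type thm                        (* abstract theorem type *)
      val assume : formula -> thm     (* sketch of: A |- A *)
      val mp : thm -> thm -> thm      (* from A -> B and A, infer B *)
      val concl : thm -> formula
    end = struct
      type formula = Atom of string | Imp of formula * formula
      type thm = Thm of formula
      let assume a = Thm a
      let mp (Thm ab) (Thm a) =
        match ab with
        | Imp (a', b) when a' = a -> Thm b
        | _ -> invalid_arg "mp: rule misapplied"
      let concl (Thm f) = f
    end

Proof procedures written on top of such a kernel may be arbitrarily
clever or arbitrarily buggy; either way, every theorem they produce
has passed through the rules.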

Including proof tactics and things like
that?

Yes. The intellectual challenge there was: how do
you have a believable proof assistant, how do you structure the ways
in which you go about finding proofs, defining bigger strategies in
terms of smaller strategies and so on. That developed, but mostly
through the medium of other systems, like Mike Gordon's system here
in Cambridge. He designed his HOL (Higher Order Logic) system along
the principles of LCF. That was very nice. I was busy with that in
the second half of the '70s very much, but at the same time developing
my ideas on concurrency, and CCS came up about 1980, 1979.
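The flavour of those strategies can be suggested in a few lines of
OCaml (illustrative names; real LCF tactics also return a validation
function that reassembles the proof, which is omitted here):

    (* A tactic reduces a goal to a list of subgoals, or fails. *)
    type goal = string                 (* placeholder for a real goal type *)
    type tactic = goal -> goal list

    exception Tactic_failure

    (* Try t1; if it fails, try t2 instead. *)
    let orelse (t1 : tactic) (t2 : tactic) : tactic =
      fun g -> try t1 g with Tactic_failure -> t2 g

    (* Apply t1, then apply t2 to every subgoal it produced. *)
    let then_ (t1 : tactic) (t2 : tactic) : tactic =
      fun g -> List.concat_map t2 (t1 g)

    (* Apply t as long as it succeeds. *)
    let rec repeat (t : tactic) : tactic =
      fun g -> (orelse (then_ t (repeat t)) (fun g' -> [g'])) g

Bigger strategies are simply compositions of smaller ones.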

Before we talk more about CCS, can you say a few
words about ML, one of the big consequences of your work on proof
assistants. When did you realise that you had a general purpose
language on your hands that might be useful in all sorts of other
contexts?

That was very interesting because we didn't really
try to do that. That was not entirely due to our insights. It was
partly due to Luca Cardelli who now works at Microsoft Labs here.
He was a student at that time of Gordon Plotkin's, and he took our
implementation of ML and implemented it on a faster machine, because
he wanted to process some formulae for his thesis, as far as I
remember, which was on concurrency. In any case, being a fast
implementer, within a few weeks he implemented ML on a different
machine and that happened to be the machine on which we could teach
the students. And somebody else -- I can't remember who it was --
decided that they would teach ML to the second year undergraduates
on that machine. Through this sequence of events, we began to realise
that we did have something that was ready to be used generally. In
fact people began to produce different dialects of ML -- I forget what
the exact sequence of events was; that was all through the '80s, for
different machine-assisted proof systems, including the French:
CAML came about at that time. The French, led by Gérard Huet, were
using it for machine-assisted proof as well.

Eventually, about 1983, we thought we would pull
together the different dialects and make one language out of
them. That's how Standard ML evolved. It took about 7 years, from '83
to '90. In the end it looks like a lot of hard work! What we got in
the end was a formally defined programming language, which is used a
reasonable amount; and it stands up as a paradigm of how you might
define a language rigorously. At the same time it is a working
language. It stands well in that way. But it's not used worldwide as
much as one might hope or expect. I think that's because it is very
difficult to evolve a language at the same time as evolving its
semantics. Languages do evolve, in ways which don't sit easily with
formal semantics. We haven't discovered how to evolve semantics.

What do you mean by evolving programming
languages or semantics? Can you illustrate this with an example?

Yes. I'm not sure this is the best example, but
think of the type system of ML: there is no particular respect paid to
record handling. There have been a lot of theories about types for
record handling, very important type systems for record handling, but
it's not in there, in ML. The reason it's not in there is because we
didn't understand how to prove the theorems and how to get it right.
It would have been nice if ML could have evolved in the direction of a
type system for records; but making sure that these theorems hold,
making sure that "well-typed programs don't go wrong", was
difficult to do. It is a tremendous amount of work getting the
semantics and proving theorems about it anyway. It is rather
non-modular: if you change one bit, you tend to change other bits
elsewhere. Adapting a theorem, or checking that it still holds, is a
big task. That's not a very good example, but I'm sure there are
others I could think of if I had enough time.
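For a taste of what such record typing looks like where it does
exist: in OCaml (an illustration, not a feature of Standard ML),
object types carry a row variable '..' standing for "whatever other
fields there are", which is just the kind of polymorphic record
access at issue:

    (* get_x works on any object with a method x; its inferred type is
       < x : 'a; .. > -> 'a, where .. is a row variable. *)
    let get_x o = o#x

    let p = object method x = 1 method y = 2 end
    let q = object method x = 3 method colour = "red" end
    let pair = (get_x p, get_x q)   (* evaluates to (1, 3) *)

Proving that a full type system with rows keeps the "well-typed
programs don't go wrong" theorem is exactly the hard work mentioned
above.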

If you look at something like OCaml, which I
think is currently the most advanced dialect of ML, it is the first
such language that is pragmatically viable. You can use it to do most
things other than low-level programming. Are you surprised at what it
looks like, what's in it, what's omitted? Is it how you expected it to
be? For example you have modules and objects, although these two
concepts overlap significantly.

I think that's a wonderful story of how a language
can evolve and still remain very robust. But carrying rigorous
semantics through that evolution would be very difficult and that is
something we will have to solve for future languages. I think Caml is
a terrific success because it shows that a language which began as a
theoretical undertaking can nevertheless achieve what any engineering
language can also achieve.

But the most successful language of all, C, came
about because someone wanted to write computer games on some spare
machine, so it's not always through engineering efforts that languages
succeed.

That's true. But it's also not through semantic
efforts that good languages come about. We have to resolve something
there and I think that's very relevant to concurrency languages.

When did you stop being involved with the ML
effort?

I went on being involved. We produced the formal
semantics in 1990. I was very much involved up to that. We then
revised it in '97. That was a lot of work, but certainly not
full-time, modifying the semantics.

By that time my main effort was definitely in
concurrency, trying to understand concurrency intellectually. I'm
beginning to regard it as more of a modelling exercise than a language
exercise. We are modelling what happens in the world, like Petri
modelling office processes, if you like. We are not looking for the
smallest set of primitives that can make sense of computation. In
fact, we are in a state of tension: we are looking for a small set of
primitives, but they have to fit well with what goes on, not only
microscopically, but also macroscopically, with what goes on in the
world.

That was to me the challenge: picking
communicational primitives which could be understood in systems at a
reasonably high level as well as in the way these systems are
implemented at a very low level. To some extent I think we have
succeeded. In CCS and CSP the communicational primitives are robust
with change of level of abstraction to some extent. But still, the
emphasis ought to be on modelling what happens in real systems,
whether they are human-made systems like operating systems, or
whether they exist already. There's a subtle change from the
Turing-like question of what are the fundamental, smallest sets of
primitives that you can find to understand computation. I think
some of that applies in concurrency, like naming: what is the smallest
set of primitives for naming? So some of that applies. But as we move
towards mobility, understanding systems that move about globally, you
need to commit yourself to a richer set of semantic primitives. I
think we are in a terrific tension between (a) finding a small set of
primitives and (b) modelling the real world accurately.

Can you describe the development from CCS to Pi
in its historical timeline? What was your first concurrency formalism
and how did you change it? What didn't work? How was it perceived?

I think it's an interesting story. I was working
with what I called Behaviour Algebra in 1978. The set of primitives
included parallel composition, prefixing and those things. I wanted to
make this small set of primitives do everything or do as much as
possible, and to understand them semantically, preferably in Domain
Theory; but that didn't work. The idea of labelled transition systems
became very important because it was a replacement for what didn't
work, which was encoding things into Domain Theory. Because encoding
things into Domain Theory, the equivalences became either too rich or
too poor, I mean too big or too small. They didn't hit what I regarded
as the correct notion of behavioural equivalence. There seemed to be
something inescapably missing in Domain Theory in respect to this. So
labelled transition systems -- following on from Gordon Plotkin's work
on operational semantics and bringing the idea of labels in --
became central.
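For reference, the structure in question is simple to state (the
standard definition, not a quotation): a labelled transition system
is a triple (S, L, ->) where S is a set of states, L a set of labels,
and -> a subset of S x L x S; one writes P --a--> P' when (P, a, P')
is in ->. The labels are what an observer, or another system, can
engage with.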

But automata already had labels!

Yes, that's true, they did, and Keller had
labelled transition systems. The idea of a label being, as it were,
a vehicle of interaction became totally important. So it's hard to
know where it came from: Keller with his labelled transition systems,
or automata theory? Making it a shared action was already
present in Petri to some extent. But doing it algebraically and
compositionally, that was an important thing. When I gave lectures in
Aarhus in '79, that was essentially how the book on CCS evolved. What
you are asking about is how it developed. What people perhaps don't
know is that I was talking to Mogens Nielsen there and we were really
trying to make the Pi-calculus work at that time. I remember vividly
discussions in his office in Aarhus, and we couldn't make it work. So
CCS became the Pi-calculus without communication of channels.

You were aware that CCS had shortcomings? You
didn't think you had hit on the right formalism and later changed your
mind.

Absolutely not. But it seemed to me -- and later all
the more -- that it was good not to go the whole way, because there
was a certain nice, manageable modesty about concurrency in CCS --
same thing with CSP, really. You have two levels: you have data and
then you have that data being moved around, but you don't have the one
feeding on the other. You like to move the values around, but not
to move the means of movement. It was very important, I think, to
see concurrent calculi without movement. Because then you could see
how much could be done without movement. And the answer is: a
tremendous amount could be done. The CSP people have shown that. A
tremendous number of systems can be handled. And then you could begin
to see just where the barrier lay, what things you couldn't
model. Anyway, whatever justification you could have for inventing
something without mobility, Mogens Nielsen and I tried to get it to
work for the label passing, but we didn't succeed. So then CCS became
what it was.

Was there a theorem that you proved that made
you think "OK, CCS is definitely interesting enough"? I admire that
you said "CCS is not quite right but it is still already very
interesting and we can use it to understand a lot of things". I would
have been unhappy with it and thrown it away. Any particular key
event, that made you think "wow, this CCS is jolly interesting, let's
run with it"?

Two key events: one is being able to prove that
behavioural equivalences were actually congruences. That wasn't
easy. I had been working on that previously. Getting the idea that
they were congruences made you feel that you were getting at some
new kind of essence, because then you could think of the congruence
classes as the "real things" that you were talking about. These were
the denotees of your semantics. That it was a congruence was important
because the primitives seemed to be right in themselves, rather than
just an elegant set. If they had resisted this congruence proof that would
have been awful. That would have been the end of it.

The other thing was -- and I think probably this
happened not immediately, probably roundabout 1982 -- when we
discovered this logic ...

Hennessy-Milner Logic?

Yes. You could represent bisimilarity, which of
course came about from talking to David Park, and previously was
called "observational equivalence". Of course it was the same
relation, almost the same relation, but of course the bisimilarity
technology was terribly important. So, whichever one of them it was,
it turned out to be captured by the Hennessy-Milner Logic. The fact
that you could write specifications or you could have a very very
simple logic which captures that relation exactly ...

It is an infinitary logic, so in some sense it
is not simple.

Yes, it's an infinitary logic, it has summation,
infinite sums, but if you cut down, you get quite a nice finitary
logic, if you do certain things to CCS. It is not entirely tidy, I
agree, but it was close enough. I remember saying to some of the
Petri-Net people "look, because we now have a logic that matches the
behavioural equivalence, isn't that interesting"? I found they didn't
respond very well to that. In any case, to me it was one of the
events, which said -- bar getting a few things right and wondering
about this infinitary notion -- that we were on the right track.
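For reference, the logic (in its standard presentation, not quoted
from the interview) has formulae

    F ::= /\_{i in I} F_i  |  not F  |  <a> F

where <a>F holds of P if some transition P --a--> P' reaches a P'
satisfying F. The characterisation theorem states that P and Q are
bisimilar exactly when they satisfy the same formulae; for
image-finite processes the indexing sets I can be taken finite, which
is the "cutting down" mentioned above.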

One thing that strikes me about
process-calculists is their obsession with equivalences and
congruences. Other communities don't care about these things and they
don't understand why we do and we don't understand why they don't. If
you agree with my assessment, where does your obsession with these
things, your aesthetic feeling that they are important, stem from? A
working programmer never deals with truly compositional systems, they
always have the operating system in the background. Do you think it's
to do with your mathematical education? Because you have been exposed
to algebra?

Maybe. But also, in denotational semantics you have
the idea that you understand the meaning of a program and that the
meaning is a function of some kind.

That appears to be a mathematician's way of
thinking.

Oh sure, absolutely. But you also believe that by
doing it mathematically, you provide a firm foundation which will
enable people to go further, though they don't have any wish to know
it.

Could it not be the case that the world of
computation is not compositional?

Yes, but you have to push these things very
hard. Somebody will say to you: "of course it's not compositional,
look at the operating system" and then you can say, OK, wait a minute,
we have to make the operating system one of the participants, one of
the agents in this population of agents that are interacting. We
won't achieve compositionality until we've done that, until we make
explicit that agency, until we recognise all the agents that are
there. We must ask ourselves, how are they interacting? Is there a
sense in which a single program interacts with the operating system?
All of those things need to be tackled. It's almost as though we have
to prove that we can be compositional. Nobody will pay attention
until we have. Eventually people want, or I want them to want, to
be able to talk about a process in the same way that they talk about a
function. That's why I'm not interested in short-term gains!

About equivalences: you were originally thinking
in terms of weak traces? And through David Park you were led to
bisimulation?

Oh no, definitely not. The original book on CCS,
in 1980, has something called observation equivalence and it has
something called strong equivalence. Strong equivalence, although it
was not defined in terms of maximal fixpoints, coinductively, turns
out to coincide with bisimilarity. That's because the transitions
are image-finite. So strong equivalence coincides
exactly, as it turns out, with strong bisimilarity. But we missed
the coinductive proof technique.

Weak observation equivalence turned out not to
coincide quite, because it was defined as the limit of an omega-chain,
each member slightly finer than the previous. It turns out that we
were wrong to think the maximal fixpoint would be reached as the limit
of a decreasing omega-chain. It has to go to a higher ordinal. Apart
from that difference, and that difference shows up only in quite a
sophisticated way, we already had weak bisimilarity, but not of course
the bisimulation proof technique, which is so important.
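In the formulation that became standard (again, notation rather than
quotation): a relation R is a strong bisimulation if, whenever P R Q,

    P --a--> P'  implies  Q --a--> Q' for some Q' with P' R Q',

and symmetrically for the moves of Q. Strong bisimilarity ~ is the
union of all bisimulations -- the maximal fixpoint -- and the proof
technique is to exhibit any one bisimulation containing the pair in
question.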

It was because David Park was visiting Edinburgh, I
think in 1981. It was his sabbatical and he was living in my house,
reading my book and he came down at breakfast time when I was washing
the dishes and said "there is something wrong with this". And then I
said "oh god, what's wrong"? "Well, this isn't a maximal fixpoint"!
And I said "should it be?" or something and then we went for a walk
and the answer was: yes, of course! Not only do we have a coinductive
proof technique with wonderful gains, but we also have coincidence
very nearly with what's already there.

So when we went for the walk, the main topic was:
what should we call this thing? David wanted to call it "mimicry" and
I said "that's difficult to say, let's call it bisimulation". He said
"that's got five syllables" and I said, "it doesn't matter, people
will be happy with it". I named it, but he brought the idea. In fact
it was so close to what I had done. But the proof technique, and the
maths behind it -- it seems to me -- are very much better than what
I had in that first book, where I was proving inductively that things
were ultimately equivalent by proving that they were n-equivalent for
all n; that was an inductive not a coinductive proof.

Who is David Park?

He did work originally on program schemata with
David Luckham and Michael Paterson. They were famous for their work on
program schemata. This was before Scott, so they were looking at the
semantics of imperative programs, and looking at the decidability of
the equivalence of these things, under all interpretations of the
function symbols, and finding some very beautiful results. He got
his PhD at MIT, but he was English. Anyway, he was a great friend of
mine. We knew each other when I was in Swansea. Before we came
together again on this concurrency stuff, he had worked with
mu-calculus and maximal fixpoints. So he came prepared with the
maximal fixpoint view.

People knew and cared about maximal fixpoints
before bisimulations?

Oh yes, very definitely. Maximal and minimal
fixpoints. David was a great expert in the extraordinary richness that
you get when you have a maximal and a minimal fixpoint operator in
something like the mu-calculus. So he brought all that knowledge to
bear. To him it stuck out very strongly what I was doing wrong in
CCS. Essentially he came along with this thing and it fitted in. When
I wrote the book in 1990, I tried to tell the story about how this
fitted in and how important it was.

Of course he died quite early, around 1990. It was
important for me that he would get just recognition for this. CCS was
already designed, but this particular bisimilarity technique seemed to
be very important, at least for those of us who believe in equivalence
and congruence. The reason that that's so important is that by doing
that we are getting at some kind of denotation of processes.

Even now I am talking with Tony Hoare, who is much
more interested in the idea of what it means for a program to meet
its specification. We are now trying to reconcile the CCS approach
that regards denotations as congruence classes, and the CSP approach,
which talks about set-theoretic denotations such as failures and traces
and so on, and has very nice ordering relations, so that if the
specification is larger than the implementation it means that the
implementation is correct. The ordering notion between processes is
of course the other main important thing in process calculi. I think
we want both; we want the notion of denotations, and perhaps they are
congruence classes, and we want the orderings or preorderings
representing improvement of, or refinement from, a
specification.

What is the inverse of these refinement
relations? I suspect it is related to transactions?

I suppose there is not a symmetry here. If somebody
states a specification, then what they are saying is: I want a program
to satisfy this and there would be many different programs that
satisfy it. To produce any one program that satisfies this is
satisfactory. Given that program, you could refine it further and it
would of course still satisfy the original specification because that
is the refinement ordering. But coarsening it might mean that it then
would not satisfy that specification that it originally did, because
you would coarsen it in a different direction. So coarsening doesn't
have the same pragmatic interest. I think that's a fair way of putting
it.

Of course one is often not interested in the
execution details. One sometimes wants to see a large chunk of
commands as an atomic blob, one wants to forget most of the
details.

Maybe you are talking about a different abstract
machine, modelling at a different level of abstraction.

Yes, but isn't refinement the inverse to that?
You have a command that says "send message from x to y" and when you
refine that, it becomes "send message from x to a, from a to b, from b
to c and from c to y" ...

I think that's a different kind of refinement. Say
you state a specification in logic. You say that all communication
must satisfy the following formulae. That allows a whole space of
programs to satisfy that. But they will all be talking about
communications at a particular level of abstraction. You've got the
idea of different levels of abstraction as orthogonal to the
question of refinement. Maybe some people will call abstraction
refinement, maybe there's confusion ...

Probably me who confuses things!

But I think everybody is confused about that.
Different levels of abstraction is not the same thing as refining from
a specification to an implementation.

You talked to Mogens Nielsen in Aarhus and you
published your CCS book. What came next? You were still pushing
towards mobility I guess? How did the transition to Pi come about? It
seems to me that the key steps were collapsing everything into names
and finding the labelled transitions that make this work. Is that
correct?

Yes. You know the paper by Nielsen and Engberg?

I have seen it being cited. I have not
personally read it.

It is in "Proof, Language and interaction" (MIT
Press), a collection of essays written for me in 2000, edited by
Gordon Plotkin, Colin Stirling and Mads Tofte. I can show it to you.
Nielsen and Engberg, in '86, wrote a paper called "A Calculus of
Communicating Systems with Label Passing". The point is that they got
over one of the difficulties that Mogens and I had found. So they
contributed a substantial step towards the Pi-calculus in that
paper. They never published it at the time; it has now been published
in that book of essays. We (Joachim Parrow, David Walker and I) cited
their original report in our first paper on the Pi-Calculus. And we
put some kind of summary of what we thought they had contributed and
what we had added to it to make the Pi-Calculus. The interesting thing
is, as I remember talking to Mogens back in 1980, one of the things
that didn't fit was the CCS renaming operation, which is subtly
different from substitution of new names for existing names.

Because renaming can be infinitary?

No. It's the fact that applying a renaming operator
at the outside of a process is different from doing a syntactic
substitution of names throughout that process. As I remember, Mogens
and I didn't succeed in making the label passing work, and one of
the reasons was because of that renaming operator. Now if you look at
Engberg and Nielsen's ECCS, Extended CCS they call it, that operator
is no longer present; they explicitly omit it.

There may have been other reasons; I don't remember
why label-passing didn't work out in my talks with Mogens. In any
case we didn't make this big step towards the Pi-Calculus that he and
Uffe made later. Then, knowing that, Joachim Parrow, David Walker and
I worked very hard to try and get only one kind of name. Mogens and
Uffe had various kinds of name. We did quite a lot to simplify it
down. We experimented with the idea of only bound names in
messages. We tried all sorts of different things to make sure that
we weren't missing a trick. It took us about three years from about
'86, '87, to '88, '89 to get it straight. It was a matter of not only
cutting things out, but making sure that you couldn't cut any more
out, since we wanted it to be as close to definitive as we could. That
was an interesting process and I have kept a heap of memos that we
all wrote. It always takes experimenting with different possibilities
and there do seem to be quite a number of possibilities.

Did you have, as one of the possibilities, what
we now call the asynchronous Pi-Calculus, where you don't have output
prefixing and sums?

I think we wanted the sum because it gives you
normal forms, and it gave us the algebra for CCS. We were reluctant
to do without that. It seemed to me that keeping the sum, although
perhaps not utterly necessary, gave you simpler applications, simpler
illustrations. And it was in the tradition of CCS anyway. It didn't
seem to be safe to leave it out. I'm glad that people have done all
sorts of things since, indicating exactly when we need it and what it
does to axioms.

I find the story of the sum quite fascinating
because although these calculi appear to just have one computational
operation, data exchange (names for Pi), if you have unrestricted
sums, there's a second kind of silent communication that communicates
which summand is chosen. That is also what ultimately leads to
unrestricted sum being computationally more expressive. I always
wondered if it had occurred to you at the time that this additional
communication was happening.

The way I thought about it in CCS (never mind the
Pi-Calculus, because the problem arises just as much in CCS,
particularly the fact that weak bisimulation is not a congruence until
you take care where summation is allowed) is this: It appears to me
that summation is like the superposition of states and that
observation causes the resolution of a sum into one or another of
its states. For that reason it is a much more esoteric combinator than
parallel composition.
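The standard illustration of that care (an example, not from the
interview): weak bisimilarity identifies a and τ.a, yet summation
tells them apart,

    a ≈ τ.a    but    a + b ≉ τ.a + b,

because τ.a + b can silently commit to a, discarding b, while a + b
can never discard b without an observable action.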

Of course input is a form of sum, just slightly
more well-behaved.

Yes, slightly more well-behaved. Yes, I think that's
right. I agree that there's some overlap here between summation,
parallel composition and input and so on. I don't know whether that's
fully resolved yet. There are problems still around, but people are
able to discuss them now in the context of the full Pi-Calculus. I
think it's good to go on doing that. And I must admit that I gain more
insight now that I look at graphical models, because the way summation
works in bigraphs is quite unexpected. It can mimic the CCS form of
sum really quite closely. The Pi-Calculus is a step towards more
spatially conscious models with regions and something more, almost
geometric, where we might get more insight into what summation does. I
think the Pi-Calculus benefits from some kind of graphical story being
told. Maybe we shouldn't go too far into that just now.

Who managed to find the labelled transitions
that pass scope and when? It seems to me that that must have been a
breakthrough.

I suppose so. The point is that it was always going
to do that, if it worked. Even when Mogens and I discussed it in the
first instance, that possibility was around. I don't think that was a
late discovery. I think that was an inevitable consequence of passing
labels and doing it in the freest possible way.

But in CCS labels don't have any internal
structure. That is very different in the Pi-Calculus. I don't know if
the Pi-Calculus was the first calculus to have a rich structure in the
labels ....

Do not confuse transition-labels, which have
structure, with names which don't have structure. I think the
Pi-Calculus was the first calculus whose labels have an almost
embarrassing structure. The fact that you had to have restriction as
part of the labels, that was very worrying. In fact that's what led
me later to look at these labels as contexts, because it seems to me
that there's got to be a story about when you need more structure in
the labels. It's almost as though we were very lucky in CCS that we
didn't need any extra structure. We had a little bit of structure: we
had the tau operation. That was all. The advent of the Pi-Calculus
indicated that more work had to be done on transition systems in
general to see exactly when labels should have structure. I guess we
still haven't got the answer.
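A concrete instance (standard Pi-calculus notation, writing a<b> for
output of b on a): a process that sends a private name has a
transition whose label itself carries the restriction,

    (νb) a<b>.P  --(νb) a<b>-->  P     (b distinct from a),

the so-called bound output, which extrudes the scope of b to whoever
receives it. It is this ν inside the label that CCS never needed.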

Another thing that strikes me is that most
formalisms have reductions rather than labelled transitions. Now, with
chemical semantics, the presentation of the Pi-Calculus is much much
simpler than with labelled semantics. Yet the labelled transitions
came first. Why?

Oh, that's easy: you can't do behavioural analysis
with the chemical semantics. By the way, chemical semantics is
terrific: it doesn't supply labels, but it supplies structural
congruence -- in the original form in fact. But I think the strength
of the labels is that you get the chance of congruential behaviour,
because the labels encapsulate what it is that an agent contributes
to a reaction; not just whether it can react, but what reactions it
could conceivably take part in, if only somebody else would do
something.

It is a minimal representation of that!

A minimal representation of all the interactions it
could conceivably take part in and what it could contribute to
them. That's the intuition of why they supply congruences. That to me
is very simple. You need this notion of what an agent contributes.

Labelled semantics does two things: it gives the
semantics of the raw processes, the computational steps and it helps
you reason about the congruences. Reduction semantics just does the
first thing and then you have some horrid rules about congruences and
you need some nice tools to reason about this.

I agree.
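For comparison, the reduction-style presentation rests on one
communication rule (standard notation, again with a<b> for output):

    a(x).P | a<b>.Q  -->  P{b/x} | Q

closed under parallel contexts, restriction and structural
congruence. Nothing in it says what a process standing alone could
contribute to a reaction; that is precisely what the labels add back.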

The concept of names and naming is very
important in your work. When did it occur to you that this is a
fundamental notion in computing? That you have these points where you
can interact, that you can hide. Did you have a specific moment of
revelation or is it the result of years of research?

I don't want to be wise after the event, but I think
it was when we found out that you could encode data as processes with
name passing.

That was when the Pi-Calculus already existed?

I'm not sure, you see. I think we probably felt that
we were going to be able to do this. That made us confident that you
can get all the data, not by means of other kinds of objects, but as
processes. And the way you access the data is via interaction and the
interaction is via names. I don't know at which point it would have
happened; possibly in the memos we wrote (and kept) over a period of
about two or three years, we may have something that suggests that
it's going to be OK just to cut down to names, because we were going
to be able to get data from processes. I think it was quite an early
thing, because I don't think we would have wanted to put the
Pi-Calculus forward without knowing that there was going to be a
story about data. So although we did the full story about data later,
we did say something about it in our first paper.
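
[A standard sketch of this encoding, roughly as in Milner's polyadic
Pi-Calculus tutorial: a boolean located at a name b is just a process,

  True(b)  = b(t,f).t<>        False(b) = b(t,f).f<>

and a client branches on the value by inventing two fresh names and
calling b:

  Test(b) = (new t,f)( b<t,f>.( t().P + f().Q ) )

The datum is accessed only by interaction, and the interaction is
only via names.]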

Did it ever worry you that your names were pure,
had no internal structure, whereas all the names in computing
applications heavily rely on internal structure?

No, that didn't worry me because all practical
computing has to build towers of structure in order to get something
useful. What is interesting is the role of matching and mismatching. I
just wrote a little paper called "What's in a name?" in honour of
Roger Needham, who died earlier this year; around 1990 he wrote a
paper on pure names, from the point of view of operating systems. Do
you know that paper? ("Naming", in Distributed Systems (ed. Sape
Mullender), Addison-Wesley 1993).

Yes.

That was written more or less at the time of the
Pi-Calculus. So I thought I would write a paper to see what you could
do with pure names. And I conjectured that the Pi-Calculus is doing
something like all of the things you can do with pure names. You can
create them, you can use them to call, you can use them in what I call
a co-call, so communicating is two things going on: you can call on a
name and you can co-call on a name. Co-call is the negative, call is
the positive, if you like. And you can test names. And synchronised
action is the coming together of a calling and a co-calling. You can
tell a story which says: this is perhaps all you can do with pure
names. Is there anything else -- that's a challenge -- you can do with
pure names? The Pi-Calculus could be regarded as just supplying a
minimal environment to allow you to do all the things with pure names,
that you could imagine doing with them. This paper is about six pages
long and I'm telling a story which other people would recognise; it's
nothing particularly new.
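
[In Pi-Calculus notation the repertoire just described reads, roughly:

  (new x)P    create a fresh name x, private to P
  x(y).P      call on x, receiving a name
  x<y>.P      co-call on x, sending a name
  [x=y]P      test two names for equality

and a synchronised action is precisely a call and a co-call on the
same name coming together.]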

I think that pure names are now seen to occupy a
terribly important place in the foundations of computing. And that's
not just because of the Pi-Calculus, the operating systems people know
about it as well. In that paper I include an example of Roger
Needham's about using composite names in, say, directories. It
shows that composite names, at least in simple cases, can be
represented by a mixture of use of pure names together with matching
and mismatching. So I think that the control structure of the
Pi-Calculus together with pure names actually gives you what you need
for composite names, but it would be hard to prove that.
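
[A hedged sketch of how that representation might go, with
hypothetical names dir, name1, name2, f1, f2: a directory is an agent
holding pure names for its entries, answering lookup calls by
matching,

  Dir(dir) = !dir(req,r).( [req=name1] r<f1> + [req=name2] r<f2> )

so the composite name dir/name1 becomes a call on the pure name dir,
a match, and the passing back of the pure name f1.]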

One of the main uses for composite names is
efficiency, because when you pass a composite name, you communicate
not only the point of interaction but also, in some sense, how to get
there. This brings me to the great schism in the theory of computing
between semantics and complexity. There's currently virtually no
connection between the two. Did you ever try to combine the two? Do
you think that in the future, when the two will have been reconciled,
all the nice mathematical structure that you and your colleagues have
developed will survive in a recognisable form?

I'm beginning to think that it's perfectly OK for
complexity theory and modelling, or semantics, to be two aspects of
computation theory that are not necessarily mutually explicable. In
other words, they're independent. I don't think that we will make a
breakthrough which consists of uniting them, by changing some of the
primitives from one side or the other. Shannon had a quantitative
theory of information, and that does not seem to tell us about
structure at all, about structure of communications. It simply tells
us how few or how short messages can be and that's very exciting, but
I don't see any reason why we should expect these things to come any
closer to each other.

I disagree. For a start there are typing systems
that control very tightly what kind of recursion you can have and that
has ramifications for complexity. And intuitively, as a programmer,
complexity is part of the semantics of a program: if popping an
element from the stack takes n^3 steps where n is the number of
elements on the stack, it's not a stack, it's nonsense. If I'm a
customer, if I pay a million pounds for a program that flies my
airplane, I want a proof: not only that it works correctly but
also that it computes all its results in time. I certainly hope that
the two will eventually be merged, it may just be very hard to get
there.

Maybe I was too strong. Being able to create types
which have implications for complexity is important, but that
doesn't mean that complexity theory is then taken over by
semantics. It's more that we can fit ourselves into certain categories
or complexity classes by various means. I don't know enough about
logics for complexity; descriptive complexity -- involving specialised
logics -- is well advanced. Maybe the logicians will refute what I was
saying. I don't feel that it's high on the priority list for us to
try to do the union. The reason is that we have so much structure
we're trying to model, particularly with the internet and the
populations of machines -- even nano-machines, if you admit
nano-technology -- that we have to try to grasp. It's like algebra,
which has not much to do with complexity: algebra is about structure.
We have to grapple with structure, perhaps as a higher priority.
Perhaps I should
retreat into that position.

You mentioned Shannon's theory of information and
you said that the key primitives are calling and co-calling. To me all
this is
about communication. Communication is the key thing that happens.
What strikes me about these process-theoretic models is that
communication isn't directly captured as an entity with structure. It
is in the formalisms, but there is no algebra of communications. We
can say a name is this or that, a process is this, but there doesn't
seem to be a corresponding account of communication. Have you ever
wondered about this?

My first reaction is: I don't see why there should
be an algebra of something in order for it to be a primary entity. The
clue may be that, since communication is about interaction, in order
to make a full story about communication you need to have some notion
of the structure of actions which sits alongside the names that
represent the contiguities of agents. What process calculi are trying
to do is to say something about the structure of actions as well as
about the elementary interactions. And you can't say much interesting
about the elements of interaction which are just the names, unless you
have structure of actions as well.

I'm agreeing with you: you can't use names without
also having something else. If you're going to have a nice theory,
you've got to have something else around. So what process calculi try
to do is have the minimal amount of extra structure around to get a
good story. I
agree with you, it's puzzling. But that's rather beautiful. You can't
get at the names. They are a little bit like quarks. You can't
separate out the names from the things which are naming.

I'm happy to press ahead with what you may call
phenomenological theory. To look at the phenomenon of names and then
see if we can tell nice stories about names in various different
calculi. And then perhaps we get to know more as we look at the
different calculi. For example it comes up with all the geometrical
calculi, spatial calculi. Names seem to represent the contiguity of
these things. So we may get a better feeling for names from that
slightly different geometrical presentation of actions.

All your published calculi feature
point-to-point communication. Have you thought about other forms like
broadcasting or the more wave-like communications that we seem to be
seeing in physics?

I always thought broadcast would be a derived
operation.

I'm not sure. There are separation results that
seem to suggest otherwise.

If I remember their work correctly, Ene and
Muntean have shown that broadcasting and point-to-point are separated
by some reasonably natural conditions.

That's interesting. It always seemed to me that
point-to-point was a more modular notion, a more controllable
thing. Broadcast seems to require some medium within which your
message is floating even though it isn't accessed. It always seemed to
me important that there should be no place where a message is
floating, unless that place is itself modelled as an agent. So it
seemed to me that point-to-point captured that particular attitude
more directly. But, point-to-point being just two taking part, I did
work with synchronous CCS which had prime names; each agent could
be synchronising several prime names, essentially requiring several
things to occur in a single interaction. That seemed to me to be
rather nice. But somehow synchronous CCS doesn't seem nearly as useful
as CCS.
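
[A sketch of the kind of derivation in question: the medium modelled
explicitly as an agent which receives once and forwards
point-to-point to a fixed set of listeners,

  Medium(m, l1, l2, l3) = !m(v).( l1<v> | l2<v> | l3<v> )

The separation results mentioned above say, roughly, that no encoding
along these lines can satisfy certain natural uniformity conditions,
which is why broadcast is arguably primitive.]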

The other thing is that CSP allows as many agents
as you like to engage in an action: any number of agents can
engage in the same action. CSP has had lots and lots of
applications, but I'm not quite sure how much they have taken
advantage of this possibility of many agents engaging in the same
action. I'm really rather puzzled by all that and I seem to have taken
the path of two participants. Others may take other paths.

Some recent developments take Pi-Calculi into
maybe expected, maybe unexpected directions: I mean modelling
biological interaction on the DNA level in terms of Pi-Calculi. Did
that surprise you or did you always think that your models are more
general, that they don't just talk about conventional computation?

It took me by surprise when I first heard Ehud
Shapiro, of the Weizmann Institute. He came and gave us a talk about
it. Two things struck me. First of all he really needs something
more spatial. So I suggested to him that he use Ambients, which he is
now doing. In fact, Cardelli is now working with Shapiro. Cardelli
has invented the term "Biograph" which represents the application of
Ambients to biological phenomena. What my Bigraphs are trying to do is
to combine Ambients and Pi-Calculus. There is a coming together of
these things. Certainly I was initially surprised and then I realised
that it was perhaps the geometrical thing, the spatial thing, together
with some mobility that really is what the biological people need. I
don't know what combination of these things is best for them. But
certainly I think there's a lot to be gained by looking for it.

You think it will not just be the biologists who
will get a better tool to work with but that we will also gain novel
insights from their modelling efforts?

We must look at what the biologists need and say:
they are using some of what we do perhaps, but is there something that
they want which we are not giving them? Cardelli thinks so. He talked
to me the other day about a new class of calculi where action takes
place on the membrane boundaries themselves, rather than within or
outside the cell. So it may well be that we get very strong insights
from them. What we should be doing is trying to join it all up all the
time; they may get insights from our possible or actual
applications, and we get insights from theirs as well. We must try to keep
the whole thing under control to make sure that we do something which
is as relevant as possible to both without being prolix, in other
words, without creating something which has a lot of bells and
whistles. We are trying to find fundaments. We should always be open
to different applications, in case they can help us focus on a better
fundamental notion.

Something that has always puzzled me is how
domain theory became so dominant as a paradigm. I suspect that it's
because Scott's amazing result tempted a lot of people to work
in the domain-theoretic way, partly because it reuses a lot of
conventional mathematics that people might have been familiar with. I
suspect that in the late '60s there was a lot of work on formalising
semantics and people were going in different directions and then Scott
came along ... Do you agree or disagree with this?

Yes, I think domains, which Scott also called
"Information Systems", are to do with how much information you need in
order to do such and such. Somehow this was very satisfactory,
because it gave you a nice theory of function calls and interaction
and higher-order functions. All of those things could fit together
in the notion of finite quantities of information.

What I didn't understand is why we couldn't
immediately transfer that domain theory to work properly for
concurrency. The first thing that I did in '73, '74 was to try to
represent non-deterministic processes by powerdomains. In fact, I had
talks with Gordon Plotkin about powerdomains and Gordon eventually
produced the right definition of them. By itself, this gave no
account of concurrency. But nonetheless, one could then say "OK, if
I have powerdomains I can represent non-deterministic processes, and
perhaps I can also represent concurrent processes by particular kinds
of domains". So I looked at domains of processes and looked at the
Plotkin powerdomain and it produced an equivalence which was too
strong a congruence. And I looked at the weaker powerdomains, the
Hoare powerdomain and the Smyth powerdomain, and they produced a
congruence that was too weak; I'm certain that these were accurate
statements. What I couldn't do was to get something that was
operationally both sound and complete, that captured what I regarded
as operational equivalence.
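
[For reference, the three orderings on sets A and B of denotations
are, roughly:

  Hoare:   A <= B  iff every a in A lies below some b in B
  Smyth:   A <= B  iff every b in B lies above some a in A
  Plotkin: both conditions at once (the Egli-Milner ordering)

The Plotkin ordering thus induces the finest equivalence and the
other two induce coarser ones, matching the "too strong" and "too
weak" congruences described here.]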

So, without understanding why domains weren't
working I said "well, let's look for something else". That's where the
notion of observational equivalence, or what subsequently turned into
bisimilarity, came from. And I still don't know why domain theory
didn't supply what was needed. Possibly a clue is that domain theory
failed in one other way: it failed to solve the Full Abstraction
Problem for PCF. In fact that seems to have been left to Game
Semantics to do ...

And now Pi!

Yes, but it's the Game Semantics that does it. Pi
can then tell the story of the solution; it's the notion of
interaction, the game-theoretic notion of interaction -- am I not
right? -- that gives the key to the full abstraction theorem.

Our Full Abstraction theorem is entirely in Pi,
no reference to games whatsoever. You take the Pi-Calculus, the
conventional encodings of Lambda-Calculi into Pi and apply a typing
system to root out exactly those processes that could break Full
Abstraction.
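
[For the record, the conventional encoding referred to can be
sketched, in one standard polyadic form (details vary between
presentations):

  [lam x.M](p) = p(x,q).[M](q)
  [x](p)       = x<p>
  [M N](p)     = (new q)( [M](q) | (new x) q<x,p>.!x(r).[N](r) )

where [M](p) is the process representing the term M located at the
name p.]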

The initial papers by Abramsky, Jagadeesan and
Malacaria and by Hyland and Ong, I think, were talking about
games. More recently this was connected to Pi.

Of course. The games people were definitely the
first.

I must say I don't know the story terribly well
since then. What they indicated was something that domain theory
didn't have, which was to do with patterns of interaction. Now
that's precisely what we're into in a big way with concurrency. Maybe
for the full answer we will have to look for this development of game
theory and connect it to the Pi-Calculus. But I'm not sorry to have
left domains behind or at least adopted other ideas for a time,
because we were able to look at the operational phenomenon more
directly. I think that's been more important for a while.

If you look at programming languages that are
really used -- C, C++, Java -- they don't contain much that wasn't
around in the late '60s. Does that surprise you? Does it disappoint
you? People who buy programming languages seem to be happy to ignore
all the lessons they should have learned from semantics. Or hasn't
semantics produced anything really worthwhile? Or is it something you
don't care about?

I do care that languages should be informed by
theories. But they have been partially informed; Ada had
something like the CSP or CCS version of communication, the "alt"
command representing waiting for one of several different
alternatives to happen, so you could see parallel composition
sitting in there behind the scenes if you look at it with a benevolent
eye. So there was an influence, particularly of CSP on Ada. And they
both had an influence on the Lotos specification language.
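
[In CCS notation the construct corresponds to summation: an agent
a.P + b.Q waits for whichever of a or b its environment offers first,
which is exactly the waiting for one of several alternatives that an
"alt" provides.]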

I actually think the best way forward for us now is
to look at concurrent calculi as modelling theories for modelling
interactions, whether they occur inside programs or outside
them. We should try to widen the scene. I'm hoping that things
like ubiquitous computing and nano-technology will make it absolutely
necessary for us to have theories for modelling populations of
interacting agents with different mobile structure, because then we'll
be able to see that modelling is necessary in order to understand
those populations.

Languages should emerge from that. They should be
treated as a part of a modelling theory. Up to now I don't think we
had sufficient incentive to make sure that our languages are close to
scientific models. It's only with the onset of computation as a
global phenomenon that modelling those interactions becomes so
scientifically important that it is bound to have its effect on
programming languages.

I see it as unfortunate that languages haven't been
influenced. For example, type structures could have avoided a lot of
the Year 2000 problem. We know that nothing serious went wrong, but in
the two or three years before that people didn't know that it wasn't
going to go wrong. If they had used types, which had been around for
two decades, they would have been able to know that certain things
couldn't go wrong. The theories were around which would have helped
people to know that certain things couldn't happen, barring physical
accidents or bad implementations of course. We are used to the idea
that theories can sit around and not be used. We just have to accept
it.

Any particular reason, anything about the
semantics community that prevents dissemination of our results? Or
does it just take 20 years to go mainstream?

I think it's only when we move to concurrency that
we have enough to claim that we have a theory of computation which is
independent of mathematical logic or goes beyond what logicians have
studied, what algorithmists have studied. We're only two decades on
from that. We're gaining confidence, so we can say that this could
possibly be a model suitable for the world. We must push that. We must
get into nano-technology, into ubiquity and say: not only must we try
to model these, we must be prepared to develop our models further, so
that they do fit those things; in other words, become actual scientific
models in the sense of what really happens, rather than modelling what
happens in our clean, abstract world.

We were lacking in confidence, I think we are
gaining in confidence. But we have to push it outwards. That can
only happen through modelling.

Can you tell me who your PhD students were and
what they are doing now?

There are twenty-one: nineteen have completed, twenty-one
altogether. Who are they all? George Milne worked on processes before
we ever got around to process calculi, and is a Professor in
Australia. Avra Cohn and Raymond Aubin worked with me on LCF.

That was in Edinburgh?

That was in Edinburgh. Avra married Mike Gordon and
is busy with her family; I last heard of Raymond at Concordia
University, but he may have moved on long ago. Mike Sanderson worked
on recursion in CCS, and now teaches at Essex. Alan Mycroft worked on
ML but especially on abstract interpretation, and is here in
Cambridge.

Luís Damas did polymorphic type inference in ML, and
I believe is in academic computing still in Portugal. Brian Monahan
also worked with LCF; he's now a senior researcher with HP Labs in
Bristol, working in the Trusted Systems Lab. Kim Larsen worked in
concurrency, on a species of higher-order transition system; he now
leads a group in model-checking (Uppaal) at Aalborg. Kevin Mitchell
implemented CCS on an abstract machine with a network of processors;
that's interesting, because he had trouble in expressing just how the
implementation matched the original. (The answer was invented later
by Joachim Parrow: coupled simulation!)

KVS Prasad is at Gothenburg where he designed a
successful calculus of broadcasting systems (CBS). Mads Tofte did
semantics of Standard ML and also provided the theoretical basis for
polymorphic types for references; he is now President of the IT
University in Copenhagen. Faron Moller worked on axioms for
concurrency, and the unique decomposition of processes; he's now a
Professor in Swansea. Dave Berry worked on a generic form of
operational semantics, and now works in the National e-Science Centre
in Edinburgh.

Is that the Berry of the Berry function?

No, that was Gérard Berry at INRIA. Chris Tofts
worked on proof rules for concurrent programs; he later applied CCS to
ant colonies and continues to apply process calculi and logics to
biology. Davide Sangiorgi designed higher-order Pi-Calculus and laid
its theoretical foundation; he's now a professor at Bologna. Peter
Sewell proved that bisimilarity of finite state processes isn't
finitely axiomatisable (these negative results are hard!); he now has
a Royal Society Research Fellowship in Cambridge, running a group on
validating heavy-duty protocols. Dave Turner worked on sorting in the
Pi-Calculus, and designed the abstract machine for Pi which later
(with Benjamin Pierce) led to their definition of Pict.

Alex Mifsud worked on Action Calculi and control
structures, went back to Malta, and recently came to the UK on a
short-term work assignment, but I've temporarily lost track of him.
Jamey Leifer worked with me on the maths of contextual labelled
transition systems, which now underpins Bigraphs; he's currently at
INRIA in Paris. Those are all the people who have finished. Two
people are coming up to finishing; Alistair Turnbull whose topic is
MIN, a linear graphical abstract machine-cum-calculus which can
underpin threaded languages like Java, and Ole Høgh Jensen
(collaborating with me by phone from Aalborg) on Bigraphs, especially
how they model both Pi-Calculus and Ambients.

That's all. Most of them are still in
computing. Some of them are very well known; I'm proud of all of
them. I was already 40 at the time I started, so I didn't have the ten
years of supervising that some people have. Also I never found it very
easy to supervise more than two or three people at once.

What is your style of supervision? Is it
laissez-faire or do you give them theorems to prove?

I need to be closely involved, but I don't give them
theorems to prove. I work with people. I don't feel comfortable unless
I know pretty much what's going on. I don't like it if somebody goes
too far away from what we talked about, because then I don't know how
to judge whether his theorems work or not. That's why I don't want
to supervise more than two or three people at once. But of course I
had some splendid interactions with those people and it is still
continuing.

What were the major institutional obstacles or
helpers in your career? Did you find it easy to get funding? Or was it
easy getting funding twenty years ago?

It was always OK. I never had a grant application
turned down.

I can't say that for myself.

I've almost always had two research associates
working on some grant or other -- though not any longer now. That
dates from 1973, when I started with the LCF work in Edinburgh. I
always had
perhaps a couple of students and a couple of research people with PhDs
already. So I always had a good five or six people working on what I
wanted. I have been very lucky in that way. It makes me sympathise
with people who sometimes get grant applications rejected. There is
a lot of competition. I think it is true that perhaps if they get to
know you they will go on funding you. I started at a time when the
competition perhaps wasn't so great. I didn't have any
difficulties. All my difficulties have been created by myself.

Presumably being a fellow of the Royal Society
and a Turing Award winner might have helped?

I haven't benefited any more since then. I've always
had the funding I wanted. Certainly people accept that I must have
done something interesting. But that doesn't help particularly,
because usually it's people who don't know what it is anyway. I never
seem to have benefited with respect to my immediate colleagues since
then -- they're as critical as they always were!

The other day a physicist described computer
science to me as an auxiliary science, not a real science. When you
started computing really was a new field. Did you find that other more
established fields looked down on your discipline or didn't take it
seriously?

It's not been a problem, except sometimes with
perception within the university. But then Edinburgh was always very
good in this way. It was seen that computing was very
important. Cambridge also. That wasn't too much of a difficulty. I
think the view of other scientists sometimes takes one's breath
away. Physicists clearly don't think of anything we're doing as
science. Probably what they mean is: it can't be a science because we
are always making things. What are we doing science of? We are doing
the science of our own constructions. But I think the boundary
between understanding one's own constructions and understanding the
world is breaking down. You've only got to look at bioinformatics. In
chemistry, in chemical engineering, there are structures which we are
making which deserve to be understood by the same models as we
understand the natural world.

Ultimately physical theories must be able to
describe physicists doing physics, i.e. itself, because physicists and
physical theories are part of the natural world.

That's an interesting thought.

But they are very far away from being able to do
that.

I do think we have some problems. I was
influenced by Herbert Simon. He talked about "The Sciences of the
Artificial". He does point out the difficulty that our science changes
as we do it. We could not conceivably be talking about populations of
interacting agents unless we had the internet and nano-technology
suggesting them to us. We don't have big enough imaginations, or it
wouldn't have been worth starting. It becomes something worth
starting. Once it's worth starting we can see that it might relate to
the natural world, so we can actually use this blurred boundary to be
able to model both, what goes on in the real world and what goes on
with what we make. The internet is a natural phenomenon just as
much as it is something built. Nobody built the internet, in some
sense; in some sense it happened by means of a lot of contributory
activity.

But compared with physicists we have a problem
with verifying our ideas. Of course we can determine whether something
is a correct derivation in ZFC, but are our definitions right? It is a
difficult problem.

Yes, it is.

Is it just aesthetics? What have your scientific
controversies been about?

Say we worry about compositionality. We talk to the
Petri-net people. In the past they didn't represent compositionality.
They said "we can model the whole of a system". Then we said "can you model
the system that I'm building now and then model how that would be
synthesised with some other system"? They said "we come along
afterwards and model the whole thing by a big net". Then the
controversy gets resolved by what works well in practice. Our models
seem to have to be justified by pragmatic considerations. Take
types. They get justified by certain errors that don't happen in
programs.

There does seem to be quite a lot of mileage in
justifying a theory because of how it helps build things. I wish I
could say this has the same purity as verifying a model against the
natural world. What we don't have is repeatable experiments. That's
the real weakness. We have to be able to validate our theories even
though we don't have the luxury of repeating our experiments.

This predicament is shared with many other
sciences: economics, sociology ...

And they are all regarded as second rate equally by
the physicists. I don't know how we can get out of that. Unless again
we do model something in the real world. Possibly in Biology we may be
able to. At the moment Shapiro at the Weizmann Institute is verifying
his models against what is happening in practice. If you equip
BioAmbients or the Pi-Calculus, or whatever, with probabilistic
transitions, then you can actually run the abstract model and compare
the physical happening with it and see whether that really agrees. I
don't know enough to say how much credit to give it, but Shapiro says
he does get agreement. So insofar as we are able to model real live
things we perhaps can validate against the real world.

It is also the case that programming languages
are getting better and programming productivity in them is above what
you would get in C.

Yes, and there's also a chap called Howard Smith, who wrote a
book called "Business Process Management -- The Third Wave". It's the
philosophy behind the language BPML (Business Process Management
Language), designed and implemented by Assaf Arkin, who works for a
small company called Intalio, in association with BPML.org, a
public-domain institution which tries to understand business process
modelling. He based it on the
Pi-Calculus. Assaf Arkin knows a lot about the Pi-Calculus. Howard
Smith and Peter Fingar, who wrote the book together, say that as a
result they get a better model of business processes than before. He
claims, because of Arkin's work, that it does take its ideas from the
Pi-Calculus, and that they have been able to do things they could not
do before. It seems very promising.

I don't know very much about business
processes. Greg Meredith at Microsoft was also in on the beginning
of this. There again you can get validation but not in the pure,
repeatable experiment form. No sooner do we create these things and
put them into business process modelling languages than people will
redesign their business processes, so we are no longer talking about
the same thing. We have to find some way of telling a
story about this as a real validation even though it isn't one of the
pure kind found in physics. I don't know how to do that.

You worked for a computer company and you
probably got involved with Microsoft through the new Cambridge
computer science building. What are your views on collaboration with
industry? Good thing? Bad thing?

I think it's necessary. I don't want to collaborate
on products, but we are quite lucky to have something like Microsoft
Lab with ex-colleagues and people like Tony Hoare and Luca Cardelli
and Cédric Fournet. These people are clearly working on abstract
topics which we share, but they are also working with the product
division, probably doing things that we don't know about as well.

There is some sense of ideas flowing along that
chain, from the product people in Seattle, say, through these
researchers here, to me and to our students and going back again. We
don't always know what goes on further down the chain, but we are
able to talk to people here. Fournet is doing some very interesting
things, only half of which I know, but we are still able to talk about
the ideas behind them.

Similarly with Cardelli, I don't know how much
they are applying his work on Biographs, but he certainly has
relationships with the program division that I don't know about. On
the whole I'm happy with that. I don't think I would be happy to be
closer to industry. I think there are long-term ideas about what is
technically viable that you can only work out if you have time to
spend, if you haven't got industrial pressure on you. I would rather
be an academic so that there is no question of having to survive that
kind of pressure.

What do you make of the increasing patenting of
ideas in computing?

It's terrifying. It's ridiculous and terrifying at
the same time.

I agree. I wouldn't be surprised if there was a
patent application for some minor variant of the Pi-Calculus or some
typing system. Is that something you discuss with your peers when you
have your government advisory meetings?

I haven't done recently. I am aware that people
ought to be doing this. I signed a petition the other day which went
on behalf of several European academics to the European
Parliament. But I must admit that we are interested in ideas that can
be patented and that enormous damage could be done by an invention,
for example in nano-technology, that could conceal some kind of
misunderstanding. I think we should fight, but I can't claim to have
done much fighting myself.

Why is the Pi-Calculus called Pi-Calculus?

Very good question. You start from Lambda,
Mu-Calculus already exists, you go through the Greek alphabet. Nu was
not very easy to pronounce and anyway it has another perfectly good
use now, so it was good to avoid that. Xi, I think, comes next: I
wouldn't want to call something the Xi-Calculus because it's difficult
to say. Omicron is the next one and that's no good, because that's
just the composition symbol. And Pi comes next.

But it already has another meaning.

It does, that's true. But it also stands for
permutations in another part of mathematics. I think we are
sufficiently distant.

It's not "pi" for process?

It is "pi" for process and "pi" for pointer and
parallel and a number of things. The point is that there's no really
good reason. It did occur to me that it is something very easy to
say. Turns out that people do find it easy to say. I didn't want to
use an acronym, like CCS. I wanted to use something that is a pure
name. It is a pure name. It even sounds pure.

Does it lead to confusion sometimes?

No. It's only things which are trying to mean
something, but mean different things to different people, that cause
confusion.

Grand Challenges -- that is my last question: why,
and what is it about?

You're talking about the Grand Challenges Exercise
initiated by the UK Computing Research Committee (UKCRC). It seemed
that we ought to express the scientific aspiration of the
computer science community. We should make clear that we aspire not
only to make things work but to have a science, an academic body of
knowledge. The first thing about a Grand Challenge is that it is
supposed to represent a bold aspiration that is purely scientific;
not on the back of some new funding initiative, or thought up
because it would be easy to find practical applications within three
or four years. It is trying to look beyond all that and represent
our academic interest.

You can ask how to translate these nice notions, of
being independent of funding and representing our academic interests,
into something more specific. And then you can say we don't know what
our aspirations are. So why don't we ask people to say what their
aspirations are? That's what we are doing, but in such a way that,
when they have said it and the community gets behind them, you
create something which has enough credibility that people are prepared
to follow it through. For example the community will create
workshops in order to discuss how to address those problems. In other
words, the community will feel encouraged to plan its future towards
those objectives. And, incidentally, the Research Council might fund
workshops that try to design projects to approach the
objectives.

That's what we are trying to do. Of course
these aspirations won't stand still. Our steady state should be a
slowly rolling menu of aspirations, each of which is the focus of a
community. Of course, this is not just a UK community; the Grand
Challenge Exercise is a way to maximise the part that UK communities
play in truly international research.