Understanding Distributed Systems

Bill Venners: How can you visualize and understand the complexity of
distributed systems? For example, in enterprise systems today it is often hard to turn
something off, because you don't know what's been connected to it over the years.
Understanding the complexity of one big application seems like a hard problem, but more
manageable than understanding a...

James Gosling: ...sea of things. Boy, there are a whole bunch of PhD
theses ready to be had about that topic. It is really hard. For example, in the Web services
model, you may publish a service descriptor that says, "Hi, I'm a service. This is what I
take. Talk to me." Eventually, something needs to change in that service, or maybe the service
has to go away for some reason. If you need to track down the dependencies in a large-scale
system where dependencies get established on a completely dynamic and ad hoc
basis, there's nothing that's as good as just maintaining a log of who has ever talked to
you.
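
As an illustration of the logging idea Gosling describes (not code from the interview), here is a minimal Python sketch. The names CallerLog, PriceService, and the caller identifiers are all hypothetical; a real service would need some trustworthy way to identify its callers, which this sketch simply assumes.

```python
from datetime import datetime, timezone


class CallerLog:
    """Remembers every client that has ever called a service (hypothetical helper)."""

    def __init__(self):
        # caller id -> [first_seen, last_seen, call_count]
        self._seen = {}

    def record(self, caller_id, when=None):
        """Note one call from caller_id, creating or updating its entry."""
        when = when or datetime.now(timezone.utc)
        entry = self._seen.get(caller_id)
        if entry is None:
            self._seen[caller_id] = [when, when, 1]
        else:
            entry[1] = when
            entry[2] += 1

    def known_callers(self):
        """Everyone who has ever talked to the service, in sorted order."""
        return sorted(self._seen)

    def call_count(self, caller_id):
        entry = self._seen.get(caller_id)
        return entry[2] if entry else 0


class PriceService:
    """A stand-in service that logs each caller before answering."""

    def __init__(self, log):
        self._log = log

    def quote(self, caller_id, item):
        self._log.record(caller_id)
        return {"item": item, "price": 42}  # placeholder response


log = CallerLog()
service = PriceService(log)
service.quote("billing-app", "widget")
service.quote("reporting-job", "widget")
service.quote("billing-app", "gadget")
print(log.known_callers())  # ['billing-app', 'reporting-job']
```

Before changing or retiring the service, its owner could consult known_callers() to see who might be affected. The log only captures callers that have actually shown up, so it is a record of past use rather than a complete map of dependencies.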

In some sense this is kind of a hopeless problem, and maybe that's OK. And I say,
"maybe that's OK," because it really is a deeply difficult problem. For example, look at URLs, which make the Web work. Hypertext wasn't invented with
the Web. Hypertext had been around as a concept for 20 or 30 years. The earliest popular
description of it was a book called Computer Lib, written a long time ago by Ted Nelson.
That book really was about what you could do with hypertext, and he had
a project called Project Xanadu that was trying to do that. But they went off and did
the usual computer science thing, which is to try to solve all the hard problems and make
it perfect.

One of the hard problems is exactly what you were just asking about concerning distributed
systems. You've got a reference to a remote resource. What happens if that remote
resource moves? Should you keep the backtracking information? How do you keep the
backtracking information? Solving that problem is really, really, really hard. Lots of
people went running at that brick wall over, and over, and over again, trying to find a way
to make these large-scale distributed references really work. In the computer science
academic world, it was generally considered that an internet link just wasn't of any
value unless it could handle resource moving and renaming and issues like that.

In some sense, the brilliant thing that Tim Berners-Lee did was simply to say, "I don't
care." For 20 years people had been failing to solve these problems in any large-scale
way. Berners-Lee decided to just do the simple, obvious thing that solved the problem he
needed solved, namely, getting ahold of a resource. And that's actually an easy problem.
Coming up with those names, URLs, is a relatively straightforward thing. He did that,
and that enabled a lot of what the Web is today. But the Web has all these problems.
What happens if a Web page moves or gets deleted? That is exactly the problem of
maintaining or managing the configuration of any large-scale distributed system. On the
one hand, the URL design has made the Web somewhat fragile. Broken links are all over
the place. On the other hand, if they had tried to really solve that problem, the Web never
would have happened, because the problem is just too hard.
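
The trade-off described here can be sketched in a few lines of Python. This is a toy model under stated assumptions, not how any real Web server works: TinyWeb and the example.com URLs are hypothetical, and the "Web" is just an in-memory dictionary. It shows that fetching by name is easy, and that because nothing records who links to a page, moving the page silently breaks existing links.

```python
class TinyWeb:
    """Toy name-to-resource table (hypothetical; stands in for the Web)."""

    def __init__(self):
        self._pages = {}  # url -> page body

    def publish(self, url, body):
        self._pages[url] = body

    def move(self, old_url, new_url):
        # No back-pointers are kept, so pages that link to old_url
        # cannot be found or updated when the resource moves.
        self._pages[new_url] = self._pages.pop(old_url)

    def fetch(self, url):
        # Returns None for a broken link, loosely like a 404.
        return self._pages.get(url)


web = TinyWeb()
web.publish("http://example.com/a", "page A")

link = "http://example.com/a"       # a link stored on some other page
before_move = web.fetch(link)       # "page A"

web.move("http://example.com/a", "http://example.com/b")
after_move = web.fetch(link)        # None: the old link is now broken
relocated = web.fetch("http://example.com/b")  # "page A"
```

Keeping the links working would require move() to know every page that refers to the old URL, which is the backtracking problem described above.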

So philosophically, I really don't know. Dealing with dynamic systems with pieces that
come and go is a really hard problem. There are all kinds of specialized solutions for
specialized situations, but I've never seen anything like a set of general solutions. In some
sense, this particular problem feels like one where unreliability may be a good thing, just
because it makes the whole enterprise possible. Maybe people should just get over it.

Next Week

Come back Monday, November 17 for Part II of a conversation with
Ruby's creator Yukihiro (Matz) Matsumoto. I am now staggering
the publication of several interviews at once, to give the reader
variety.
