Summary
Ken Arnold, the original lead architect of JavaSpaces, talks
with Bill Venners about
the data-driven nature of JavaSpaces, how JavaSpaces facilitates
decoupling,
and why iteration isn't
supported in the JavaSpace interface.

Ken Arnold has done a lot of design in his day. While at Sun Microsystems, Arnold
was one of the original architects of Jini technology and was the original lead architect of
JavaSpaces. Prior to joining Sun, Arnold participated in the original Hewlett-Packard
architectural team that designed CORBA.
While at UC Berkeley, he created the Curses library for terminal-independent screen-oriented programs.
In Part I of this
interview, which is being published in six weekly installments, Arnold explains why
there's no such thing as a perfect design, suggests questions you should ask yourself when
you design, and proposes the radical notion that programmers are people. In Part II,
Arnold discusses the role of taste and arrogance in design, the value of other
people's problems, and the virtue of simplicity.
In Part III,
Arnold discusses the concerns of distributed systems design, including the need to
expect failure, avoid state, and plan for recovery.
In Part IV, Arnold describes the basic idea of a JavaSpace,
explains why fields in entries are public, why entries are passive,
and how decoupling leads to reliability.
In this fifth installment, Arnold talks about
the data-driven nature of JavaSpaces, how JavaSpaces lets you "throw in
a grain and watch it grow,"
and why iteration isn't
supported in the JavaSpace interface.

Type versus State

Bill Venners: In both the Jini lookup service and in
JavaSpaces, you can look up objects by both type and state. For Jini lookups, you can
specify the types of service you want plus the types and values of attribute entries. In
JavaSpace reads and takes, you specify an entry type and field values. How do the kinds of
questions you ask in your query differ when you look up objects by type versus by
state?

Ken Arnold: The difference is function. If you are writing code that talks
to some data source, you can ask it questions. Method calls let you ask the data source for
the information you want, and get the answer back. You can view a particular set of
method calls as a particular socket shape. So you've written code that translates when you
compile it into these method invocations. If those method invocations are all resolved
locally, you would have another class to plug into that socket. With a Jini lookup service,
you actually go on a network and ask: Is there anything on the network that plugs into this
socket? Data is not the point there. It is only methods that matter in that part of the query.
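
To make the "socket shape" idea concrete, here is a minimal sketch of a lookup that matches purely by type. It is not the Jini lookup service API. `TypeRegistry`, `Printer`, `Scanner2D`, and `UpperCasePrinter` are hypothetical names invented for illustration, and the registry lives in memory rather than on a network.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Optional;

// Hypothetical service interfaces: each one is a "socket shape".
interface Printer { String print(String text); }
interface Scanner2D { byte[] scan(); }

// Hypothetical in-memory stand-in for a lookup service.
// It only illustrates matching by type; no data is consulted.
class TypeRegistry {
    private final List<Object> services = new ArrayList<>();

    void register(Object service) { services.add(service); }

    // Returns some registered service that implements the requested type.
    <T> Optional<T> lookup(Class<T> type) {
        for (Object s : services) {
            if (type.isInstance(s)) return Optional.of(type.cast(s));
        }
        return Optional.empty();
    }
}

// One implementation that "plugs into" the Printer socket.
class UpperCasePrinter implements Printer {
    public String print(String text) { return text.toUpperCase(Locale.ROOT); }
}
```

A caller would ask `registry.lookup(Printer.class)` and get back anything that implements `Printer`, without caring what state that object holds.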

JavaSpaces tries to accomplish something rather different. Yes, JavaSpaces is data-driven. It is object-oriented in the sense that the entries have type and you can match
subtypes, and that entry fields can be object types. But at some point, you get past objects
in any system. At some point, you call a method with an integer value. In some languages
that integer is still logically an object, but it doesn't contain other objects. At some point,
you hit the bottom. JavaSpaces is a bottom point in the sense that it is a way to make an
asynchronous method call. You can consider that the entry fields are like the parameters to
the method call. The fact that those are data shouldn't bother you, because that is when
you hit bottom.

Asking Questions of a Space

Bill Venners: How is it that I can take an entry template and
populate it with a few objects, and that forms a question? How do you get from bits and
bytes to high-level conceptual questions?

Ken Arnold: When you query a JavaSpace with an entry containing
some filled-in fields, you are saying, "These are the pieces I care about, please fill in the
rest." When you write an entry, each field is serialized separately. When you do a read or
take with an entry template, each field that you specify is serialized separately. The
JavaSpace compares the template fields with stored entries. For each field in the entry, if
something is specified in the template, the space compares the fields' serialized forms. The
space returns the first stored entry it finds in which all fields specified in the template
match the corresponding field in the entry.
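
The matching rule described above can be sketched in plain Java. This is a toy model, not the real JavaSpace implementation. `ToySpace` and `JobEntry` are hypothetical names, the space is a simple in-memory list, and a `null` template field is assumed to mean "I don't care about this field."

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical entry type: public object-typed fields, public no-arg constructor.
class JobEntry implements Serializable {
    public String kind;
    public Integer priority;
    public JobEntry() {}
    public JobEntry(String kind, Integer priority) { this.kind = kind; this.priority = priority; }
}

// Toy in-memory stand-in for a space (an assumption for illustration only).
class ToySpace {
    private final List<Object> entries = new ArrayList<>();

    void write(Object entry) { entries.add(entry); }

    // Returns the first stored entry that matches the template, or null if none does.
    Object read(Object template) {
        for (Object e : entries) {
            if (matches(template, e)) return e;
        }
        return null;
    }

    static boolean matches(Object template, Object entry) {
        // The entry must be of the template's type or a subtype.
        if (!template.getClass().isInstance(entry)) return false;
        try {
            for (Field f : template.getClass().getFields()) {
                if (Modifier.isStatic(f.getModifiers())) continue;
                Object wanted = f.get(template);
                if (wanted == null) continue; // unspecified field: matches anything
                Object actual = f.get(entry);
                // Specified fields are compared by their serialized forms.
                if (actual == null || !Arrays.equals(serialize(wanted), serialize(actual))) return false;
            }
            return true;
        } catch (IllegalAccessException | IOException ex) {
            throw new IllegalStateException(ex);
        }
    }

    // Serializes one field value on its own, separately from the other fields.
    static byte[] serialize(Object o) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(o);
        }
        return bytes.toByteArray();
    }
}
```

Reading with `new JobEntry("compress", null)` asks the question "give me a compress job of any priority." The filled-in field is the part you care about, and the space fills in the rest.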

Serialization and Private Data

Bill Venners: Somebody once told me he didn't like
serialization because it breaks encapsulation and you can see the private data.

Ken Arnold: I think he misunderstood the purpose of private data.
Most private data doesn't need to be private in order to keep it secret from people
who should never know about it. Most private data needs to be private
because if you let people touch it, they will screw around with it. They'll think they know
the right thing to do with the data. Private data is a way of protecting yourself to allow
future change. What if somebody serializes an object, plays with the resulting bits, and
then deserializes the object to get another object? That is like saying, objects in C++ don't
matter because somebody can aim a pointer at your private data and muck with it.
Although in some abstract logical sense that is true, it is not really the point.

The real point of private data is that you can prevent those people who are not trying
to screw you over, but who are just trying to know too much, from knowing too much. If
someone goes to that much trouble to interfere with your objects' internals, you should
ignore them. Because the value of private data is that you can release a second version and
all the existing client code should work because it doesn't rely on internal implementation
details.

Say you change the internal structure in your product's second version. If somebody's
code breaks because the code has serialized objects—because the person messed with the
private stuff and then reserialized it—I don't know how sympathetic you are likely to be.
It seems like that person has stepped outside the bounds and got what he or she deserved.
So few private things need to be private to be secret. Most of them just need to be private
as a language-enforced way of saying hands off.

Trusting Actors with Your Data

Bill Venners: To me, JavaSpaces has always felt like shared
memory between processes on different hosts.

Ken Arnold: Yes, it is often associated with shared memory.

Bill Venners: But JavaSpaces lets you share objects, not just
data. So JavaSpaces has a weird dual personality. It is about sharing objects, but to a great
extent, it is about sharing data too. And since data is shared, don't I have to trust that
every actor is correct and well behaved? Is a JavaSpace, therefore, appropriate only in
environments in which every actor trusts all the other actors?

Ken Arnold: You can view JavaSpaces as an alternative way to
design distributed systems compared to RPC (remote procedure call) mechanisms. In
either approach, you have to design a set of interactions. You must make tradeoffs about
complexity and trust in those designs. You could set up systems that do or do not detect
someone mucking with the system. Using a JavaSpace, at least one that other people can
access, you probably cannot achieve certain kinds of security, like data security. Can
someone read a particular entry in a space? If someone has access to the space, depending
on the space's security model, he could probably read the entry. But everything has its ups
and downs, and its tradeoffs. You can certainly design algorithms that are robust in the
face of others behaving incorrectly. It had better be possible, because the difference
between a bug and a security hole is intention. At this point, most uses of JavaSpaces live
behind the firewall. Just as a form of entertainment, I am currently writing a poker game
that uses a JavaSpace to communicate between the participants and the decision maker.
One question is: What happens if somebody comes in and screws around with your data?
There are ways you can deal with that.

Ken Arnold: You can describe things in a million ways with all
sorts of metaphors, and each way is partially true. So if you want to think of JavaSpaces
as a manager, you can. The distinction has to do with decoupling sending the request from
receiving the response. If I make a synchronous call to you and ask you to do
something, the job's processing is barricaded behind you. It doesn't mean
other people can't send you requests simultaneously, or that I can't send you another
request simultaneously in a separate thread. But this particular interaction is blocked on
that one call. Whereas with a JavaSpace, if I had 70 things to do, I can write 70 requests
into the space and just wait for the results, because the making of the request and the
receiving of the result are completely separate operations. In a normal RPC-style system,
they are consequent operations. They are serialized one after the other.
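
The separation of "make the request" from "receive the result" can be sketched with two queues standing in for the space. This is only an analogy under stated assumptions: a real JavaSpace matches entries by template rather than handing them out in queue order, and `RequestResultDemo`, `squareAll`, and the `STOP` marker are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Toy stand-in for a space: one queue of requests, one queue of results.
class RequestResultDemo {
    static final int STOP = -1; // hypothetical marker telling a worker to quit

    static List<Integer> squareAll(int count, int workers) {
        BlockingQueue<Integer> requests = new LinkedBlockingQueue<>();
        BlockingQueue<Integer> results = new LinkedBlockingQueue<>();

        // Workers take a request out, process it, and write a result back.
        List<Thread> pool = new ArrayList<>();
        for (int w = 0; w < workers; w++) {
            Thread t = new Thread(() -> {
                try {
                    while (true) {
                        int n = requests.take();
                        if (n == STOP) return;
                        results.put(n * n);
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            t.start();
            pool.add(t);
        }

        List<Integer> collected = new ArrayList<>();
        try {
            // Step 1: write every request without waiting on any of them.
            for (int i = 1; i <= count; i++) requests.put(i);
            // Step 2: as a separate operation, collect results in whatever order they arrive.
            for (int i = 0; i < count; i++) collected.add(results.take());
            // Shut the workers down.
            for (int w = 0; w < workers; w++) requests.put(STOP);
            for (Thread t : pool) t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return collected;
    }
}
```

Calling `RequestResultDemo.squareAll(70, 4)` writes all 70 requests up front and then waits only on results. The caller never blocks on any individual worker, and the number of workers can change without the caller's code changing.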

By "barricaded," the authors Eric Freeman and Susanne Hupfer basically mean that I
make the call to you. If the processing is going to be broken down into partial
pieces, you have to break it down. It is all behind you as far as I am
concerned. My interface to solving the problem is invoking a method on you. And if the
right way to do that is to do a little bit here, a little bit there, and a little bit over there,
then you have to handle that. If instead I write an entry into a space and the processing
can be broken down, an actor can take out the request and write a partial result back.

Today you could decide that those partial processing tasks have to happen in order—
that A happens before B, which happens before C. Tomorrow you may figure out a way
to do the tasks in parallel. So if something is already working on one request's part A, and
a new request comes in, something else can start on the B part without waiting for the A
task to complete. And then tomorrow a new theory can arrive. It is not mediated by my
interaction with someone processing my request. Instead, I am tossing the grain into
the space and watching it grow. It grows in the sense that the response comes back. All
that other stuff is not mediated by anything with which I directly interact. It is mediated by
the request-satisfying process, whichever way the entities processing the request are configured
to solve the problem.

Iterating over a JavaSpace

Bill Venners: Once people start using JavaSpaces, they often
find themselves wanting to iterate over matches. But they can't do that directly via the
JavaSpace interface. Why doesn't the JavaSpace interface
support iteration?

Ken Arnold: Basically the problem is there isn't one iteration model
that satisfies everything. It sounds like there should be and your instinct says there should
be. But you can ask all sorts of questions about ordering and about interaction with
transactions. If someone adds an entry after I start iterating, am I guaranteed to see it?
Can someone remove an entry if I am past it in an iterator? Is there a way to go
backwards? If you take all these factors and put them together—to make up a number off
the top of my head—there are maybe 80 possible iterators. And if I choose one, who is to say
that one is right for you?

Instead, we provide you a set of tools. It is like a RISC (reduced instruction set
computing) instruction set. This is ancient history, but there used to be a system called a
VAX. Some people's hair will turn gray when they hear that. Actually, their hair will already
be gray, but otherwise would turn gray. Anyway, the VAX had the world's largest instruction set as
far as I know. It had, for example, a solve quadratic equation instruction. This
was the reductio ad absurdum of instruction sets, to which RISC was a response. A RISC
instruction set doesn't even necessarily have a divide instruction. How can you live
without a divide instruction? Well, if you combine some three instructions, you get this
kind of divide. If you combine some other five instructions, you get this other kind of divide.

Just because the JavaSpace interface doesn't offer a way to iterate doesn't mean there
aren't legitimate reasons to want to iterate. It is a question of whether the space should be
picking winners or losers. It is a question of whether you can even pick a winner that can
satisfy all people.

Everybody who has said they want iteration thinks they are asking for the same thing. But
when you break it down, different people want different things. Everyone wants iteration,
but if I give you one kind of iterator, it may not satisfy you. It may only satisfy 10 percent
of the people. Do I really want to add a method that only satisfies 10 percent of the
people?

The problem with the VAX instruction set was that people had complaints like, "You didn't
solve the quadratic equation the way I wanted to do it." Or, "You're inefficient in the kind
of quadratic equation that I have." The JavaSpace interface doesn't offer a way to iterate,
but you can create a utility class, where you hand it the space and say, "Iterate!" And that
class is designed based on certain principles. And you could have another utility class that
iterates based on other principles.
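
As a rough sketch of that idea, here are two utility iterators built on the same primitives but based on different principles. Nothing here is the real JavaSpace interface. `MiniSpace`, `DrainingIterator`, and `SnapshotIterator` are hypothetical names, a `Predicate` stands in for an entry template, and the snapshot is not atomic the way a transaction-based version could be.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.Predicate;

// Hypothetical minimal space with only write and take primitives.
class MiniSpace<E> {
    private final List<E> entries = new ArrayList<>();

    synchronized void write(E e) { entries.add(e); }

    // Removes and returns some matching entry, or null if there is none.
    synchronized E takeIfExists(Predicate<E> template) {
        for (Iterator<E> it = entries.iterator(); it.hasNext();) {
            E e = it.next();
            if (template.test(e)) { it.remove(); return e; }
        }
        return null;
    }
}

// Principle 1: destructive and live. Each step takes a match out of the space,
// so entries written after iteration starts are still seen.
class DrainingIterator<E> implements Iterator<E> {
    private final MiniSpace<E> space;
    private final Predicate<E> template;
    private E pending;

    DrainingIterator(MiniSpace<E> space, Predicate<E> template) {
        this.space = space;
        this.template = template;
    }

    public boolean hasNext() {
        if (pending == null) pending = space.takeIfExists(template);
        return pending != null;
    }

    public E next() {
        if (!hasNext()) throw new NoSuchElementException();
        E e = pending;
        pending = null;
        return e;
    }
}

// Principle 2: non-destructive snapshot. All matches are taken up front and
// written back, so entries added later are never seen by this iterator.
class SnapshotIterator<E> implements Iterator<E> {
    private final Iterator<E> copy;

    SnapshotIterator(MiniSpace<E> space, Predicate<E> template) {
        List<E> taken = new ArrayList<>();
        for (E e = space.takeIfExists(template); e != null; e = space.takeIfExists(template)) {
            taken.add(e);
        }
        for (E e : taken) space.write(e); // put everything back
        copy = taken.iterator();
    }

    public boolean hasNext() { return copy.hasNext(); }
    public E next() { return copy.next(); }
}
```

The two classes answer the earlier questions differently: the draining iterator sees entries added after it starts and removes what it visits, while the snapshot iterator does neither. Neither one is "the" iterator, which is the argument for keeping both out of the core interface.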

Iteration is one of those things that sounds grand, like peace. Everybody wants peace,
brotherhood, and love. But how do you get there from here? What do you mean by peace?
Everybody has different visions. Granted, iteration over a JavaSpace is much less
important than peace, but the same issue is there. Iteration is a word that covers a
multitude of sins. It is better—and completely possible—to build iteration structures on
top of a JavaSpace. And then you can provide different APIs for iterators with different
properties.