
Interview with Jim Gray, Manager, Microsoft Bay Area Research Center

Jim Gray won the 1998 Turing Award "for seminal
contributions to database and transaction processing research." More recently, he has been
working as a Distinguished Engineer in Microsoft's Scalable Servers Research Group,
based in San Francisco, on the creation of terabyte-sized distributed online databases. Talking
with Glyn Moody, Gray reflects on his career, the power of Web services, and the arrival of
sentient machines later this century.

Q. How did you first get involved in working with databases?

A. My primary interest had been in theoretical computer science - I got a PhD in
computer science at Berkeley. I went to work at IBM and I was working on a variety of things,
but they all more or less revolved around operating systems and programming languages and
applications.

My boss came to me one day, and he gave me some career advice and said, "you know,
IBM has more operating systems than they need" - this was about 1971 - "and we have more
programming languages than we need. We also have more operating systems and
programming language researchers than we need. If you have an interest in making a
contribution, as opposed to just polishing a round ball, you would be well advised to work on
networking or databases because those are areas where we are completely clueless."

Q. How did your work on transaction processing come out of that?

A. We were a group working in the general area of database systems. And so we
were sitting around in a room and the question came up, who is going to do what? Since I
was the operating systems guy in the group, I got to do all the operating systems stuff. It so
happens that that includes things like systems startup and shutdown, security and
authorisation, and the basic issues of concurrency control and execution, and so on. When you get
involved in startup and shutdown, you also get involved in restart, what happens when things
fail. So I fell heir to the whole business of cleaning up the mess after programs all crashed
and bringing the world back to the state it was in before the crash.

Q. Did fault tolerance arise in a similarly organic way?

A. Indeed. I was a great fan of something called defensive programming, and still
am. Defensive programming says whenever anybody calls you, you check all your
parameters and whenever you do anything with some information that you have you check
first that the information is correct. And when you're about to return your results, you give it a
sniff test to make sure the results look OK.
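
The discipline Gray describes - check your inputs on entry, check your invariants in the middle, and sniff-test your results before returning - can be sketched as follows. The function and its checks are purely illustrative, not code from any system Gray built:

```python
def transfer(accounts, src, dst, amount):
    """Move money between accounts, written defensively."""
    # Check all parameters on entry, as defensive programming dictates.
    if src not in accounts or dst not in accounts:
        raise ValueError("unknown account")
    if amount <= 0:
        raise ValueError("amount must be positive")
    if accounts[src] < amount:
        raise ValueError("insufficient funds")

    total_before = sum(accounts.values())

    accounts[src] -= amount
    accounts[dst] += amount

    # "Sniff test" the result before returning: no money created or destroyed.
    assert sum(accounts.values()) == total_before, "balance invariant violated"
    return accounts

accounts = {"savings": 100, "checking": 0}
transfer(accounts, "savings", "checking", 30)  # → {"savings": 70, "checking": 30}
```

As Gray notes in the next answer, the price of this style is that the checking code is where the failures surface - a caller's bad parameters crash here, not in the caller.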

Unfortunately, when you put a lot of that stuff in your code, your code is the least reliable
code in the system. Because whenever anybody calls you with bad parameters, yours is the
code that fails. So if the system crashed, it probably crashed in my code, because I was
basically the only person who was checking what was going on. And so people would
complain to me a lot about how much the system was crashing and how bad my code was. I
probably did have the code with the most bugs, but I worked fairly hard to make the system
restart automatic and fast and tried to put in a system that would tolerate faults by quickly
recovering and resetting the state to a good state.

Q. What about your more recent interest in scalable servers - how did that
follow from the work on fault tolerance?

A. Subsequently I went and worked at Tandem Computers. I was very interested
in fault tolerance, and they were at the time building computers that were called non-stop
systems. The basic idea of Tandem was suggested by the name - as in a tandem bicycle,
working in tandem: you had multiple computers working on the task, and if one of the computers
or one of the discs got into trouble, the remaining computers could continue to deliver service.
This modular Tandem architecture has persisted in many forms. We see it
today in Web farms. The usual Web farm at AOL or MSN or Google or Inktomi or Yahoo is
literally thousands of servers. So the desire to build very, very large computers out of many
small ones is an outgrowth of my experiences at Tandem. These are now called blade
servers and Beowulf clusters, but the more generic term is just scalable computing.

Q. How did the TerraServer project come about?

A. I left Tandem and went to work at Digital Equipment Corporation, and around
1994 Digital Equipment Corporation went out of the software business, so it was time
for me to leave. I took a leave of absence for a year, and then I came to work at Microsoft.
I've been at Microsoft since about '95. We were chartered as a research group to work on
scalable systems.

We actually had a pretty simple problem on our hands: we needed to find an application
that would be interesting to millions of people, and that could be put on the Internet, and that
would show off our technologies and would not be offensive to anybody and that involved very
large databases. Interesting to everyone and offensive to no one is a really big challenge.
We pretty quickly eliminated porn: although it's interesting to a lot of people, it offends an equal
number. We also came to the conclusion that any traditional database application wasn't
going to give us a large database.

It was clear we needed to have an image database. We had relationships with people
who were doing spatial databases, so we decided to try and take the US Geological Survey
(USGS) image of the United States, which is a 1 metre
resolution photograph of the United States, and put that online, and work with the Russian
space agency to put their assets online.

So we started building the TerraServer in
about '96. In late '96 we had a demonstration of it, in mid '97 it came online. It's been online
ever since. When it first came online it was only about 600 Gigabytes, which now seems
pretty modest - at the time it seemed huge to us. It's now in the 5 terabytes range of online
data, and it is in fact derived from about 20 terabytes of raw data.

Historically, we got data from the USGS in the form of tapes. That was, in 1997, 1998 or
even in the year 2000, the most economic way of moving data. But the USGS has learned
that the best and least expensive way of moving data is to write them to disc rather than to
tape, and then ship the discs. The USGS is a fan of FireWire discs, so they deliver to us a fairly large
box of FireWire discs, in round numbers a terabyte or two of data, which we then make a copy
of and forward to the next recipient of the data.

The alternative scheme is rather than sending around discs, we send around entire
computers. The virtue of sending an entire computer is you just plug it into the network - you
don't have to go through any discussions of what's the file system, and what's the format and
all that sort of stuff. So I think in the future people will actually find themselves archiving and
exchanging computers rather than raw disc drives.

Q. Isn't that rather ironic in the age of the Internet?

A. It is, and it has to do with the cost of sending data through the PTTs. I believe
that if price reflected cost, this would not be required, but, at least in America, the
communication links are priced at the same approximate rate as voice grade links. I think that
the telcos are terrified that if they lowered their prices and made bandwidth essentially free,
every neighbourhood would set up a PBX and share one link, and they wouldn't pay any
subscription fees. There are many, many fibre-optic cables running right in front of my
building that are just completely dark. The cable's there, the cost of delivering it to me would
be close to zero, but for tariff reasons I can't get access to that stuff - the quoted tariff is a
thousand dollars a megabit per second per month. It is throttling many of our design
decisions. It does cause us to act in strange ways.
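
A back-of-the-envelope calculation shows why, at that tariff, shipping discs beats the wire. The tariff ($1,000 per Mbit/s per month) is the figure Gray quotes; the link speed and dataset size are assumptions chosen for illustration:

```python
# Illustrative figures: a 1-terabyte dataset over an assumed 10 Mbit/s leased
# line, at the quoted tariff of $1,000 per Mbit/s per month.
dataset_bits = 1e12 * 8          # 1 terabyte expressed in bits
link_mbps = 10                   # assumed link speed, Mbit/s
tariff_per_mbps_month = 1000.0   # quoted tariff, dollars

seconds = dataset_bits / (link_mbps * 1e6)
days = seconds / 86400
months = days / 30
link_cost = link_mbps * tariff_per_mbps_month * months

print(f"Transfer time: {days:.1f} days")          # Transfer time: 9.3 days
print(f"Link cost: ${link_cost:,.0f}")            # Link cost: $3,086
```

Nine days and roughly three thousand dollars to move what a FireWire disc carries overnight for tens of dollars - which is the economics behind shipping discs, and whole computers, instead.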

Q. How has your work developed beyond TerraServer?

A. The project that I've been involved with since about '98 is helping the
astronomers get their data online and get it all integrated as one large Internet-scale
database, called the Virtual Observatory. The idea is
that most astronomy data goes straight into a computer - there are no people looking through
the lens of the Hubble Space Telescope; it all comes down to the Earth and goes onto computer discs.
That's true even of the terrestrial telescopes that are generating gigabytes of data a day.
People can't look at gigabytes per day. Only computers look at the data and people look at
the output of the computer programs.

The premise is that we can cross-correlate that data and have a better telescope than any
other telescope in the world. It would cover all of the known data, so it would have this
temporal dimension of going back to the beginning of recorded history. It would be all spectral
bands - radio, infrared, ultraviolet - and it would be from all parts of the world. It would be a
much more powerful telescope than any individual one. And, it can cross-correlate the data
with the scientific literature.

Q. How big are the datasets?

A. The SkyServer is one part of the
Virtual Observatory. In round numbers it's about 12 terabytes of released data at this point.
We've been evolving the design since 2000, and we're adding features, but it is not our main
thrust any more; the real excitement now is to take several archives and glue them together.

Q. How will you be doing that?

A. I'm very enthusiastic about using Web services to do it. We built a prototype
called SkyQuery. If you go to Skyquery.net, that is a
portal that knows about many different archives. You can go there and pose a query, and the
portal decomposes your query, sends it to the relevant archives, and brings back the information
synthesised from those archives. The portal uses SOAP and WSDL and XML and
datasets to talk among the Sky nodes, as they're called. It would be impossible for us to do
this without Web services. There's so much that's been done for us in terms of having a
standard representation, being able to move stuff in and out of a database, being able to
convert it to XML in a heartbeat, having the Internet as a substructure.
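
The fan-out-and-merge pattern behind the portal can be sketched in miniature. Here the archive endpoints are stand-in local functions with made-up data; the real SkyQuery nodes are SOAP/WSDL Web services, and the real queries are decomposed SQL, not a simple region lookup:

```python
# Stand-in "archives": each pretends to be one Sky node holding a small
# catalogue of sources for named sky regions.
def optical_archive(region):
    data = {"M31": [{"id": "opt-1", "magnitude": 3.4}]}
    return data.get(region, [])

def radio_archive(region):
    data = {"M31": [{"id": "rad-7", "flux_jy": 0.9}]}
    return data.get(region, [])

ARCHIVES = {"optical": optical_archive, "radio": radio_archive}

def portal_query(region, wanted):
    """Decompose a query, send each piece to the relevant archive,
    and synthesise the answers into one result."""
    results = {}
    for name in wanted:
        results[name] = ARCHIVES[name](region)
    return results

answer = portal_query("M31", ["optical", "radio"])
```

The design point survives the simplification: the portal owns only the decomposition and synthesis logic, while each archive owns its own data and interface - which is what the standard Web-services plumbing (SOAP, WSDL, XML) makes cheap to wire together.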

Q. Google has changed the way most of us use the Internet. Looking
forward, how do you see even larger databases affecting people's use of the Internet?

A. Well, so, as far as I know - and this is just a conjecture - I believe that Google
is in the 1 to 5 petabyte range at this point. The interesting thing about Google is that it
indexes the surface Web, it doesn't index the deep Web. The astronomy stuff I've been
talking about is part of the deep Web. Google has focused a lot on text search, but they are
generalising, they are doing things like Orkut, which is an
index of friends, and they're doing Froogle, and they
are moving into other spaces. But for somebody like the astronomers, there's no commercial model that would cause the Google guys to want to index all of the astronomy data.

So I suspect that we will have Google indexing the surface Web and text, and I think other
people will do similar things. If you want to dig deep into Amazon you'll probably go to
Amazon; if you want to dig deep into IBM or Microsoft, you will go to IBM or Microsoft; and if
you want to dig deep into astronomy you will go to an astronomy portal. It will be a two or
more level architecture.

Q. What about the knock-on effect of these technologies on everyday life?

A. The thing I've been talking about and working on is the enterprise-level
scalable servers, big database, etc., for a community like the astronomers. My colleagues are
working on a project called MyLifeBits,
and that project is trying to record all of their personal experiences, what they see, what they
hear. It's very much inspired by the work of Vannevar Bush on memex.

A challenge that we all face is that the information avalanche has arrived and we are
buried under a mountain of email and documents. I find myself spending an increasing
amount of time looking for things. I don't think it's just that I'm getting older and my memory is
failing. I think it's actually that there's more stuff and much of what I do has a lot of context
associated with it, so I need to go back and refer to things from a long time ago.

What we're trying to do is to augment people's intelligence and make it easier for them to
find things and to take the information that they have and organise it and summarise it and
make it more accessible. That is a huge focus at Microsoft. Our strategic intent - the phrase
that you say when people ask "What does your company do?" - is Information at your
Fingertips. The astronomy stuff I'm doing is Information at your Fingertips for the astronomy
community, the MyLifeBits stuff is Information at your Fingertips for all of us.

Q. Where do you think we are ultimately heading with computers?

A. I believe that Alan Turing was right and that eventually machines will
be sentient. And I think that's probably going to happen in this century. There's much
concern that that might work out badly; I actually am optimistic about it.