Google may chuck Spanner into Datastore

Google I/O Google may make its globally-distributed Spanner database available as a cloud service as the company tries to let developers fiddle with its innards.

The Spanner database* is the successor to the BigTable/Megastore architecture on which Google's just-announced Cloud Datastore is built, and has some more advanced features. Chief among them is snappy access to replicated datasets from anywhere in the world, thanks to its ability to write locally and sync globally without being crushed by latency.

The database could be coming to Google's cloud platform as a standalone service, a Googler hinted on Thursday. But exposing it could be a challenge as the system is so esoteric that it may be difficult for lay developers to access.

"If you were to flip the switch on Google's infrastructure right now and make it public, I think people would freak out," Google cloud product manager Chris Ramsdale said during a panel discussion on distributed databases at Google I/O. "It's not the way everybody wants to program."

However, when asked by a member of the audience if Google would consider making Spanner available to the general public, he said "it's actively being debated", before stressing "not yet."

Because Spanner naturally distributes data across the world while assuring consistency through a clever timestamp feature, there's less need to go between datacenters via fibre to check on replicated pools of data, Ramsdale said, which is how people will need to design systems in five to ten years.

"Spanner in many ways is the next evolution of Megastore," he said. "It really gets us to the global footprint".

Google is in the process of discussing how best to surface Spanner, he said, with one possible route being an API on top of it, though another could be a fully managed service. Spanner is an example of the fine line Google needs to walk: it wants to expose its internal infrastructure to the developer community so as to sway applications away from Amazon, but doing so creates work, because the technology then has to be polished and managed as a service.

The approach Google has taken with the Cloud Datastore of breaking apart its infrastructure so it can be consumed as individual services by developers is the "right way" of doing things, Ramsdale said.

When this vulture asked a Googler about the potential difficulties of exposing something like Spanner to the lay user, Google software engineer David Gay indicated it was possible to do so while preserving its neat features. "It's the Spanner servers which have TrueTime, not the clients – in some sense [data is] timestamped at the arrival at the server," Gay said. ®

* Bootnote:

Spanner is designed to work as the data layer for as many as 10 million servers, compared to the million or so that Google operates today. Along with being the successor to BigTable/Megastore, it has several idiosyncratic features, such as externally consistent distributed transactions. These are made possible by a global fleet of atomic clocks and GPS systems, which provide a "TrueTime API" that gets around the timestamping latency problem of operating a global database.
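
For readers who want a feel for how server-side timestamping with a bounded-uncertainty clock can work, here is a minimal Python sketch. It is a toy, not Google's code. It assumes an interval-style clock along the lines of the one in Google's published Spanner description, where the clock returns an earliest and a latest bound on the true time and the server waits out the uncertainty before acknowledging a write. The names `SimulatedTrueTime`, `TTInterval` and `commit`, and the 5ms uncertainty figure, are hypothetical choices made for illustration.

```python
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class TTInterval:
    earliest: float  # lower bound on the true time, in seconds
    latest: float    # upper bound on the true time, in seconds


class SimulatedTrueTime:
    """Toy stand-in for a TrueTime-style clock (hypothetical, not Google's API)."""

    def __init__(self, epsilon: float = 0.005, clock=time.monotonic):
        self.epsilon = epsilon  # assumed worst-case clock uncertainty
        self._clock = clock

    def now(self) -> TTInterval:
        # Report an interval that is assumed to contain the true time.
        t = self._clock()
        return TTInterval(t - self.epsilon, t + self.epsilon)

    def after(self, timestamp: float) -> bool:
        # True only once `timestamp` has definitely passed.
        return self.now().earliest > timestamp


def commit(tt: SimulatedTrueTime, sleep=time.sleep) -> float:
    """Timestamp a write on the server, then wait out the clock uncertainty."""
    commit_ts = tt.now().latest      # timestamp chosen on arrival at the server
    while not tt.after(commit_ts):   # "commit wait": roughly 2 * epsilon
        sleep(tt.epsilon / 4)
    return commit_ts


if __name__ == "__main__":
    tt = SimulatedTrueTime()
    first = commit(tt)
    second = commit(tt)
    print(second > first)
```

Because only the server consults the clock in this sketch, a client needs no special hardware, which matches Gay's point that TrueTime lives on the Spanner servers. The cost is the short wait before each commit is acknowledged, which shrinks as the clock uncertainty shrinks.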