Normally the data from a telescope is recorded in FITS (Flexible Image Transport System) files, a binary format that is used extensively for astronomy data. While FITS is adequate for small batches of information, it is too cumbersome for rapid access to the hundreds of millions of records that will eventually reside in the SDSS data store.

"In order for you to search for objects that were of interest for your research would take hours, maybe days," Thakar continues.

SDSS started out using an object-oriented database (OODB), but that didn't meet the performance requirements, so the project decided to switch to a relational database.
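To make the difference concrete, here is a minimal sketch of the same sky-region search done two ways: as a linear scan over every record, the way a program reading flat files would work, and as a query against an indexed relational table. It uses Python's built-in SQLite module purely as a stand-in, not the SDSS database, and the table name, column names and randomly generated catalog are all hypothetical.

```python
import random
import sqlite3


def build_catalog(n_objects=10_000, seed=0):
    """Create an in-memory table of made-up sky objects (hypothetical schema)."""
    rng = random.Random(seed)
    rows = [
        (i, rng.uniform(0.0, 360.0), rng.uniform(-90.0, 90.0))
        for i in range(n_objects)
    ]
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE objects ("
        "obj_id INTEGER PRIMARY KEY, ra_deg REAL, dec_deg REAL)"
    )
    conn.executemany("INSERT INTO objects VALUES (?, ?, ?)", rows)
    # The index lets the database narrow the search by right ascension
    # instead of reading every row.
    conn.execute("CREATE INDEX idx_objects_ra ON objects (ra_deg)")
    conn.commit()
    return conn, rows


def box_search_sql(conn, ra_min, ra_max, dec_min, dec_max):
    """Find objects inside a rectangular patch of sky with one SQL query."""
    cursor = conn.execute(
        "SELECT obj_id FROM objects "
        "WHERE ra_deg BETWEEN ? AND ? AND dec_deg BETWEEN ? AND ? "
        "ORDER BY obj_id",
        (ra_min, ra_max, dec_min, dec_max),
    )
    return [row[0] for row in cursor]


def box_search_scan(rows, ra_min, ra_max, dec_min, dec_max):
    """The same search as a linear scan, touching every record once."""
    return sorted(
        obj_id
        for obj_id, ra, dec in rows
        if ra_min <= ra <= ra_max and dec_min <= dec <= dec_max
    )


if __name__ == "__main__":
    conn, rows = build_catalog()
    hits = box_search_sql(conn, 180.0, 185.0, -2.0, 2.0)
    assert hits == box_search_scan(rows, 180.0, 185.0, -2.0, 2.0)
    print(f"{len(hits)} objects found in the search box")
```

Both functions return the same objects. The scan has to touch every record for every question, while the indexed query only has to examine the rows in the requested range, a gap that grows with the size of the catalog.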

Jim Gray, a "distinguished engineer" in Microsoft's Scalable Servers Research Group and manager of the company's Bay Area Research Center in San Francisco, California, helped SDSS set up on Microsoft's SQL Server 2000. The database resides on a series of off-the-shelf RAID 0/5 arrays with a total cost of under $10,000. The SQL database came online with the Early Data Release in June 2001. Initially, SQL Server was meant only for public access, while scientists would continue to use the OODB.

But that didn't last for long.

"In the first six months, the SQL database stole the show," says Thakar. "It was so much faster and easier to use that many of the scientists started using it too."

As a result, everything was moved over to SQL Server.

The SkyServer site offers visitors several options for getting data, depending on their level of expertise. There are form-based queries that anyone can use. Advanced users can run SQL queries directly, or submit a batch job and come back later to view the results. Users can download their results in text, CSV or XML formats. Visitors can also use a graphical interface to locate an area of the sky, zoom in and click on a particular object to find out its properties.
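As a rough illustration of the SQL route, the sketch below composes a query for a small patch of sky and reads a CSV result into Python. The table and column names (PhotoObj, objID, ra, dec) are assumptions that do not come from this article, the sample rows are made up, and nothing is sent to SkyServer: the sketch only shows the shape of a query and of a downloaded result.

```python
import csv
import io


def build_query(ra_min, ra_max, dec_min, dec_max, limit=10):
    """Compose SQL Server-style query text for a rectangular sky region.

    PhotoObj, objID, ra and dec are assumed names used for illustration.
    """
    return (
        f"SELECT TOP {int(limit)} objID, ra, dec\n"
        "FROM PhotoObj\n"
        f"WHERE ra BETWEEN {float(ra_min)} AND {float(ra_max)}\n"
        f"  AND dec BETWEEN {float(dec_min)} AND {float(dec_max)}"
    )


def parse_csv_results(csv_text):
    """Turn CSV text with an objID,ra,dec header into typed dictionaries."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        {
            "objID": int(row["objID"]),
            "ra": float(row["ra"]),
            "dec": float(row["dec"]),
        }
        for row in reader
    ]


# Made-up rows standing in for a downloaded CSV file.
SAMPLE_CSV = "objID,ra,dec\n1001,180.5,0.25\n1002,181.0,-0.5\n"

if __name__ == "__main__":
    print(build_query(180, 182, -1, 1, limit=5))
    for obj in parse_csv_results(SAMPLE_CSV):
        print(obj["objID"], obj["ra"], obj["dec"])
```

CSV is the easiest of the three download formats to handle in a script, because each row maps directly onto one object and its properties.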

So far, more than 200 papers have been published based on data from the SDSS, and many more are expected as the database speeds up the research process.

"Being able to pose questions in a few hours and get answers in a few minutes changes the way one views the data: you can experiment interactively," Microsoft's Jim Gray and Johns Hopkins University astronomy professor Alex Szalay wrote in their paper "The World-Wide Telescope, an Archetype for Online Science." "When queries take three days and hundreds of lines of code, one asks many fewer questions and so gets fewer answers."