Google Testing New Storage System for 'Caffeine'

This new storage system, the back end of the new Caffeine search engine that Google introduced Aug. 10 and is now testing, will include more diagnostic and historical data plus autonomic software, so the system can think more for itself and solve problems long before human intervention is actually needed.

Google has been ahead of its time in more than just Web search and online consumer tools. Out of sheer necessity, it has also been far ahead of the curve in designing massive-scale storage systems built mostly on off-the-shelf servers, storage arrays and networking equipment.

As the world's largest Internet search company continues to grow at a breakneck pace, it is now in the process of creating its second custom-designed data storage file system in 10 years.

Who knew 10 years ago, when it was the newcomer challenging Yahoo's market-leading search engine, that Google would grow into a staple of Internet organization relied upon by hundreds of millions of users each day?

Just before Rackable sold Google its first 10,000 servers in 1999 and started the company on a server-and-array collection rampage that may now total hundreds of thousands of boxes, Google engineers were pretty much making their own servers and storage arrays.

"In 1999, at the peak of the dot-com boom when everybody was buying nice Sun machines, we were buying bare motherboards, putting them on corkboard, and laying hard drives on top of it. This was not a reliable computing platform," Sean Quinlan, Google's lead software storage engineer, said with a laugh at a recent storage conference. "But this is what Google was built on top of."

It would be no surprise to any knowledgeable storage engineer that this rudimentary setup had major problems with overheating, to go with numerous networking and PDU failures.

"Sometimes, 500 to 1,000 servers would disappear from the system and take hours to come back," Quinlan said. "And those were just the problems we expected. Then there are always those you didn't expect."

Eventually, Google engineers got their own clustered storage file system, called (amazingly enough) the Google File System, or GFS, up and running with decent performance to connect all of these quickly custom-built servers and arrays. It presented what Quinlan called a "familiar interface, though not specifically Posix. We tend to cut corners and do our own thing at Google."

What Google was doing was simply taking a data center full of machines and layering a file system as an application across all of the servers to provide open/close/read/write, without really caring where in the cluster the data actually lives, Quinlan said.

But there was a big problem: GFS lacked something very basic, automatic failover when the master went down. Admins had to restore the master manually, and Google went dark for as long as an hour at times. Failover was later added, but when it kicked in it still annoyed users because the lapse was often several minutes long. Quinlan says it is now down to about 10 seconds.

Eventually, the company's growth and its subsequent IPO in 2004 spurred even more growth, so a modification to the file system was designed and built. This was BigTable, developed in 2005-2006: a distributed, database-like file system built atop GFS with its own "familiar" interface, which, Quinlan noted, is not Microsoft SQL.

This is the part of the system that runs user-facing applications. There are hundreds of instances (called cells) of each of these systems, and each cell scales up to thousands of servers and petabytes of data, Quinlan said.

That's, ahem, a lot of storage space to govern.

At the base of much of this are Rackable's Eco-Logical storage servers, which are clustered under Linux to deliver as much as 273TB of storage capacity per cabinet.
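Quinlan's description of GFS, a file-system layer spread across a whole data center that hides data placement behind plain open/read/write calls, can be pictured with a toy sketch. Everything below (ToyMaster, ToyChunkserver, the round-robin placement policy, the tiny chunk size) is invented for illustration; it is not Google's actual GFS code or API:

```python
# Toy sketch: a "master" holds only metadata (which chunks make up a file
# and where they are), chunks live on arbitrary "chunkservers", and the
# caller writes and reads by name without knowing where the bytes sit.
CHUNK_SIZE = 4  # tiny on purpose; real GFS used much larger chunks


class ToyChunkserver:
    def __init__(self):
        self.chunks = {}  # chunk_id -> bytes


class ToyMaster:
    def __init__(self, servers):
        self.servers = servers
        self.files = {}   # file name -> list of (server_index, chunk_id)
        self.next_id = 0

    def write(self, name, data: bytes):
        placements = []
        for i in range(0, len(data), CHUNK_SIZE):
            srv = self.next_id % len(self.servers)  # naive round-robin placement
            self.servers[srv].chunks[self.next_id] = data[i:i + CHUNK_SIZE]
            placements.append((srv, self.next_id))
            self.next_id += 1
        self.files[name] = placements

    def read(self, name) -> bytes:
        # Reassemble the file from wherever its chunks landed.
        return b"".join(self.servers[s].chunks[c] for s, c in self.files[name])


master = ToyMaster([ToyChunkserver() for _ in range(3)])
master.write("/logs/today", b"hello, caffeine")
assert master.read("/logs/today") == b"hello, caffeine"
```

The point of the sketch is the division of labor the article describes: metadata in one place, data scattered everywhere, and a plain file-like interface on top. It also makes the failover problem concrete: in this toy, losing `master` loses the only copy of the file-to-chunk map.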
Of course, Google now uses a wide array of storage vendors, because it is all but impossible for one vendor to supply the huge number of boxes needed by the search monster each year. The Eco-Logical storage arrays feature high efficiency, low power consumption and an intelligent design intended to improve price/performance per watt, even in very complex computing environments, Geoffrey Noer, Rackable's senior director of product management, told eWEEK.
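The BigTable layer described above is characterized in Google's published work as a sparse, sorted map indexed by row, column and timestamp. The toy below is only a single-machine sketch of that idea; the names (ToyTable and its methods) are invented for illustration and are not Google's API:

```python
import time


class ToyTable:
    """A toy, single-machine stand-in for a BigTable-like data model:
    a sparse map from (row, column, timestamp) to an uninterpreted value,
    with rows kept in sorted order so prefix scans are cheap."""

    def __init__(self):
        # (row, column) -> list of (timestamp, value), newest first
        self.cells = {}

    def put(self, row, col, value, ts=None):
        ts = ts if ts is not None else time.time()
        versions = self.cells.setdefault((row, col), [])
        versions.append((ts, value))
        versions.sort(key=lambda tv: -tv[0])  # keep newest version first

    def get(self, row, col):
        versions = self.cells.get((row, col))
        return versions[0][1] if versions else None

    def scan(self, row_prefix):
        # Sorted iteration over (row, col) keys gives an ordered prefix scan.
        for row, col in sorted(self.cells):
            if row.startswith(row_prefix):
                yield row, col, self.get(row, col)


t = ToyTable()
t.put("com.example/index", "contents:html", "<html>...</html>", ts=1)
t.put("com.example/index", "contents:html", "<html>v2</html>", ts=2)
assert t.get("com.example/index", "contents:html") == "<html>v2</html>"
```

In the real system, each cell of such a table is spread over thousands of servers with its data persisted in GFS files; the sketch only shows the data model that user-facing applications program against.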