The massive social network requires specialized tools, some of which have been offered externally via open source

With nearly 1 billion users worldwide and 500 million people visiting its social network every day, Facebook has its work cut out for it in managing its systems. To help do that, the company has been developing its own management and development tools tuned to its specific needs, rather than relying on commercial offerings.

Tools in use include Perflab, for testing site changes committed by engineers; Gatekeeper, for advanced A/B testing of code changes; and Claspin, providing a high-density heat map for viewing a large set of servers. "We spent a lot of time building up the internal tool stack," said Jay Parikh, Facebook vice president of infrastructure engineering, at the O'Reilly Velocity conference Tuesday in Santa Clara, Calif. The conference is focused on Web performance and operations, with Facebook serving as a prime example of the demands being made on the Web.

With Perflab, Facebook can test every code change committed by engineers. The tool helps Facebook push through thousands of code revisions per week. It also tracks back-end metrics, such as CPU usage and data-fetching. Gatekeeper, Parikh said, is "essentially an A/B testing framework on super steroids." It separates the release of code from the activation of a feature in production, so code can ship before the feature it implements is switched on. Claspin, meanwhile, gives a view of distributed systems in Facebook's infrastructure. "We're able to spot oncoming or up and coming problems and be able to drill down very quickly with just a couple clicks," Parikh said.
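The idea of shipping code separately from switching a feature on can be illustrated with a short sketch. This is not Gatekeeper's actual design, which Facebook has not published here; the `FeatureGate` class, the `new_profile` feature name, and the percentage-based rollout are all hypothetical, chosen only to show the general feature-gating pattern.

```python
import hashlib


class FeatureGate:
    """Hypothetical gate: code ships dark, activation is a runtime setting."""

    def __init__(self):
        # Feature name -> percentage of users (0-100) who see the feature.
        self._rollout = {}

    def set_rollout(self, feature, percent):
        if not 0 <= percent <= 100:
            raise ValueError("percent must be between 0 and 100")
        self._rollout[feature] = percent

    def is_enabled(self, feature, user_id):
        # Unknown features default to 0 percent, so new code stays off.
        percent = self._rollout.get(feature, 0)
        # Hash feature and user together so each user lands in a stable
        # bucket from 0 to 99, independent of other features.
        digest = hashlib.sha256(f"{feature}:{user_id}".encode()).hexdigest()
        bucket = int(digest, 16) % 100
        return bucket < percent


def render_profile(gate, user_id):
    # Both code paths are deployed; the gate decides which one runs.
    if gate.is_enabled("new_profile", user_id):
        return "new profile page"
    return "old profile page"


gate = FeatureGate()
print(render_profile(gate, 42))       # feature deployed but not activated
gate.set_rollout("new_profile", 100)  # activation needs no new deployment
print(render_profile(gate, 42))
```

In this sketch, setting the rollout to a value between 0 and 100 exposes the feature to roughly that share of users, which is what makes the same mechanism usable for A/B comparisons.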

Facebook has built dozens of its own tools, Parikh said. While Facebook does not commercialize these tools, it does offer them via open source on occasion, as it did with its Phabricator software fabrication tool last year. No decision has been made yet on whether Claspin, Gatekeeper, or Perflab could go this route. "These [tools] also are very ingrained with our system, so they're not easily generalizable. So we're not sure it would make sense to open source them yet."

Facebook also faces large tasks in the data management and coding realms. "Today, we will ingest 10 terabytes of log data into Hadoop" in about 30 minutes, Parikh said.