A Look at Google's Open Source Protocol Buffers

Monday Jul 14th 2008 by Sean Michael Kerner

With petabytes of data floating around, Google developed its own format for data interchange, and now the company is open sourcing it.

For most organizations, Extensible Markup Language, or XML, is the lingua franca for data interchange. Apparently XML alone isn't fast enough for Google (NASDAQ: GOOG), so the company developed its own data format, called Protocol Buffers.

The effort has been in development at Google since 2001. It's now available as an open source project that Google hopes others will use and contribute to. Protocol Buffers could ultimately replace XML in some cases as a speedier format for data interchange.

"We do know that we will be using it ourselves in some of our upcoming projects," Google developer Kenton Varda said. "This is not a piece of software that is unimportant to the company."

Google's documentation on Protocol Buffers claims numerous advantages over XML. Among them: Protocol Buffers can be 3 to 10 times smaller and 20 to 100 times faster than XML for serializing structured data.

"You define how you want your data to be structured once, then you can use special generated source code to easily write and read your structured data to and from a variety of data streams and using a variety of languages," Google's documentation states.
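To make the "define once, then read and write" idea concrete, here is a minimal Python sketch. It is not Google's generated code and does not use the Protocol Buffers library. It is a hand-rolled illustration that assumes a hypothetical PERSON_SCHEMA and encodes fields the way the Protocol Buffers wire format does for integers and strings (a varint tag built from the field number and wire type, followed by the value). In real use, the structure would be declared in a .proto file and the reading and writing code would be generated by Google's compiler.

```python
# Illustrative only: a hand-rolled encoder in the style of the Protocol
# Buffers wire format. Real projects use generated code instead.

# Hypothetical schema, defined once: field name -> (field number, kind).
PERSON_SCHEMA = {"name": (1, "string"), "id": (2, "int"), "email": (3, "string")}


def encode_varint(value):
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # high bit set: more bytes follow
        else:
            out.append(low7)
            return bytes(out)


def decode_varint(data, pos):
    """Return (value, next_pos) for the varint starting at data[pos]."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def encode(schema, record):
    """Serialize a dict to bytes, driven entirely by the schema."""
    out = bytearray()
    for name, (number, kind) in schema.items():
        if name not in record:
            continue  # absent fields take no space on the wire
        if kind == "int":
            out += encode_varint((number << 3) | 0)  # wire type 0: varint
            out += encode_varint(record[name])
        else:
            payload = record[name].encode("utf-8")
            out += encode_varint((number << 3) | 2)  # wire type 2: length-delimited
            out += encode_varint(len(payload))
            out += payload
    return bytes(out)


def decode(schema, data):
    """Parse bytes back into a dict using the same schema."""
    by_number = {number: (name, kind) for name, (number, kind) in schema.items()}
    record = {}
    pos = 0
    while pos < len(data):
        key, pos = decode_varint(data, pos)
        name, kind = by_number[key >> 3]  # upper bits hold the field number
        if kind == "int":
            record[name], pos = decode_varint(data, pos)
        else:
            length, pos = decode_varint(data, pos)
            record[name] = data[pos:pos + length].decode("utf-8")
            pos += length
    return record


person = {"name": "Ada", "id": 150, "email": "ada@example.com"}
wire = encode(PERSON_SCHEMA, person)
xml = "<person><name>Ada</name><id>150</id><email>ada@example.com</email></person>"
print(len(wire), "bytes in binary form versus", len(xml), "bytes as XML")
```

Field names never appear in the encoded bytes, only small field numbers, which is one reason a format like this can come out smaller than the equivalent XML. A single toy record says nothing about the speed figures Google cites.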

Currently, Google uses Protocol Buffers for its internal Remote Procedure Call (RPC) protocols and file formats.

According to Google's documentation, Protocol Buffers were initially developed to handle an index server request/response protocol.

Chris DiBona, Google's program manager for open source, noted that Google uses Protocol Buffers to encode almost any sort of structured information that needs to be passed across the network or stored on disk.