Technology tips, tricks, and tidbits from a Systems Librarian

Monthly Archives: September 2017

I vaguely recall hearing about Google’s Protocol Buffers a while ago, but I forgot about them until I recently encountered a project using them. Since I wanted to use the data from the project, it seemed that I would have to learn how to use Protocol Buffers.

After looking at Google developer docs for a while, I settled on the Python tutorial at https://developers.google.com/protocol-buffers/docs/pythontutorial.

I’m working on a Windows 10 computer, but I’m using Ubuntu on Windows, so I already had an environment with Python 2.7.x installed.

The tutorial is pretty good, but it makes certain assumptions. I didn’t build the compiler from source. I installed a pre-compiled binary from https://github.com/google/protobuf/releases. This was the file: https://github.com/google/protobuf/releases/download/v3.4.0/protoc-3.4.0-linux-x86_64.zip.

I downloaded the .proto file from the project on which I’m working, and ran ./protoc --python_out=OUT_DIR project.proto. That generated the Python code for interacting with that particular data structure.
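If you would rather drive the compiler from a script than type the command by hand, here is a minimal sketch. The helper names build_protoc_command and run_protoc are my own inventions for illustration, and the ./protoc path, OUT_DIR, and project.proto are just the placeholders from the command above. The output directory has to exist already, and protoc names the generated module after the .proto file, so project.proto becomes project_pb2.py.

```python
import subprocess


def build_protoc_command(proto_file, out_dir, protoc="./protoc"):
    """Return the argument list for generating Python code from a .proto file.

    All three values are placeholders; adjust them for your own layout.
    """
    return [protoc, "--python_out=" + out_dir, proto_file]


def run_protoc(proto_file, out_dir, protoc="./protoc"):
    """Run the compiler; raises CalledProcessError if protoc exits non-zero."""
    subprocess.check_call(build_protoc_command(proto_file, out_dir, protoc))


# Building the command does not need protoc to be present, so it is safe
# to inspect it before running anything.
print(" ".join(build_protoc_command("project.proto", "OUT_DIR")))
```

Calling run_protoc("project.proto", "OUT_DIR") would then do the same thing as the command line above, assuming the binary really is at ./protoc.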

I tried importing that into my project.py file, but it complained: “ImportError: No module named google.protobuf”. Reading the tutorial more closely, I think it assumes that you’ll build the compiler from source and install the protobuf runtime library yourself as part of that process. I could’ve done that, but I was lazy, so I just installed pip (apt-get install python-pip) and then ran “pip install protobuf”.
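A quick way to tell whether the runtime library is the missing piece, before blaming the generated code, is to try importing it on its own. This is only a sketch; module_available is a hypothetical helper name, not something the protobuf package provides.

```python
import importlib


def module_available(name):
    """Return True if the named module can be imported, False otherwise."""
    try:
        importlib.import_module(name)
    except ImportError:
        return False
    return True


# The generated *_pb2.py module imports google.protobuf, so this is the
# package that has to be present.
if module_available("google.protobuf"):
    print("protobuf runtime found")
else:
    print("protobuf runtime missing; try: pip install protobuf")
```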

According to some Google docs, the runtime library version should be the same as the compiler version, which makes sense. pip had version 3.4.0, which was the same version as the pre-compiled compiler binary I downloaded, so that was handy.
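To double-check that match rather than eyeball it, you can compare the string printed by protoc --version (for this release, “libprotoc 3.4.0”) with the runtime’s google.protobuf.__version__. The sketch below only does the comparison on strings you pass in; parse_version and versions_match are hypothetical helper names, and it assumes both sides use the same three-part numbering, as they did for 3.4.0. Other releases may number things differently.

```python
import re


def parse_version(text):
    """Pull the first X.Y.Z version number out of a string as a tuple of ints."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError("no version number found in %r" % (text,))
    return tuple(int(part) for part in match.groups())


def versions_match(protoc_output, runtime_version):
    """Compare the compiler's version output with the runtime's version string."""
    return parse_version(protoc_output) == parse_version(runtime_version)


# Assumed inputs: the first is what `protoc --version` prints for 3.4.0,
# the second is what google.protobuf.__version__ would hold after
# installing that release with pip.
print(versions_match("libprotoc 3.4.0", "3.4.0"))
```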

Now my Google Protocol Buffer generated Python module is loading, so I’m off to try it out. I think the hardest bit is behind me now.

I’m actually really excited, because the project using the Protocol Buffers is a Java project, but I want a Python tool for interacting with the data from that Java project, and this should work pretty seamlessly. It seems like there are actually a lot of runtime implementations available for Protocol Buffers, so this would be a nice way of sharing data among a number of projects.

I wonder what sort of uptake Protocol Buffers see outside of Google. It seems that Google uses them a lot, but this project is the first time I’ve encountered them in the wild, and I think Protocol Buffers have been publicly available as open source for about a decade now. In some ways, they’re not as convenient as serializing to JSON or XML, since you need a .proto definition and generated code before you can read anything, but in other ways they seem a million times better.

I suppose I’ll form my ultimate opinions after I have some experience working with them. I’m intrigued so far though!