Getting Started with Apache Kafka on OS X

Installation and a Simple Producer-Consumer Example with Python

October 10, 2015

Apache Kafka is a highly scalable publish-subscribe messaging system that can serve as
the data backbone in distributed applications. I use Kafka in my research platform to collect
process runtime data of large MPI applications in real time. With Kafka’s producer-consumer
model it becomes easy to implement multiple data consumers that do live, in-flight application
monitoring as well as persistent data storage for later analysis. In this post I describe how to set
up a single Kafka server on OS X and show a simple producer-consumer example with Python.

Installation

The best way to install the latest version of the Kafka server on OS X and to keep it up to
date is via Homebrew.

$> brew install kafka

This installs a few other dependencies, including ZooKeeper, which is required to run the server.
Once everything is installed, you need to start ZooKeeper before you can start Kafka.
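
Because the start order matters, it can help to confirm that both services are actually listening before moving on. The following is a minimal sketch of such a check, not part of Kafka itself. It assumes the default ports (2181 for ZooKeeper, 9092 for the Kafka broker); the names is_listening, report, and DEFAULT_PORTS are my own and purely illustrative. Adjust the ports if your configuration differs.

```python
import socket


def is_listening(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False


def report(services, host="localhost"):
    """Map each service name to whether its port accepts connections."""
    return {name: is_listening(host, port) for name, port in services.items()}


# Assumed defaults: ZooKeeper clientPort 2181, Kafka broker port 9092.
DEFAULT_PORTS = {"zookeeper": 2181, "kafka": 9092}

if __name__ == "__main__":
    for name, up in report(DEFAULT_PORTS).items():
        print(f"{name}: {'up' if up else 'not reachable'}")
```

An open port only shows that something is accepting connections there; it does not prove the broker is healthy, so treat this as a quick sanity check.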

A Simple Producer-Consumer Example

Now let’s write a simple Python producer that periodically writes a string and a timestamp to a topic. Topics in
Kafka are simply message feed categories. Consumers only receive the messages for the topics they
have subscribed to.
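
As a rough illustration of the idea, here is a minimal sketch of such a producer together with a matching consumer. It assumes the kafka-python package (pip install kafka-python) and its KafkaProducer and KafkaConsumer classes, a broker on localhost:9092, and a topic called "test"; the topic name, the message text, and the helper names build_message, produce, run_producer, and run_consumer are all placeholders I chose for illustration. The Kafka imports are kept inside the two run_* functions so that the message-building logic can be tried without a broker.

```python
import time
from datetime import datetime, timezone


def build_message(text, now=None):
    """Return the payload as UTF-8 bytes: '<ISO timestamp> <text>'."""
    now = now or datetime.now(timezone.utc)
    return f"{now.isoformat()} {text}".encode("utf-8")


def produce(send, topic, text, count, interval=1.0):
    """Call send(topic, payload) `count` times, sleeping `interval` seconds between sends."""
    for i in range(count):
        send(topic, build_message(text))
        if i < count - 1:
            time.sleep(interval)


def run_producer(topic="test", bootstrap="localhost:9092", count=5):
    # Assumption: kafka-python is installed and a broker listens on `bootstrap`.
    from kafka import KafkaProducer

    producer = KafkaProducer(bootstrap_servers=bootstrap)
    produce(producer.send, topic, "hello", count)
    producer.flush()  # block until buffered messages have been sent
    producer.close()


def run_consumer(topic="test", bootstrap="localhost:9092"):
    # Assumption: same package and broker as above. The consumer is subscribed
    # to `topic` only, so it sees no messages from other topics.
    from kafka import KafkaConsumer

    consumer = KafkaConsumer(topic, bootstrap_servers=bootstrap)
    for message in consumer:  # blocks and yields messages as they arrive
        print(message.value.decode("utf-8"))
```

With ZooKeeper and Kafka running, calling run_consumer() in one terminal and run_producer() in another should print one timestamped line per message sent. Passing producer.send into produce keeps the loop independent of Kafka, which makes it easy to swap in a plain function while experimenting.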