Getting Started With Apache Kafka

What Is Kafka?

Kafka is a new publish-subscribe messaging system built on a distributed, partitioned, and replicated commit log. It can scale horizontally without downtime, and it achieves durability by persisting messages to disk and replicating them within the cluster to prevent data loss. Reportedly, each broker can handle terabytes of messages without a performance impact, but I have not tested this.

Kafka requires ZooKeeper (another Apache project) in order to run. Setting up a ZooKeeper cluster is beyond the scope of this tutorial, so for now we will just have this instance run a ZooKeeper cluster of one. Luckily, the Kafka download provides an easy way to do this.
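As a rough sketch of that single-node setup: Kafka releases that bundle ZooKeeper ship a start script and a sample config inside the download. The snippet below assumes the archive has already been extracted and that KAFKA_HOME points at that directory; the default path of $HOME/kafka is only a placeholder, so adjust it to wherever you unpacked the download.

```shell
# Assumption: KAFKA_HOME is the extracted Kafka directory.
# The fallback path below is a placeholder, not a standard location.
KAFKA_HOME="${KAFKA_HOME:-$HOME/kafka}"

# Start script and sample single-node config bundled with the Kafka download.
ZK_START="$KAFKA_HOME/bin/zookeeper-server-start.sh"
ZK_CONFIG="$KAFKA_HOME/config/zookeeper.properties"

if [ -x "$ZK_START" ]; then
    # Runs ZooKeeper in the foreground until interrupted (Ctrl+C).
    "$ZK_START" "$ZK_CONFIG"
else
    echo "ZooKeeper start script not found at $ZK_START" >&2
fi
```

With the bundled config, ZooKeeper should listen on port 2181. Since the process stays in the foreground, leave it running and open a second terminal for the Kafka steps.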
