Kafka Connector 3 Developer Preview 1

I’m glad to announce the first developer preview of the next major iteration of our integration with Kafka. This version is based on a new library for DCP and supports the Kafka Connect framework. In this post I will show how it can be used to relay data from Couchbase to HDFS.

At this point everything is ready for setting up the link to transfer documents from Couchbase to HDFS using Kafka Connect. We assume you are running Couchbase Server on http://127.0.0.1:8091/ and Confluent Control Center on http://127.0.0.1:9021/. For this example, make sure the travel-sample bucket is loaded on Couchbase. If you didn't load it when setting up the cluster, you can add it through the Settings section of the Web UI.

Once you have all of these prerequisites out of the way, navigate to the “Kafka Connect” section in your Confluent Control Center. Select “New source”, then select “CouchbaseSourceConnector” as the connector class and fill in the settings so that the final JSON will be similar to:
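As a rough sketch of such a source configuration, the JSON might look like the block below. It assumes the connector class is `com.couchbase.connect.kafka.CouchbaseSourceConnector` and that the property names match the connector's documented configuration keys, so verify both against the version you install. The connector name `travel-source`, the task count, and the timeout are illustrative values, not taken from this post.

```json
{
  "name": "travel-source",
  "connector.class": "com.couchbase.connect.kafka.CouchbaseSourceConnector",
  "tasks.max": "1",
  "connection.cluster_address": "127.0.0.1",
  "connection.bucket": "travel-sample",
  "connection.timeout.ms": "2000",
  "topic.name": "travel-topic"
}
```

The bucket and topic names are the ones used throughout this example; the topic is what the sink will read from in the next step.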

Once you save the source connection, the Connect daemon will start receiving mutations and storing them in the specified Kafka topic. To demonstrate a full pipeline, let's set up a sink connection to get data out of Kafka. To do so, go to the “Sinks” tab and click the “New sink” button. It will ask for the topics where the data of interest is stored; enter “travel-topic”. Then select “HdfsSinkConnector” and fill in the settings so that the JSON config looks like this (assuming the HDFS name node is listening on hdfs://127.0.0.1:8020/):
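A minimal sketch of such a sink configuration follows. It assumes Confluent's HDFS connector class `io.confluent.connect.hdfs.HdfsSinkConnector` and its `hdfs.url` and `flush.size` settings. The connector name `travel-sink`, the task count, and the flush size are illustrative values, so adjust them to your environment.

```json
{
  "name": "travel-sink",
  "connector.class": "io.confluent.connect.hdfs.HdfsSinkConnector",
  "tasks.max": "1",
  "topics": "travel-topic",
  "hdfs.url": "hdfs://127.0.0.1:8020/",
  "flush.size": "3"
}
```

Here `flush.size` is the number of records written before a file is committed to HDFS; a small value like this makes output appear quickly while experimenting, whereas a real deployment would likely use a much larger one.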

That’s my quick run-through example! The DCP client is still under active development, with additional features being added to handle various topology changes and failure scenarios. The next couple of updates of our Kafka connector will pick up those changes. I should also briefly note that Couchbase's DCP client interface should be considered volatile for the moment. We use it in various projects, but you should only use it directly at your own risk.

Posted by Sergey Avseyev, SDK Engineer, Couchbase

Sergey Avseyev is an SDK Engineer at Couchbase. He is responsible for the development of the Kafka connector and its underlying library, which implements DCP, the Couchbase replication protocol. He also maintains the PHP SDK for Couchbase.