Kafka Indexing Service

This section describes how to use the Apache Druid Kafka Indexing Service in E-MapReduce
to ingest Kafka data in real time.

The Kafka Indexing Service is an extension provided by Apache Druid that ingests Kafka
data in real time through Apache Druid's indexing service. The extension enables supervisors
on the Overlord, and each supervisor starts indexing tasks on the MiddleManager. These tasks
connect to the Kafka cluster, ingest the topic data, and create the index. You
need to prepare a data ingestion format file and manually start the supervisor through
the RESTful API.
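
As a minimal sketch of starting the supervisor through the RESTful API, the Python snippet below builds a POST request to the Overlord's supervisor endpoint (/druid/indexer/v1/supervisor). The Overlord address and the one-field spec are hypothetical placeholders, not values from your cluster; the request is only built here, not sent.

```python
import json
import urllib.request


def build_supervisor_request(overlord_url, spec):
    """Build (but do not send) a POST request that submits a supervisor spec."""
    body = json.dumps(spec).encode("utf-8")
    return urllib.request.Request(
        url=overlord_url.rstrip("/") + "/druid/indexer/v1/supervisor",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical values for illustration only; replace them with your
# Overlord address and the contents of your ingestion format file.
example_spec = {"type": "kafka"}
request = build_supervisor_request("http://overlord.example.com:8090", example_spec)
print(request.get_method(), request.full_url)

# To actually submit the spec, you would call:
#   urllib.request.urlopen(request)
```

Building the request separately from sending it makes it easy to inspect the URL and body before anything reaches the Overlord.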

You can adjust the parameters based on your needs. The /kafka-1.0.0 part of the --zookeeper
parameter is a path. You can find its value in the zookeeper.connect setting on the Kafka
service Configuration page of the Kafka cluster. If you built your own Kafka cluster, change
the --zookeeper parameter according to your actual configuration.
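
To illustrate how the path part relates to the rest of the value, here is a small sketch that splits a ZooKeeper connect string into its host list and its path. The host names are hypothetical; only /kafka-1.0.0 comes from the text above.

```python
def split_zookeeper_connect(connect):
    """Split a ZooKeeper connect string into (hosts, chroot_path).

    The expected format is "host1:port1,host2:port2/path", where the
    trailing "/path" part is optional.
    """
    hosts_part, slash, path = connect.partition("/")
    hosts = [h for h in hosts_part.split(",") if h]
    chroot = slash + path if slash else ""
    return hosts, chroot


# Hypothetical host names; the path is the one discussed above.
hosts, chroot = split_zookeeper_connect(
    "zk-1.example.com:2181,zk-2.example.com:2181/kafka-1.0.0"
)
print(hosts)
print(chroot)
```

If the value on your Configuration page has no path part, the second element is an empty string and the --zookeeper parameter needs only the host list.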

Define the data format description file for the data source. Name it metrics-kafka.json
and place it in the current directory (or another directory that you have specified).
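
Before submitting the file, it can help to confirm that it is well-formed. The sketch below loads a spec file and checks only two things: that it parses as a JSON object and that its top-level "type" field is "kafka", which a Kafka supervisor spec requires. The helper name load_supervisor_spec and the one-field demo file are assumptions for illustration; this is not a full schema check.

```python
import json
import os
import tempfile


def load_supervisor_spec(path):
    """Load a supervisor spec file and run a minimal sanity check."""
    with open(path, encoding="utf-8") as f:
        spec = json.load(f)  # raises json.JSONDecodeError on malformed JSON
    if not isinstance(spec, dict) or spec.get("type") != "kafka":
        raise ValueError('expected a JSON object with "type": "kafka"')
    return spec


# Demo with a throwaway file so the sketch runs on its own. In practice,
# pass the path of your real metrics-kafka.json instead.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "metrics-kafka.json")
    with open(demo_path, "w", encoding="utf-8") as f:
        json.dump({"type": "kafka"}, f)
    loaded = load_supervisor_spec(demo_path)
    print(loaded["type"])
```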