UDP to Kafka

The UDP to Kafka origin reads messages from one
or more UDP ports and writes each message directly to Kafka.

Use the UDP to Kafka origin to read large volumes of data from multiple UDP ports and
write the data immediately to Kafka, without additional processing.


If you need to process data before writing it to Kafka, need to write to a destination
system other than Kafka, or do not need to process high volumes of data, use the UDP
Source origin instead.

UDP to Kafka can process collectd messages, NetFlow 5 and NetFlow 9 messages, and the
following types of syslog messages:

RFC 5424 messages

RFC 3164 messages

Non-standard common messages, such as RFC 3339 dates with no version
digit

When processing NetFlow messages, the stage generates
different records based on the NetFlow version. When processing NetFlow 9,
the records are generated based on the NetFlow 9 configuration properties.
For more information, see NetFlow Data Processing.
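This version-based dispatch is possible because both NetFlow 5 and NetFlow 9 packets begin with a 16-bit big-endian version field. The following sketch illustrates the idea; the function name is ours, not part of the stage:

```python
import struct

def netflow_version(packet: bytes) -> int:
    # Both NetFlow 5 and NetFlow 9 headers begin with a 16-bit
    # big-endian version field, so a receiver can dispatch on it.
    (version,) = struct.unpack_from("!H", packet, 0)
    return version

# A minimal NetFlow 5-style header: version 5, then a 16-bit flow count.
v5_header = struct.pack("!HH", 5, 30)
print(netflow_version(v5_header))  # -> 5
```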

When you configure UDP to Kafka, you specify the UDP ports to use, Kafka configuration
information, and advanced properties such as the maximum number of write requests.

You can add Kafka configuration properties and enable Kafka security as needed.

Pipeline Configuration

When you use a UDP to Kafka origin in a pipeline, connect the origin to a Trash
destination.

The UDP to Kafka origin writes records directly to
Kafka. The origin does not pass records to its output port, so you
cannot perform additional processing or write the data to other destination
systems.

However, since a pipeline requires a destination, you should
connect the origin to the Trash destination to satisfy pipeline validation
requirements.

A pipeline with the UDP to Kafka origin contains just the origin connected to the Trash destination.

Additional Kafka Properties

You can add custom Kafka configuration
properties to the UDP to Kafka origin.

When you add a Kafka configuration property, enter the exact property name and
the value. The stage does not validate the property names or values.

Several properties are defined by default; you can edit or remove these properties
as necessary.

Note: Because the stage uses several configuration properties, it ignores
user-defined values for the following properties:

key.serializer.class

metadata.broker.list

partitioner.class

producer.type

serializer.class

Enabling Kafka Security

When using Kafka version 0.9.0.0 or later,
you can configure the UDP to Kafka origin to connect securely through SSL/TLS, Kerberos,
or both.

Earlier versions of Kafka do not support security.

Enabling SSL/TLS

Perform the following steps to enable the UDP to Kafka
origin to use SSL/TLS to connect to Kafka version 0.9.0.0 or later.

To use SSL/TLS to connect, first make sure Kafka is
configured for SSL/TLS as described in the Kafka documentation.

On the General tab of the stage, set
the Stage Library property to Apache Kafka 0.9.0.0 or a
later version.

On the Connection tab, add the
security.protocol Kafka configuration property and
set it to SSL.

Then, add the following SSL Kafka configuration
properties:

ssl.truststore.location

ssl.truststore.password

When the Kafka broker requires client authentication (that is, when the
ssl.client.auth broker property is set to "required"), add and configure the
following properties:

ssl.keystore.location

ssl.keystore.password

ssl.key.password

Some brokers might require adding the following properties as
well:

ssl.enabled.protocols

ssl.truststore.type

ssl.keystore.type

For details about these properties, see the Kafka
documentation.

For example, the following properties allow the stage to use SSL/TLS to
connect to Kafka 0.9.0.0 with client authentication:
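In this sketch, the file paths and passwords are illustrative placeholders, not required values:

```
security.protocol: SSL
ssl.truststore.location: /var/private/ssl/kafka.client.truststore.jks
ssl.truststore.password: changeit
ssl.keystore.location: /var/private/ssl/kafka.client.keystore.jks
ssl.keystore.password: changeit
ssl.key.password: changeit
```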

Enabling Kerberos (SASL)

When you
use Kerberos authentication, Data Collector
uses the Kerberos principal and keytab to connect to Kafka version 0.9.0.0 or later.
Perform the following steps to enable the UDP to Kafka origin to use Kerberos to connect
to Kafka.

To use Kerberos, first make sure Kafka is configured for
Kerberos as described in the Kafka documentation.

Add the Java Authentication and Authorization
Service (JAAS) configuration properties required for Kafka clients based on your
installation type:

RPM or tarball installation - Add the properties
to the JAAS configuration file used by Data Collector - the
$SDC_CONF/ldap-login.conf file. Add the following
properties to a client login section in the file named
KafkaClient:

Cloudera Manager installation - Add the
properties to the Data Collector Advanced Configuration Snippet
(Safety Valve) for generated-ldap-login-append.conf field for
the StreamSets service in Cloudera Manager. Add the properties to the
field as
follows:
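In both installation types, the KafkaClient login section takes the same general form. The keytab path and principal below are placeholders that you replace with your own values:

```
KafkaClient {
    com.sun.security.auth.module.Krb5LoginModule required
    useKeyTab=true
    keyTab="<keytab path>"
    principal="<principal name>/<host name>@<realm>";
};
```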


Template Cache Timeout (ms) - The maximum number
of milliseconds to cache an idle template. Templates
unused for more than the specified time are evicted from
the cache. For more information about templates, see
Caching NetFlow 9 Templates.
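The idle-eviction behavior described above can be sketched as a small time-to-live cache. This is illustrative only, not the stage's actual implementation; the class and method names are ours, and the clock parameter exists so the example is deterministic:

```python
import time

class TemplateCache:
    """Illustrative sketch: evict templates idle longer than timeout_ms."""

    def __init__(self, timeout_ms: int, clock=time.monotonic):
        self.timeout_ms = timeout_ms
        self.clock = clock
        self._entries = {}  # template_id -> (template, last_used_seconds)

    def put(self, template_id, template):
        self._entries[template_id] = (template, self.clock())

    def get(self, template_id):
        entry = self._entries.get(template_id)
        if entry is None:
            return None
        template, _ = entry
        # Any use of a template resets its idle timer.
        self._entries[template_id] = (template, self.clock())
        return template

    def evict_idle(self):
        # Remove every template whose idle time exceeds the timeout.
        now = self.clock()
        expired = [tid for tid, (_, last) in self._entries.items()
                   if (now - last) * 1000.0 > self.timeout_ms]
        for tid in expired:
            del self._entries[tid]
        return expired

# Usage with a fake clock so the behavior is deterministic:
t = [0.0]
cache = TemplateCache(timeout_ms=1000, clock=lambda: t[0])
cache.put(256, "field definitions...")
t[0] = 2.0  # 2000 ms later, the template has been idle too long
print(cache.evict_idle())  # -> [256]
```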