Kafka Security

Client-Broker Security with TLS

Kafka allows clients to connect over TLS. By default, TLS is disabled, but can be turned on as needed.

Step 1: Generating Keys and Certificates for Kafka Brokers

Generate the key and the certificate for each machine in the cluster using the Java keytool utility. See Generate TLS Certificates.
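
For example, a minimal keytool invocation might look like the following; the keystore file name, alias, and validity period are illustrative and should match your environment:

keytool -keystore kafka.server.keystore.jks -alias localhost -validity 365 -genkey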

Make sure that the common name (CN) matches the fully qualified domain name (FQDN) of your server. The client compares the CN with the DNS domain name to ensure that it is connecting to
the correct server.

Step 2: Creating Your Own Certificate Authority

You have generated a public-private key pair for each machine and a certificate to identify the machine. However, the certificate is unsigned, so an attacker can create a certificate and
pretend to be any machine. Sign certificates for each machine in the cluster to prevent unauthorized access.

A Certificate Authority (CA) is responsible for signing certificates. A CA is similar to a government that issues passports. A government stamps (signs) each passport so that the
passport becomes difficult to forge. Similarly, the CA signs the certificates, and the cryptography guarantees that a signed certificate is computationally difficult to forge. If the CA is a genuine
and trusted authority, the clients have high assurance that they are connecting to the authentic machines.

openssl req -new -x509 -keyout ca-key -out ca-cert -days 365

The generated CA is a public-private key pair and certificate used to sign other certificates.

Add the generated CA to the client truststores so that clients can trust this CA:
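
For example, assuming the CA certificate file generated above, the import might look like:

keytool -keystore kafka.client.truststore.jks -alias CARoot -import -file ca-cert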

Note: If you configure Kafka brokers to require client authentication by setting ssl.client.auth to requested or
required in the Kafka broker configuration, you must provide a truststore for the Kafka brokers as well. The
truststore must contain all the CA certificates by which the clients' keys are signed.

The keystore created in step 1 stores each machine's own identity. In contrast, the truststore of a client stores
all the certificates that the client should trust. Importing a certificate into a truststore means trusting all certificates that are signed by that certificate. This attribute is called the chain of
trust. It is particularly useful when deploying SSL on a large Kafka cluster: you can sign all certificates in the cluster with a single CA, and have all machines share the same truststore that
trusts the CA. That way, all machines can authenticate all other machines.

Step 3: Signing the Certificate

Now you can sign all certificates generated by step 1 with the CA generated in step 2.
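
A typical sequence exports a certificate signing request from the keystore, signs it with the CA, and imports both the CA certificate and the signed certificate back into the keystore. The following is a sketch; the file names are illustrative, and keytool prompts for the relevant passwords:

keytool -keystore kafka.server.keystore.jks -alias localhost -certreq -file cert-file
openssl x509 -req -CA ca-cert -CAkey ca-key -in cert-file -out cert-signed -days 365 -CAcreateserial
keytool -keystore kafka.server.keystore.jks -alias CARoot -import -file ca-cert
keytool -keystore kafka.server.keystore.jks -alias localhost -import -file cert-signed

With the signed certificates in place, enable an SSL listener on each broker. A minimal broker configuration sketch follows; the paths, ports, and password placeholders are assumptions for illustration:

listeners=PLAINTEXT://kafka-broker-host-name:9092,SSL://kafka-broker-host-name:9093
ssl.keystore.location=/var/private/ssl/kafka.server.keystore.jks
ssl.keystore.password=<keystore-password>
ssl.key.password=<key-password>
ssl.truststore.location=/var/private/ssl/kafka.server.truststore.jks
ssl.truststore.password=<truststore-password>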

where kafka-broker-host-name is the FQDN of the broker that you selected from the Instances page in Cloudera Manager.
The sample configuration above uses both the PLAINTEXT and SSL protocols for the SSL-enabled brokers.

Other configuration settings may also be needed, depending on your requirements:

ssl.client.auth=none: The other options for client authentication are required and requested; with requested, clients without certificates can still connect. The use
of requested is discouraged, as it provides a false sense of security and misconfigured clients can still connect.

ssl.cipher.suites: A cipher suite is a named combination of authentication, encryption, MAC, and key exchange algorithms used to negotiate the security
settings for a network connection using the TLS or SSL network protocol. This list is empty by default.

ssl.enabled.protocols=TLSv1.2,TLSv1.1,TLSv1: Provide a list of SSL protocols that your brokers accept from clients.

Note: Due to import regulations in some countries, the Oracle implementation of JCA limits the strength of cryptographic algorithms. If you need
stronger algorithms, you must obtain the JCE Unlimited Strength Jurisdiction Policy Files and install them in the JDK/JRE as described in the JCA Providers Documentation.

After SSL is configured on your broker, the logs should show an endpoint for SSL communication:
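
For example, a broker registration line similar to the following (the host and ports are illustrative) indicates that both endpoints are active:

with addresses: PLAINTEXT -> EndPoint(192.168.64.1,9092,PLAINTEXT), SSL -> EndPoint(192.168.64.1,9093,SSL)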

On the client side, other configuration settings might also be needed, depending on your requirements and the broker configuration:

ssl.provider (Optional). The name of the security provider used for SSL connections. The default is the JVM's default security provider.

ssl.cipher.suites (Optional). A cipher suite is a named combination of authentication, encryption, MAC, and key exchange algorithms used to negotiate the
security settings for a network connection using the TLS or SSL network protocol.

ssl.enabled.protocols=TLSv1.2,TLSv1.1,TLSv1. This property should list at least one of the protocols configured on the broker side.

ssl.truststore.type=JKS

ssl.keystore.type=JKS
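
Putting these together, a minimal SSL client configuration sketch might look like the following; the truststore path and password are placeholders:

security.protocol=SSL
ssl.truststore.location=/var/private/ssl/kafka.client.truststore.jks
ssl.truststore.password=<truststore-password>
ssl.truststore.type=JKS
ssl.enabled.protocols=TLSv1.2,TLSv1.1,TLSv1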

Using Kafka’s Inter-Broker Security

Kafka can expose multiple communication endpoints, each supporting a different protocol. Supporting multiple communication endpoints enables you to use different communication protocols
for client-to-broker communications and broker-to-broker communications. Set the Kafka inter-broker communication protocol using the security.inter.broker.protocol
property. Use this property primarily for the following scenarios:

Enabling SSL encryption for client-broker communication but keeping broker-broker communication as PLAINTEXT. Because
SSL has performance overhead, you might want to keep inter-broker communication as PLAINTEXT if your Kafka brokers are behind a firewall
and not susceptible to network snooping.

Migrating from a non-secure Kafka configuration to a secure Kafka configuration without requiring downtime. Use a rolling restart and keep security.inter.broker.protocol set to a protocol that is supported by all brokers until all brokers are updated to support the new protocol.

For example, if you need to enable Kerberos on a Kafka cluster without downtime, follow the rolling-restart pattern sketched below.
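
The exact steps depend on your deployment, but the general pattern is two rolling restarts. The following broker-configuration sketch shows the settings at each stage; the host and ports are illustrative:

# Round 1: add a SASL_PLAINTEXT listener while keeping inter-broker traffic on
# PLAINTEXT, then perform a rolling restart of the brokers.
listeners=PLAINTEXT://kafka-broker-host-name:9092,SASL_PLAINTEXT://kafka-broker-host-name:9090
security.inter.broker.protocol=PLAINTEXT

# Round 2: once every broker accepts SASL_PLAINTEXT, switch inter-broker traffic
# over and perform a second rolling restart.
security.inter.broker.protocol=SASL_PLAINTEXT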

These protocols can be defined for broker-to-client interaction and for broker-to-broker interaction. The security.inter.broker.protocol property allows
the broker-to-broker communication protocol to differ from the broker-to-client protocol, making rolling upgrades from non-secure to secure clusters possible. In most cases, set security.inter.broker.protocol to the protocol you are using for broker-to-client communication. Set security.inter.broker.protocol to a protocol
different from the broker-to-client protocol only when you are performing a rolling upgrade from a non-secure to a secure Kafka cluster.

Enabling Kerberos Authentication

Apache Kafka supports Kerberos authentication, but only with the new Kafka Producer and Consumer APIs.

If you already have a Kerberos server, you can add Kafka to your current configuration. If you do not have a Kerberos server, install it before proceeding. See Enabling Kerberos Authentication for CDH.

If you have already configured the mapping from Kerberos principals to short names using the hadoop.security.auth_to_local HDFS configuration property,
configure the same rules for Kafka by adding the sasl.kerberos.principal.to.local.rules property to the Kafka Broker Advanced Configuration Snippet
using Cloudera Manager. Specify the rules as a comma-separated list.
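
For example, a rule that strips the realm from principals in a hypothetical EXAMPLE.COM realm, with the DEFAULT rule as a fallback, might look like this:

sasl.kerberos.principal.to.local.rules=RULE:[1:$1@$0](.*@EXAMPLE.COM)s/@.*//,DEFAULT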

To enable Kerberos authentication for Kafka:

In Cloudera Manager, navigate to Kafka > Configuration.

Set SSL Client Authentication to none.

Set Inter Broker Protocol to SASL_PLAINTEXT.

Click Save Changes.

Restart the Kafka service (Action > Restart).

Make sure that listeners = SASL_PLAINTEXT is present in the Kafka broker logs, by default in /var/log/kafka/server.log.

Create a jaas.conf file with either cached credentials or keytabs.

To use cached Kerberos credentials, where you run kinit first, use this configuration.
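
The following is a sketch of such a jaas.conf, using the standard Krb5LoginModule with the ticket cache:

KafkaClient {
  com.sun.security.auth.module.Krb5LoginModule required
  useTicketCache=true;
};

To use a keytab instead, point the login module at a keytab file and principal; the path and principal below are placeholders:

KafkaClient {
  com.sun.security.auth.module.Krb5LoginModule required
  useKeyTab=true
  keyTab="/path/to/kafka_client.keytab"
  principal="kafka-client@EXAMPLE.COM";
};

Point Kafka command-line clients at this file with the java.security.auth.login.config system property, for example:

export KAFKA_OPTS="-Djava.security.auth.login.config=/path/to/jaas.conf"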

Topic Authorization with Kerberos and Sentry

Configuring Kafka to Use Sentry Authorization

The following steps describe how to configure Kafka to use Sentry authorization. These steps assume you have installed Kafka and Sentry on your cluster.

Sentry requires that your cluster include HDFS. After you install and start Sentry with the correct configuration, you can stop the HDFS service. For more information, see Installing and Upgrading the Sentry Service.

Note: Cloudera's distribution of Kafka can make use of LDAP-based user groups when the LDAP directory is synchronized to Linux via tools such as
SSSD. CDK does not support direct integration with LDAP, either through direct Kafka LDAP authentication or through Hadoop's group mapping (when hadoop.group.mapping is
set to LdapGroupMapping). For more information, see Configuring LDAP Group Mappings.

To configure Sentry authorization for Kafka:

In Cloudera Manager, go to Kafka > Configuration.

Select Enable Kerberos Authentication.

Select a Sentry service in the Kafka service configuration.

Add superusers.

Superusers can perform any action on any resource in the Kafka cluster. The kafka user is added as a superuser by default. Superuser requests are
authorized without going through Sentry, which provides enhanced performance.

Authorizable Resources

Authorizable resources are resources or entities in a Kafka cluster that require special permissions for a user to be able to perform actions on them. Kafka has four authorizable
resources.

Cluster: controls who can perform cluster-level operations such as creating or deleting a topic. This resource can only have one value, kafka-cluster, as one Kafka cluster cannot have more than one cluster resource.

Topic: controls who can perform topic-level operations such as producing to and consuming from topics. Its value must exactly match the topic name in the Kafka
cluster.

With CDH 5.15.0 and CDK 3.1 and later, wildcards (*) can be used to refer to any topic in the privilege.

Consumergroup: controls who can perform consumergroup-level operations such as joining or describing a consumergroup. Its value must exactly match the
group.id of a consumergroup.

With CDH 5.14.1 and later, you can use a wildcard (*) to refer to any consumer groups in the privilege. This resource is useful when used with Spark Streaming, where a generated
group.id may be needed.

Host: controls the hosts from which specific operations can be performed. Think of this as a way to achieve IP filtering in Kafka. You can set the value of this
resource to the wildcard (*), which represents all hosts.
Note: Only IP addresses should be specified in the host component of Kafka Sentry privileges; hostnames are not supported.

Authorized Actions

You can perform multiple actions on each resource. The following operations are supported by Kafka, though not all actions are valid on all resources.

ALL is a wildcard action, and represents all possible actions on a resource.

read

write

create

delete

alter

describe

clusteraction

Authorizing Privileges

Privileges define what actions are allowed on a resource. A privilege is represented as a string in Sentry. The following rules apply to a valid privilege.

Can have at most one Host resource. If you do not specify a Host resource in your privilege string, Host=* is assumed.

Must have exactly one non-Host resource.

Must have exactly one action specified at the end of the privilege string.

For example, the following are valid privilege strings:

Host=*->Topic=myTopic->action=ALL
Topic=test->action=ALL

Granting Privileges to a Role

The following examples grant privileges to the role test, so that users in testGroup can create a topic named testTopic and
produce to it.

The user executing these commands must be added to the Sentry parameter sentry.service.allow.connect and also be a member of a group defined in
sentry.service.admin.group.

Before you can assign the test role, you must first create it. To create the test role:

kafka-sentry -cr -r test

To confirm that the role was created, list the roles:

kafka-sentry -lr
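
To make the role effective for users in testGroup, assign the role to that group. This sketch assumes the kafka-sentry add-role-to-group flag (-arg):

kafka-sentry -arg -r test -g testGroup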

If Sentry privileges caching is enabled, as recommended, new privileges that you assign take some time to appear in the system. The delay is the time-to-live interval of the Sentry
privileges cache, which is set using sentry.kafka.caching.ttl.ms. By default, this interval is 30 seconds. For test clusters, it is beneficial to have changes appear
within the system as fast as possible; therefore, Cloudera recommends that you either use a lower time interval or disable caching with the sentry.kafka.caching.enable property.

Allow users in testGroup to write to testTopic from localhost, which lets those users
produce to testTopic. Producing requires both write and describe privileges; the grants are sketched below.
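
Assuming the kafka-sentry grant-privilege-to-role flag (-gpr) and localhost's IP address in the Host component, the grants might look like this:

kafka-sentry -gpr -r test -p "Host=127.0.0.1->Topic=testTopic->action=write"
kafka-sentry -gpr -r test -p "Host=127.0.0.1->Topic=testTopic->action=describe"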

Note that you have to pass a configuration file, producer.properties, with information on the JAAS configuration and other Kerberos authentication-related
information; a sketch follows. See SASL Configuration for Kafka Clients.
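
A minimal producer.properties sketch for a Kerberos-enabled cluster might contain the following; sasl.kerberos.service.name must match the Kerberos principal name of the brokers, assumed here to be kafka:

security.protocol=SASL_PLAINTEXT
sasl.kerberos.service.name=kafka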

kafka-console-producer --broker-list localhost:9092 \
--topic testTopic --producer.config producer.properties
This is a message
This is another message

Note that you have to pass a configuration file, consumer.properties, with information on the JAAS configuration and other Kerberos authentication-related
information; a sketch follows. The configuration file must also specify group.id as testconsumergroup.
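
A minimal consumer.properties sketch might contain the following; the security settings mirror the producer's, and group.id matches the consumer group used in this example:

security.protocol=SASL_PLAINTEXT
sasl.kerberos.service.name=kafka
group.id=testconsumergroup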

kafka-console-consumer --new-consumer --topic testTopic \
--from-beginning --bootstrap-server anybroker-host:9092 \
--consumer.config consumer.properties
This is a message
This is another message

Troubleshooting Kafka with Sentry

If Kafka requests are failing due to authorization, the following steps can provide insight into the error:

Make sure you have run kinit as a user who has privileges to perform an operation.

Identify which broker is hosting the leader of the partition you are trying to produce to or consume from, as this leader authorizes your request against Sentry. One easy
way to simplify debugging is to use a single Kafka broker. Change the log level of the Kafka broker by adding the following entry to the Kafka Broker Logging Advanced Configuration Snippet (Safety Valve) and
restart the broker:

log4j.logger.org.apache.sentry=DEBUG

Setting just Sentry to DEBUG mode avoids the debug output from undesired dependencies, such as Jetty.

Run the Kafka client or Kafka CLI with the required arguments and capture the Kafka broker log, which is at a path similar to:

/var/log/kafka/kafka-broker-host-name.log

Look for the following information in the filtered logs:

Groups that the Kafka client user or CLI is running as.

Required privileges for the operation.

Retrieved privileges from Sentry.

Required and retrieved privileges comparison result.

This log information can provide insight into which privilege is not assigned to a user, causing a particular operation to fail.
