This is a minor release of the Confluent Platform that provides Confluent users with Apache Kafka 0.10.1.0, the latest
stable version of Kafka. In addition, this release includes Confluent’s multi-datacenter replication and automatic data balancing tools.

Confluent Platform users are encouraged to upgrade to CP 3.1.1 because it includes both major new functionality and
important bug fixes. The technical details of this release are summarized below.

Confluent Enterprise now includes Auto Data Balancing. As clusters grow, topics and partitions grow at different rates and brokers are added and removed; over time this leads to an unbalanced workload across datacenter resources. Some brokers do very little, while others are heavily taxed with large or numerous partitions, slowing down message delivery. When executed, this feature monitors your cluster for the number of brokers, the size of partitions, the number of partitions, and the number of leaders within the cluster. It allows you to shift data to create an even workload across your cluster, while throttling rebalance traffic to minimize the impact on production workloads.

Control Center now lets you define alerts on the latency and completeness statistics of data streams; these alerts can be delivered by email or queried from a centralized alerting system. In this release, the stream monitoring features have also been extended to monitor topics across multiple Kafka clusters, and access to Control Center can be protected through integration with enterprise authentication systems.

Interactive queries let you get more from streaming than just the processing of data. This feature allows you to treat the stream processing layer as a lightweight embedded database and, more concretely, to directly query the latest state of your stream processing application, without needing to materialize that state to external databases or external storage first.

As a result, interactive queries simplify the architecture of many use cases and lead to more application-centric architectures. For example, you often no longer need to operate and interface with a separate database cluster – or a separate infrastructure team in your company that runs that cluster – to share data between a Kafka Streams application (say, an event-driven microservice) and downstream applications, regardless of whether these applications use Kafka Streams or not; they may even be applications that do not run on the JVM, e.g. implemented in Python, C/C++, or JavaScript.

Kafka Streams applications now benefit from record caches. Notably, these caches are used to compact output records (similar to Kafka’s log compaction) so that fewer updates for the same record key are being sent downstream.
These new caches are enabled by default and typically result in reduced load on your Kafka Streams application, your Kafka cluster, and/or downstream applications and systems such as external databases. However, these caches can also be disabled, if needed, to restore the CP 3.0.x behavior of your applications.
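
As a minimal sketch of how the record cache might be sized or switched off, the snippet below builds a Kafka Streams configuration with plain `java.util.Properties`. It assumes the cache is controlled by the `cache.max.bytes.buffering` setting and that a value of 0 disables it; the application id and broker address are hypothetical placeholders.

```java
import java.util.Properties;

final class RecordCacheConfigSketch {

    // Builds Kafka Streams settings with the given total record cache size in bytes.
    // Assumption: a size of 0 turns the cache off, so every update is forwarded
    // downstream as in CP 3.0.x.
    static Properties streamsProps(long cacheBytes) {
        if (cacheBytes < 0) {
            throw new IllegalArgumentException("cacheBytes must be >= 0");
        }
        Properties props = new Properties();
        props.put("application.id", "cache-demo-app");      // hypothetical name
        props.put("bootstrap.servers", "localhost:9092");   // hypothetical address
        props.put("cache.max.bytes.buffering", Long.toString(cacheBytes));
        return props;
    }

    public static void main(String[] args) {
        Properties withoutCache = streamsProps(0L);
        Properties withCache = streamsProps(10L * 1024 * 1024);
        System.out.println("disabled: " + withoutCache.getProperty("cache.max.bytes.buffering"));
        System.out.println("enabled:  " + withCache.getProperty("cache.max.bytes.buffering"));
    }
}
```

The resulting `Properties` object would be passed to the Kafka Streams application at construction time. A larger cache generally means fewer downstream updates per key, at the cost of memory and update latency.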

Confluent Platform 3.0.1 contains a number of bug fixes included in the Kafka 0.10.0.1 release. Details of the changes to Kafka in this patch release are found in the Kafka Release Notes. Details of the changes to other components of the Confluent Platform are listed in the respective changelogs such as Kafka REST Proxy changelog.

Control Center added performance improvements to reduce running overhead, including reducing the number of Kafka topics it requires and optimizing web client fetches from the server. We also added the ability to delete a running connector and support for connecting to SSL/SASL secured Kafka clusters.

A common requirement when implementing stream processing applications in practice is to have an application reprocess
its data from scratch. This may be needed for a number of reasons, including but not limited to: during development
and testing, when addressing bugs in production, when doing A/B testing of algorithms and campaigns, and when giving
demos to customers.

This is a major release of the Confluent Platform that provides Confluent users with Apache Kafka 0.10.0.0, the latest
stable version of Kafka. In addition, this release includes the new Confluent Control Center application as well as the
new Kafka Streams library that ships with Apache Kafka 0.10.0.0.

Confluent Platform users are encouraged to upgrade to CP 3.0.0 because it includes both major new functionality and
important bug fixes. The technical details of this release are summarized below.

We have added several new features to the Confluent Platform in CP 3.0, to provide a more complete, easier-to-use, and
higher-performance Stream Processing Platform:

We’re very excited to introduce Kafka Streams. Kafka Streams is included in Apache Kafka 0.10.0.0. Kafka Streams
is a library that turns Apache Kafka into a full-featured, modern stream processing system.
Kafka Streams includes a high-level language for describing common stream operations (such
as joining, filtering, and aggregating records), allowing developers to quickly develop powerful streaming applications.
Kafka Streams applications can easily be deployed on many different systems: they can run on YARN, be deployed on
Mesos, run in Docker containers, or simply be embedded into existing Java applications.

Control Center is a web-based management and monitoring tool for Apache Kafka. In version 3.0.0, Control Center allows
you to configure, edit, and manage connectors in Kafka Connect. It also includes Stream Monitoring: a system for
measuring and monitoring your data streams end to end, from producer to consumer. To get started with Control Center,
see Installation.

A term license for Confluent Control Center is available for Confluent Platform Enterprise Subscribers, but any user
may download and try Confluent Control Center for free for 30 days.

Relative Offsets in Compressed Messages. In older versions of Kafka, recompression occurred when a broker
received a batch of messages from the producer. In 0.10.0.0, we have changed from using absolute offsets to relative
offsets to avoid the recompression, reducing latency and reducing load on Kafka brokers.
KAFKA-2511
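
To illustrate why relative offsets avoid recompression, here is a small sketch of how a reader could recover absolute offsets from a compressed batch. It assumes the convention that the wrapper message carries the absolute offset of the last inner message, while the inner messages are numbered 0, 1, 2, and so on; the class and method names are hypothetical.

```java
final class RelativeOffsetSketch {

    // Recovers the absolute offset of one inner message.
    // wrapperOffset:  absolute offset stored on the wrapper (assumed to be the
    //                 offset of the LAST inner message in the batch)
    // innerCount:     number of messages inside the compressed batch
    // relativeOffset: position of the inner message, counted from 0
    static long absoluteOffset(long wrapperOffset, int innerCount, int relativeOffset) {
        if (innerCount <= 0 || relativeOffset < 0 || relativeOffset >= innerCount) {
            throw new IllegalArgumentException("relative offset out of range");
        }
        long firstOffset = wrapperOffset - (innerCount - 1);
        return firstOffset + relativeOffset;
    }

    public static void main(String[] args) {
        // A batch of 5 messages whose wrapper was assigned offset 104.
        for (int i = 0; i < 5; i++) {
            System.out.println("relative " + i + " -> absolute " + absoluteOffset(104L, 5, i));
        }
    }
}
```

Under this assumption the broker only has to stamp one offset on the wrapper. The compressed payload, which holds the relative offsets, can be stored exactly as the producer sent it.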

Rack Awareness. Kafka can now run with a rack awareness feature that isolates replicas so they are guaranteed to span
multiple racks or availability zones. This allows all of Kafka’s durability guarantees to be applied to these larger
architectural units, significantly increasing availability.
KAFKA-1215

Timestamps in Messages. Messages are now tagged with timestamps at the time they are produced, enabling a number
of future features, including looking up messages by time and measuring timing.
KAFKA-2511

Kafka Consumer Max Records. In 0.9.0.0, developers had little control over the number of messages returned when calling
poll() for the new consumer. This feature introduces a new parameter max.poll.records that allows developers to
limit the number of messages returned. KAFKA-3007
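
A minimal sketch of a consumer configuration that sets max.poll.records is shown below. The broker address, group id, and the choice of string deserializers are illustrative assumptions, not recommendations.

```java
import java.util.Properties;

final class MaxPollRecordsSketch {

    // Builds new-consumer settings that cap the number of records a single
    // poll() call may return.
    static Properties consumerProps(int maxPollRecords) {
        if (maxPollRecords < 1) {
            throw new IllegalArgumentException("maxPollRecords must be >= 1");
        }
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");   // hypothetical address
        props.put("group.id", "demo-group");                // hypothetical group
        props.put("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("max.poll.records", Integer.toString(maxPollRecords));
        return props;
    }

    public static void main(String[] args) {
        Properties props = consumerProps(100);
        System.out.println("max.poll.records = " + props.getProperty("max.poll.records"));
    }
}
```

These properties would be handed to the new consumer's constructor. A smaller cap can help when each record is expensive to process and the application needs to call poll() again promptly.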

Client-Side Interceptors. We have introduced a new plugin architecture that allows developers to easily add “plugins” to
Kafka clients. This allows developers to easily deploy additional code to inspect or modify Kafka messages.
KAFKA-3162

Standardize Client Sequences. This feature changed the arguments to some methods in the new consumer to work
more consistently with Java Collections. KAFKA-3006

List Connectors REST API. You can now query a distributed Kafka Connect cluster to discover the available connector
classes. KAFKA-3316

Admin API changes. Some changes were made in the metadata request/response, improving performance in some situations.
KAFKA-1694

Protocol Version Improvements. Kafka brokers now support a request that returns all supported protocol API versions.
(This will make it easier for future Kafka clients to support multiple broker versions with a single client.)
KAFKA-3307

SASL Improvements. Kafka 0.9.0.0 introduced new security features to Kafka, including support for Kerberos through
SASL. In 0.10.0.0, Kafka now includes support for more SASL features, including external authentication servers,
supporting multiple types of SASL authentication on one server, and other improvements.
KAFKA-3149

Connect Status/Control APIs. In Kafka 0.10.0.0, we have continued to improve Kafka Connect. Previously, users had to
monitor logs to view the status of connectors and their tasks, but we now support a status API for easier monitoring.
We’ve also added control APIs, which allow you to pause a connector’s message processing in order to perform maintenance,
and to manually restart tasks which have failed.
KAFKA-3093,
KAFKA-2370,
KAFKA-3506
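
The status and control operations described above are exposed through the Kafka Connect REST interface. The sketch below only constructs the requests (it does not send them) and requires Java 11 or later for `java.net.http`. The endpoint paths reflect my understanding of the Connect REST API, the base URL and connector name are hypothetical, and connector names are assumed not to need URL encoding.

```java
import java.net.URI;
import java.net.http.HttpRequest;

final class ConnectControlSketch {

    private final String baseUrl;

    ConnectControlSketch(String baseUrl) {
        this.baseUrl = baseUrl;
    }

    // GET the current state of a connector and its tasks.
    HttpRequest status(String connector) {
        return HttpRequest.newBuilder(uri("/connectors/" + connector + "/status"))
                .GET().build();
    }

    // PUT with an empty body to pause message processing, e.g. for maintenance.
    HttpRequest pause(String connector) {
        return HttpRequest.newBuilder(uri("/connectors/" + connector + "/pause"))
                .PUT(HttpRequest.BodyPublishers.noBody()).build();
    }

    // PUT with an empty body to resume a paused connector.
    HttpRequest resume(String connector) {
        return HttpRequest.newBuilder(uri("/connectors/" + connector + "/resume"))
                .PUT(HttpRequest.BodyPublishers.noBody()).build();
    }

    // POST with an empty body to restart one failed task.
    HttpRequest restartTask(String connector, int taskId) {
        return HttpRequest.newBuilder(
                        uri("/connectors/" + connector + "/tasks/" + taskId + "/restart"))
                .POST(HttpRequest.BodyPublishers.noBody()).build();
    }

    private URI uri(String path) {
        return URI.create(baseUrl + path);
    }

    public static void main(String[] args) {
        ConnectControlSketch connect = new ConnectControlSketch("http://localhost:8083");
        HttpRequest request = connect.restartTask("my-sink", 0);
        System.out.println(request.method() + " " + request.uri());
    }
}
```

Each request can be sent with `java.net.http.HttpClient` against a running Connect worker.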

Allow cross origin HTTP requests on all HTTP methods. In Kafka 0.9.0.0, Kafka Connect only supported requests from
the same domain; this enhancement removes that restriction.
KAFKA-3578

Kafka LZ4 framing. Kafka’s implementation of LZ4 did not follow the standard LZ4 specification, creating
problems for third party clients that wanted to leverage existing libraries. Kafka now conforms to the standard.
KAFKA-3160

Note

Upgrading a Kafka Connect cluster running in distributed mode from 0.9 versions of Kafka to 0.10 versions requires making a
configuration change before the upgrade. See the Kafka Connect Upgrade Notes for more details.

Camus in Confluent Platform is deprecated in Confluent Platform 3.0 and may be removed in a release after Confluent
Platform 3.1. To export data from Kafka to HDFS and Hive, we recommend
Kafka Connect with the Confluent HDFS connector as an alternative.

We have also added the following features to the Confluent Platform in CP 3.0:

Preview release of Python Client. We’re introducing a fully supported, up to date client for
Python. Over time, we will keep this client up to date with the latest Java clients, including support for new broker
versions and Kafka features. Try it out and send us feedback, through the
Confluent Platform Mailing List.

Security for Schema Registry. The Schema Registry now supports SSL both at its REST layer
(via HTTPS) and in its communication with Kafka. The REST layer is the public, user-facing component, and the “communication
with Kafka” is the backend communication with Kafka where schemas are stored.

Security for Kafka REST Proxy. The REST Proxy now supports REST calls over HTTPS. The REST Proxy
does not currently support Kafka security.

We’ve removed the “beta” designation from the new Java consumer and encourage users to begin migrating away from the
old consumers (note that the new consumer is required in order to use Kafka security extensions).

The old Scala producer has been deprecated. Users should migrate to the Java producer as soon as possible.

Confluent Platform 2.0.1 contains a number of bug fixes included in the Kafka 0.9.0.1 release. Details of the changes to Kafka in this patch release are found in the Kafka Release Notes. Details of the changes to other components of the Confluent Platform are listed in the respective changelogs such as Kafka REST Proxy changelog.

Here is a quick overview of the notable Kafka-related bug fixes in the release, grouped by the affected functionality:

KAFKA-2978: Topic partition is not sometimes consumed after rebalancing of consumer group

The CP 2.0.0 release includes a range of new features over the previous release CP 1.0.x.

This release includes three key security features built directly into Kafka itself. First, we now authenticate users using either Kerberos or TLS client certificates, so we know who is making each request to Kafka. Second, we have added a Unix-like permissions system (ACLs) to control which users can access which data. Third, we support encryption on the wire using TLS to protect sensitive data on an untrusted network.

For more information on security features and how to enable them, see Kafka Security.

Kafka Connect facilitates large-scale, real-time data import and export for Kafka. It abstracts away common problems that each such data integration tool needs to solve to be viable for 24x7 production environments: fault tolerance, partitioning, offset management and delivery semantics, operations, and monitoring. It offers the capability to run a pool of processes that host a large number of Kafka connectors while handling load balancing and fault tolerance.

Confluent Platform includes a file connector for importing data from text files or exporting to text files, a JDBC connector for importing data from relational databases, and an HDFS connector for exporting data to HDFS / Hive in Avro and Parquet formats.

To learn more about Kafka Connect and the available connectors, see Kafka Connect.

Confluent Platform 2.0 and Kafka 0.9 now support user-defined quotas. Users have the ability to enforce quotas on a per-client basis. Producer-side quotas are defined in terms of bytes written per second per client id while consumer quotas are defined in terms of bytes read per second per client id.
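
The text above does not say how a quota is enforced. As an illustration only, the sketch below assumes the broker slows down a client that exceeds its byte-rate quota by delaying its responses. It computes the delay that would bring an observed rate O back down to a target rate T over a measurement window W, namely X = (O - T) / T * W. All names and figures are hypothetical.

```java
final class QuotaThrottleSketch {

    // Returns the delay in milliseconds needed to bring the observed byte rate
    // back to the quota, assuming no further traffic during the delay.
    // Derivation: O * W / (W + X) = T  =>  X = (O - T) / T * W
    static long throttleMs(double observedBytesPerSec, double quotaBytesPerSec, long windowMs) {
        if (quotaBytesPerSec <= 0 || windowMs <= 0) {
            throw new IllegalArgumentException("quota and window must be positive");
        }
        if (observedBytesPerSec <= quotaBytesPerSec) {
            return 0L; // within quota, no delay
        }
        double excessRatio = (observedBytesPerSec - quotaBytesPerSec) / quotaBytesPerSec;
        return Math.round(excessRatio * windowMs);
    }

    public static void main(String[] args) {
        // A producer writing 15 MB/s against a 10 MB/s quota, measured over 1 second.
        long delay = throttleMs(15_000_000.0, 10_000_000.0, 1_000L);
        System.out.println("delay in ms: " + delay);
    }
}
```

Because quotas are defined per client id, a calculation of this kind would apply independently to each producer or consumer client id.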

This release introduces beta support for the newly redesigned consumer client. At a high level, the primary difference in the new consumer is that it removes the distinction between the “high-level” ZooKeeper-based consumer and the “low-level” SimpleConsumer APIs, and instead offers a unified consumer API.

The new consumer allows the use of the group management facility (like the older high-level consumer) while still offering better control over offset commits at the partition level (like the older low-level consumer). It offers pluggable partition assignment amongst the members of a consumer group and ships with several assignment strategies. This completes a series of projects done in the last few years to fully decouple Kafka clients from ZooKeeper, thus entirely removing the consumer client’s dependency on ZooKeeper.

In this release of the Confluent Platform we are packaging librdkafka. librdkafka is a C/C++ library implementation of the Apache Kafka protocol, containing both Producer and Consumer support. It was designed with message delivery reliability and high performance in mind; current figures exceed 800,000 msgs/second for the producer and 3 million msgs/second for the consumer.

You can learn how to use librdkafka side-by-side with the Java clients in our Kafka Clients documentation.

Proactive Support is a component of the Confluent Platform that improves Confluent’s support for the platform by collecting and reporting support metrics (“Metrics”) to Confluent. Proactive Support is enabled by default in the Confluent Platform. We do this to provide proactive support to our customers, to help us build better products, to help customers comply with support contracts, and to help guide our marketing efforts. With Metrics enabled, a Kafka broker is configured to collect and report certain broker and cluster metadata (“Metadata”) every 24 hours about your use of the Confluent Platform (including without limitation, your remote internet protocol address) to Confluent, Inc. (“Confluent”) or its parent, subsidiaries, affiliates or service providers.

Proactive Support is enabled by default in the Confluent Platform, but you can disable it by following the instructions in Proactive Support documentation. Please refer to the Confluent Privacy Policy for an in-depth description of how Confluent processes such information.

Kafka 0.9 no longer supports Java 6 or Scala 2.9. If you are still on Java 6, consider upgrading to a supported version.

Configuration parameter replica.lag.max.messages was removed. Partition leaders will no longer consider the number of lagging messages when deciding which replicas are in sync.

Configuration parameter replica.lag.time.max.ms now refers not just to the time passed since the last fetch request from a replica, but also to the time since the replica last caught up. Replicas that are still fetching messages from leaders but have not caught up to the latest messages within replica.lag.time.max.ms will be considered out of sync.
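
The new rule can be sketched as a single time comparison. This is a simplified model, not the broker's actual code: it assumes the leader tracks, for each follower, the last time the follower was fully caught up, and compares the elapsed time with replica.lag.time.max.ms. The 10-second limit in the example is only an illustrative value.

```java
final class ReplicaLagSketch {

    // Returns true if the follower still counts as in sync.
    // lastCaughtUpMs is the last time the follower had fetched up to the
    // leader's latest message. A follower that keeps fetching but never
    // catches up does not advance this timestamp.
    static boolean isInSync(long nowMs, long lastCaughtUpMs, long replicaLagTimeMaxMs) {
        return nowMs - lastCaughtUpMs <= replicaLagTimeMaxMs;
    }

    public static void main(String[] args) {
        long now = 100_000L;
        long maxLagMs = 10_000L; // illustrative replica.lag.time.max.ms

        // Caught up 2 seconds ago: still in sync.
        System.out.println(isInSync(now, now - 2_000L, maxLagMs));

        // Still fetching, but last caught up 15 seconds ago: out of sync.
        System.out.println(isInSync(now, now - 15_000L, maxLagMs));
    }
}
```

Note that message counts play no part in the decision, which matches the removal of replica.lag.max.messages.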

MirrorMaker no longer supports multiple target clusters. As a result, it will only accept a single --consumer.config parameter. To mirror multiple source clusters, you will need at least one MirrorMaker instance per source cluster, each with its own consumer configuration.