MapR Database is an enterprise-grade, high-performance NoSQL (“Not Only SQL”) database management system. You can use it to add real-time,
operational analytics capabilities to big data applications. As a multi-model NoSQL database, it supports both JSON document
models and wide-column data models.

MapR Database can be used as both a document database and a wide-column database. As a document database, it stores JSON documents
in MapR Database JSON tables. As a wide-column database, it stores binary files in MapR Database binary tables.

Data in one table can be replicated to another table that is in the same cluster or in a separate cluster. This type of
replication is in addition to the automatic replication that occurs with table regions within a volume.

Change Data Capture

The Change Data Capture (CDC) system allows you to capture changes made to data
records in MapR Database tables (JSON or binary) and propagate them to
a MapR Event Store For Apache Kafka topic.

These data changes are the result of inserts, updates, and
deletions and are called change data records. Once the change data records are propagated to a
topic, you can use a MapR Event Store For Apache Kafka or Kafka consumer application to
read and process them.
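
As a rough illustration of what processing change data records can look like, the following sketch replays a sequence of inserts, updates, and deletions into an in-memory copy of a table. It is a toy model only: the `ChangeRecord` class, its `op`, `key`, and `value` fields, and the `apply_changes` function are hypothetical names invented for this example and are not the MapR or Kafka consumer API.

```python
from dataclasses import dataclass
from typing import Any, Dict, Iterable, Optional


@dataclass(frozen=True)
class ChangeRecord:
    """Hypothetical stand-in for a change data record (not the MapR API)."""
    op: str                                  # "insert", "update", or "delete"
    key: str                                 # row key / document id
    value: Optional[Dict[str, Any]] = None   # changed fields, if any


def apply_changes(
    records: Iterable[ChangeRecord],
    state: Optional[Dict[str, Dict[str, Any]]] = None,
) -> Dict[str, Dict[str, Any]]:
    """Replay change records, in order, into an in-memory copy of the table."""
    state = {} if state is None else state
    for rec in records:
        if rec.op == "insert":
            # An insert replaces whatever was stored under the key.
            state[rec.key] = dict(rec.value or {})
        elif rec.op == "update":
            # An update merges the changed fields into the existing row.
            state.setdefault(rec.key, {}).update(rec.value or {})
        elif rec.op == "delete":
            # A delete removes the row; deleting a missing key is a no-op.
            state.pop(rec.key, None)
        else:
            raise ValueError(f"unknown operation: {rec.op!r}")
    return state
```

In a real consumer application, the records would be read from the topic rather than from a Python list, and their structure would be whatever the consumer API provides.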

Note: Change data records appear in a topic-partition in the same order as the
changes were made to the table.
The order is retained because change data records for the same key are propagated
to the same topic-partition.
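
The ordering property in the note can be pictured with a small simulation: if every record for a given key is routed to the same partition, and each partition is an append-only list, then the changes for that key are read back in the order they were made. The sketch below assumes a simple CRC32-modulo routing rule chosen only for illustration; it is not the partitioner that MapR Event Store actually uses, and `partition_for` and `route` are hypothetical names.

```python
import zlib
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def partition_for(key: str, num_partitions: int) -> int:
    """Map a key to a partition deterministically (illustrative rule only)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def route(
    changes: Iterable[Tuple[str, str]],
    num_partitions: int,
) -> Dict[int, List[Tuple[int, str, str]]]:
    """Append each (key, change) pair to the partition chosen by its key.

    Each stored entry is (sequence number, key, change), where the sequence
    number is the position of the change in the original stream.
    """
    partitions: Dict[int, List[Tuple[int, str, str]]] = defaultdict(list)
    for seq, (key, change) in enumerate(changes):
        partitions[partition_for(key, num_partitions)].append((seq, key, change))
    return partitions
```

Because the routing depends only on the key, all changes for one key land in a single partition, so reading that partition front to back yields them in their original order. Records for different keys may sit in different partitions, and this sketch says nothing about their relative order.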

Why Use Change Data Capture?

CDC can be used in many ways, including the following:

To track changes occurring in a MapR Database table and
perform real-time processing on the data.

Getting Started with CDC
This topic describes an end-to-end flow of how to establish and use Change Data Capture (CDC). It assumes that a new table and dataset will be created, although an existing table with data can also be used.

Data Modeling and CDC
Change Data Capture (CDC) change data records propagate in one direction: from a source table to a topic in a changelog stream. You can create one stream with one topic for the change data records, or multiple streams with multiple topics.

Security and CDC
Security for CDC is applied through Access Control Expressions (ACEs). In addition, if a secure cluster configuration is implemented, then additional setup may be needed depending on the configuration.