Tracking Amazon Kinesis Streams Application State

For each Amazon Kinesis Streams application, the KCL uses a unique Amazon DynamoDB
table to keep track of the
application's state. Because the KCL uses the name of the Amazon Kinesis Streams application
to create the name
of the table, each application name must be unique.

If the Amazon DynamoDB table for your Amazon Kinesis Streams application does not
exist when the application starts up, one
of the workers creates the table and calls the describeStream method to populate
the table. For more information, see Application State Data.

Important

Your account is charged for the costs associated with the DynamoDB table, in addition
to
the costs associated with Kinesis Streams itself.

Throughput

If your Amazon Kinesis Streams application receives provisioned-throughput exceptions,
you should increase the
provisioned throughput for the DynamoDB table. The KCL creates the table with a
provisioned throughput of 10 reads per second and 10 writes per second, but this might
not
be sufficient for your application. For example, if your Amazon Kinesis Streams application
does frequent checkpointing
or operates on a stream that is composed of many shards, you might need more
throughput.

Application State Data

Each row in the DynamoDB table represents a shard that is being processed by your
application. The hash key for the table is the shard ID.

In addition to the shard ID, each row also includes the following data:

checkpoint: The most recent checkpoint sequence
number for the shard. This value is unique across all shards in the stream.

checkpointSubSequenceNumber: When using the Kinesis
Producer Library's aggregation feature, this is an extension to checkpoint that tracks individual user records within the Kinesis
record.

leaseCounter: Used for lease versioning so that
workers can detect that their lease has been taken by another worker.

leaseKey: A unique identifier for a lease. Each
lease is particular to a shard in the stream and is held by one worker at a time.

leaseOwner: The worker that is holding this
lease.

ownerSwitchesSinceCheckpoint: How many times this
lease has changed workers since the last time a checkpoint was written.

parentShardId: Used to ensure that the parent shard
is fully processed before processing starts on the child shards. This ensures that
records are processed in the same order they were put into the stream.