This Amazon Kinesis Data Analytics example demonstrates how to use the TOP_K_ITEMS_TUMBLING
function to retrieve the most frequently occurring values in a tumbling window. For more
information, see TOP_K_ITEMS_TUMBLING function in the Amazon Kinesis Data Analytics SQL Reference.

The TOP_K_ITEMS_TUMBLING function is useful when you aggregate over tens
or hundreds of thousands of keys and want to reduce your resource usage. The function
produces the same result as aggregating with GROUP BY and ORDER BY clauses.
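
To make the equivalence with GROUP BY and ORDER BY concrete, the following plain-Python sketch computes the k most frequent values per tumbling window. It only illustrates the semantics and is not how the service implements the function. The helper name `top_k_per_window`, the `(timestamp, value)` event shape, and the sample ticker symbols are assumptions made for this illustration.

```python
from collections import Counter, defaultdict


def top_k_per_window(events, k, window_seconds):
    """Return {window_start: [(value, count), ...]} with the k most frequent values.

    events is an iterable of (timestamp_in_seconds, value) pairs (hypothetical shape).
    """
    counts = defaultdict(Counter)
    for ts, value in events:
        # Tumbling windows are fixed-size and non-overlapping, so each event
        # falls into exactly one window, identified here by its start time.
        window_start = int(ts // window_seconds) * window_seconds
        counts[window_start][value] += 1
    # Per window: GROUP BY value, ORDER BY count DESC, keep the first k rows.
    return {start: c.most_common(k) for start, c in sorted(counts.items())}


# Hypothetical sample events: three in the first minute mention AMZN.
events = [
    (0, "AMZN"), (5, "AMZN"), (10, "MSFT"), (59, "AMZN"),
    (61, "INTC"), (70, "INTC"), (80, "TBV"),
]
result = top_k_per_window(events, k=2, window_seconds=60)
print(result)
```

Each key of the returned dictionary is a window start time, so the first 60-second window and the second one are ranked independently of each other.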

In this example, you write the following records to an Amazon Kinesis data stream:

You then create a Kinesis Data Analytics application in the AWS Management Console,
with the Kinesis data stream as the streaming source. The discovery process reads sample
records on the streaming source and infers an in-application schema with one column
(TICKER) as shown following.

You use the application code with the TOP_K_ITEMS_TUMBLING function to
create a windowed aggregation of the data. Then you insert the resulting data into
another in-application stream, as shown in the following screenshot:

In the following procedure, you create a Kinesis Data Analytics application that retrieves
the most frequently occurring values in the input stream.

Choose Create Kinesis stream, and then create a stream with one shard. For more
information, see Create a Stream in the Amazon Kinesis Data Streams Developer Guide.
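
If you prefer to script this step instead of using the console, the following is a minimal sketch that assumes the boto3 library. The helper name `create_single_shard_stream` and the stream name `ExampleInputStream` are hypothetical. The client is passed in as a parameter so that the function can be exercised without AWS credentials.

```python
def create_single_shard_stream(kinesis_client, stream_name):
    """Create a stream with one shard, then wait until the stream exists.

    kinesis_client is expected to behave like boto3.client("kinesis").
    """
    kinesis_client.create_stream(StreamName=stream_name, ShardCount=1)
    # The "stream_exists" waiter polls until the new stream can be described.
    kinesis_client.get_waiter("stream_exists").wait(StreamName=stream_name)


# Example usage (requires boto3 and AWS credentials; the stream name is hypothetical):
#   import boto3
#   create_single_shard_stream(boto3.client("kinesis"), "ExampleInputStream")
```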

To write records to a Kinesis data stream in a production environment, we recommend
using either the Kinesis Client Library or Kinesis Data Streams API. For simplicity, this
example uses the following Python script to generate records. Run the code to populate the
sample ticker records. This simple code continuously writes a random ticker record to the
stream. Leave the script running so that you can generate the application schema in a later
step.
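
A minimal sketch of such a producer, assuming the boto3 library, might look like the following. The ticker symbols, the partition key, the stream name, and the function names `make_record` and `write_records` are assumptions made for this illustration. The only detail taken from this example is that each record carries a single TICKER field.

```python
import json
import random

# Hypothetical ticker symbols; any short strings work for this example.
TICKERS = ["AAPL", "AMZN", "MSFT", "INTC", "TBV"]


def make_record(rng=random):
    """Build one record with a single TICKER field chosen at random."""
    return {"TICKER": rng.choice(TICKERS)}


def write_records(kinesis_client, stream_name, count=None):
    """Write random ticker records to the stream and return how many were sent.

    kinesis_client is expected to behave like boto3.client("kinesis").
    With count=None the loop runs until the process is interrupted.
    """
    sent = 0
    while count is None or sent < count:
        kinesis_client.put_record(
            StreamName=stream_name,
            Data=json.dumps(make_record()).encode("utf-8"),
            PartitionKey="partitionkey",  # assumption: one fixed key is enough for one shard
        )
        sent += 1
    return sent


# Example usage (requires boto3 and AWS credentials; the stream name is hypothetical):
#   import boto3
#   write_records(boto3.client("kinesis"), "ExampleInputStream")
```

Passing `count=None` matches the continuous behavior described above. A finite `count` is convenient for a quick trial run.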

On the application details page, choose Connect streaming data to connect to the source.

On the Connect to source page, do the following:

Choose the stream that you created in the preceding section.

Choose Discover Schema. Wait for the console to show the inferred schema and the sample
records that were used to infer the schema for the in-application stream. The inferred
schema has one column.
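
The same discovery can be sketched outside the console with the `discover_input_schema` operation of the boto3 `kinesisanalytics` client. This is an illustration under assumptions rather than part of the procedure: the helper name `discover_columns` is hypothetical, and you must supply the ARN of your own stream and of an IAM role that can read it.

```python
def discover_columns(analytics_client, stream_arn, role_arn):
    """Return the inferred columns as (name, SQL type) pairs.

    analytics_client is expected to behave like boto3.client("kinesisanalytics").
    """
    response = analytics_client.discover_input_schema(
        ResourceARN=stream_arn,
        RoleARN=role_arn,
        # Sample records that arrive from now on, so the producer must be running.
        InputStartingPositionConfiguration={"InputStartingPosition": "NOW"},
    )
    columns = response["InputSchema"]["RecordColumns"]
    return [(column["Name"], column["SqlType"]) for column in columns]


# Example usage (requires boto3 and AWS credentials; supply your own ARNs):
#   import boto3
#   print(discover_columns(boto3.client("kinesisanalytics"), stream_arn, role_arn))
```

For this example, the returned list should contain a single entry for the TICKER column.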