Data Transformation Flow

When you enable Kinesis Data Firehose data transformation, Kinesis Data Firehose buffers
incoming data up to 3 MB by default. (To adjust the buffering size, use the
ProcessingConfiguration API with the ProcessorParameter called BufferSizeInMBs.)
Kinesis Data Firehose then invokes the specified Lambda function with each buffered
batch, using the AWS Lambda synchronous invocation mode. Lambda returns the transformed
data to Kinesis Data Firehose, which then sends it to the destination when the specified
destination buffering size or buffering interval is reached, whichever happens first.
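
As a rough sketch, the buffering size override could be expressed as a
ProcessingConfiguration structure like the one below. The helper name
build_processing_configuration and the placeholder ARN are assumptions made for
illustration; the returned dictionary is the shape that would sit inside a destination
configuration passed to CreateDeliveryStream or UpdateDestination.

```python
def build_processing_configuration(lambda_arn, buffer_size_mb=3):
    """Return a ProcessingConfiguration dict with one Lambda processor.

    buffer_size_mb is written to the BufferSizeInMBs processor parameter.
    The function name and arguments are hypothetical helpers, not part of any SDK.
    """
    if buffer_size_mb <= 0:
        raise ValueError("buffer_size_mb must be positive")
    return {
        "Enabled": True,
        "Processors": [
            {
                "Type": "Lambda",
                "Parameters": [
                    {"ParameterName": "LambdaArn", "ParameterValue": lambda_arn},
                    # Parameter values are passed as strings.
                    {"ParameterName": "BufferSizeInMBs", "ParameterValue": str(buffer_size_mb)},
                ],
            }
        ],
    }


# Placeholder ARN used only for this example.
config = build_processing_configuration(
    "arn:aws:lambda:us-east-1:123456789012:function:example-transform",
    buffer_size_mb=1,
)
```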

Important

The Lambda synchronous invocation mode has a payload size limit of 6 MB for both
the request and the response. Make sure that your buffering size for sending the
request to the function is less than or equal to 6 MB. Also ensure that the response
that your function returns doesn't exceed 6 MB.
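
One way to guard against the response limit is to measure the serialized response before
returning it. The sketch below is illustrative only: the helper names are hypothetical,
and it assumes that 6 MB means 6 * 1024 * 1024 bytes and that the limit applies to the
JSON-serialized response.

```python
import json

# Assumption: 1 MB is counted as 1,048,576 bytes.
MAX_PAYLOAD_BYTES = 6 * 1024 * 1024


def response_size_bytes(response):
    """Return the size of the response once serialized as UTF-8 JSON."""
    return len(json.dumps(response).encode("utf-8"))


def fits_lambda_limit(response):
    """Return True if the serialized response is within the 6 MB limit."""
    return response_size_bytes(response) <= MAX_PAYLOAD_BYTES
```

A function could call fits_lambda_limit just before returning and log or fail loudly when
the check does not pass, rather than letting the invocation fail on the Lambda side.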

Data Transformation and Status Model

All transformed records from Lambda must contain the following parameters, or
Kinesis Data Firehose rejects them and treats that as a data transformation failure.

recordId

The record ID is passed from Kinesis Data Firehose to Lambda during the invocation.
The transformed record must contain the same record ID. Any mismatch
between the ID of the original record and the ID of the transformed
record is treated as a data transformation failure.

result

The status of the data transformation of the record. The possible values
are: Ok (the record was transformed successfully),
Dropped (the record was dropped intentionally by your
processing logic), and ProcessingFailed (the record could not
be transformed). If a record has a status of Ok or
Dropped, Kinesis Data Firehose considers it successfully processed.
Otherwise, Kinesis Data Firehose considers it unsuccessfully processed.

data

The transformed data payload, after base64-encoding.
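
The three parameters above can be illustrated with a minimal transformation function.
This is a sketch, not a reference implementation: it assumes that each incoming record
carries a recordId and a base64-encoded data field under a top-level records list, that
the payload is JSON, and that the skip and processed fields are hypothetical names
invented for this example.

```python
import base64
import json


def lambda_handler(event, context):
    """Return one output record per input record, echoing each recordId."""
    output = []
    for record in event["records"]:
        record_id = record["recordId"]
        try:
            payload = json.loads(base64.b64decode(record["data"]))
            if not isinstance(payload, dict):
                raise ValueError("expected a JSON object")
        except (KeyError, ValueError):
            # The record could not be transformed; return the original data unchanged.
            output.append({
                "recordId": record_id,
                "result": "ProcessingFailed",
                "data": record.get("data", ""),
            })
            continue

        if payload.get("skip"):  # hypothetical flag: drop the record on purpose
            output.append({"recordId": record_id, "result": "Dropped", "data": record["data"]})
            continue

        payload["processed"] = True  # hypothetical transformation
        encoded = base64.b64encode(json.dumps(payload).encode("utf-8")).decode("utf-8")
        output.append({"recordId": record_id, "result": "Ok", "data": encoded})

    return {"records": output}
```

Every input record produces exactly one output record with the same recordId, so a
failure on one record does not cause an ID mismatch for the rest of the batch.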

Lambda Blueprints

There are blueprints that you can use to create a Lambda function for data
transformation. Some of these blueprints are in the AWS Lambda console and some are
in the AWS Serverless Application Repository.

Data Transformation Failure Handling

If your Lambda function invocation fails because of a network timeout or because you've
reached the Lambda invocation limit, Kinesis Data Firehose retries the invocation
three times by default. If the invocation does not succeed, Kinesis Data Firehose then
skips that batch of records. The skipped records are treated as unsuccessfully
processed records. You can specify or override the retry options using the
CreateDeliveryStream or UpdateDestination API. For this type of failure, you can log
invocation errors to Amazon CloudWatch Logs. For more information, see Monitoring
Kinesis Data Firehose Using CloudWatch Logs.

If the status of the data transformation of a record is ProcessingFailed,
Kinesis Data Firehose treats the record as unsuccessfully processed. For this type
of failure, you can emit error logs to Amazon CloudWatch Logs from your Lambda
function. For more information, see Accessing Amazon CloudWatch Logs for AWS Lambda
in the AWS Lambda Developer Guide.

If data transformation fails, the unsuccessfully processed records are delivered to
your S3 bucket in the processing-failed folder. Each record includes the
following fields:

attemptEndingTimestamp

The time that Kinesis Data Firehose stopped attempting Lambda invocations.

rawData

The base64-encoded record data.

lambdaArn

The Amazon Resource Name (ARN) of the Lambda function.
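
To inspect a failed record, the rawData field has to be base64-decoded. The sketch below
assumes that each record in the processing-failed folder is a single JSON object on one
line, which is an assumption about the file layout rather than something stated here;
the helper name decode_failed_record is hypothetical.

```python
import base64
import json


def decode_failed_record(line):
    """Parse one failed record and return its Lambda ARN and decoded raw bytes."""
    record = json.loads(line)
    return {
        "lambdaArn": record.get("lambdaArn"),
        "rawData": base64.b64decode(record["rawData"]),
    }
```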

Duration of a Lambda Invocation

Kinesis Data Firehose supports a Lambda invocation time of up to 5 minutes. If your
Lambda function takes more than 5 minutes to complete, you get the following error:
"Firehose encountered timeout errors when calling AWS Lambda. The maximum supported
function timeout is 5 minutes."

Source Record Backup

Kinesis Data Firehose can back up all untransformed records to your S3 bucket
concurrently while delivering transformed records to the destination. You can enable
source record backup when you create or update your delivery stream. You cannot
disable source record backup after you enable it.
