Retry Failed Activities

Activities sometimes fail for ephemeral reasons, such as a temporary loss of connectivity.
At another time,
the activity might succeed, so the appropriate way to handle activity failure is often
to retry the activity,
perhaps multiple times.

There are a variety of strategies for retrying activities; the best one depends on
the details of your
workflow. The strategies fall into three basic categories:

The retry-until-success strategy simply keeps retrying the activity until it completes.

The exponential retry strategy increases the time interval between retry attempts
exponentially until the
activity completes or the process reaches a specified stopping point, such as a maximum
number of
attempts.

The custom retry strategy decides whether or how to retry the activity after each
failed attempt.

The following sections describe how to implement these strategies. The example workflow
workers all use a
single activity, unreliableActivity, which randomly does one of the following:

Completes immediately

Fails intentionally by exceeding the timeout value

Fails intentionally by throwing IllegalStateException

Retry-Until-Success Strategy

The simplest retry strategy is to keep retrying the activity each time it fails until
it eventually
succeeds. The basic pattern is:

1. Implement a nested TryCatch or TryCatchFinally class in
your workflow's entry point method.

2. Execute the activity in doTry.

3. If the activity fails, the framework calls doCatch, which runs the
entry point method again.

4. Repeat Steps 2 and 3 until the activity completes successfully.
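
As a rough illustration, the pattern above can be modeled in plain Java without any framework classes. Everything in this sketch is hypothetical: RetryUntilSuccessSketch, FlakyActivity, and runTillSuccess are invented names, an ordinary try/catch stands in for TryCatch, and the calls are synchronous rather than asynchronous.

```java
import java.util.concurrent.Callable;

/** Plain-Java model of the retry-until-success pattern (no framework classes). */
class RetryUntilSuccessSketch {

    /** Hypothetical activity that fails a fixed number of times, then completes. */
    static class FlakyActivity implements Callable<String> {
        private int failuresLeft;
        int calls = 0;

        FlakyActivity(int failuresBeforeSuccess) {
            this.failuresLeft = failuresBeforeSuccess;
        }

        @Override
        public String call() {
            calls++;
            if (failuresLeft > 0) {
                failuresLeft--;
                throw new IllegalStateException("intentional failure");
            }
            return "done";
        }
    }

    /** Plays the role of the entry point method: try once, run again on failure. */
    static <T> T runTillSuccess(Callable<T> activity) {
        boolean retryActivity;
        T result = null;
        try {
            result = activity.call();   // corresponds to doTry
            retryActivity = false;
        } catch (Exception e) {
            retryActivity = true;       // corresponds to doCatch: only record the failure
        }
        // The retry decision is made after the try/catch block has finished.
        return retryActivity ? runTillSuccess(activity) : result;
    }

    public static void main(String[] args) {
        FlakyActivity activity = new FlakyActivity(2);
        // prints "done after 3 calls"
        System.out.println(runTillSuccess(activity) + " after " + activity.calls + " calls");
    }
}
```

Because nothing bounds the number of attempts, an activity that never succeeds keeps this sketch recursing until the stack overflows, which is one reason the bounded strategies described later are often preferable.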

The following workflow implements the retry-until-success strategy. The workflow interface
is implemented in
RetryActivityRecipeWorkflow and has one method, runUnreliableActivityTillSuccess,
which is the workflow's entry point. The workflow worker is implemented in
RetryActivityRecipeWorkflowImpl, as follows:

runUnreliableActivityTillSuccess creates a Settable<Boolean> object named
retryActivity, which is used to indicate whether the activity failed and should be retried.
Settable<T> is derived from Promise<T> and works much the same way, but
you set a Settable<T> object's value manually.

runUnreliableActivityTillSuccess implements an anonymous nested TryCatch class
to handle any exceptions that are thrown by the unreliableActivity activity. For more discussion
of how to handle exceptions thrown by asynchronous code, see Error Handling.

doTry executes the unreliableActivity activity, which returns a
Promise<Void> object named activityRanSuccessfully.

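doTry calls the asynchronous setRetryActivityToFalse method, which has two
parameters:

activityRanSuccessfully takes the Promise<Void> object returned by the
unreliableActivity activity.

retryActivity takes the retryActivity object.

If unreliableActivity completes, activityRanSuccessfully becomes ready and
setRetryActivityToFalse sets retryActivity to false. Otherwise,
activityRanSuccessfully never becomes ready and setRetryActivityToFalse doesn't execute.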

If unreliableActivity throws an exception, the framework calls doCatch and
passes it the exception object. doCatch sets retryActivity to true.

runUnreliableActivityTillSuccess calls the asynchronous
restartRunUnreliableActivityTillSuccess method and passes it the retryActivity
object. Because retryActivity is a Promise<T> type,
restartRunUnreliableActivityTillSuccess defers execution until retryActivity is
ready, which occurs after TryCatch completes.

When retryActivity is ready, restartRunUnreliableActivityTillSuccess extracts
the value.

If the value is false, the retry succeeded.
restartRunUnreliableActivityTillSuccess does nothing and the retry sequence
terminates.

If the value is true, the retry failed. restartRunUnreliableActivityTillSuccess calls
runUnreliableActivityTillSuccess to execute the activity again.

Steps 1 - 7 repeat until unreliableActivity completes.

Note

doCatch doesn't handle the exception; it simply sets the retryActivity object
to true to indicate that the activity failed. The retry is handled by the asynchronous
restartRunUnreliableActivityTillSuccess method, which defers execution until
TryCatch completes. The reason for this approach is that, if you retry an activity in
doCatch, you can't cancel it. Retrying the activity in
restartRunUnreliableActivityTillSuccess allows you to execute cancellable activities.

Exponential Retry Strategy

With the exponential retry strategy, the framework executes a failed activity again
after a specified period
of time, N seconds. If that attempt fails, the framework executes the activity again
after 2N seconds, then after
4N seconds, and so on. Because the wait time can get quite large, you typically stop
the retry attempts at some
point rather than continuing indefinitely.
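
The N, 2N, 4N progression can be checked with a few lines of arithmetic. The following sketch is a plain-Java illustration only; BackoffScheduleSketch and retryWaits are hypothetical names, and it assumes that the maximum number of attempts counts the first, non-retry attempt as well.

```java
import java.util.ArrayList;
import java.util.List;

/** Computes the wait before each retry for an exponential schedule. */
class BackoffScheduleSketch {

    /**
     * Returns the wait, in seconds, that precedes each retry.
     * Assumption: maxAttempts includes the initial attempt, so there are
     * maxAttempts - 1 retries at most.
     */
    static List<Long> retryWaits(long initialSeconds, double coefficient, int maxAttempts) {
        List<Long> waits = new ArrayList<>();
        double wait = initialSeconds;
        for (int attempt = 2; attempt <= maxAttempts; attempt++) {
            waits.add((long) wait);
            wait *= coefficient;   // each wait is the previous one times the coefficient
        }
        return waits;
    }

    public static void main(String[] args) {
        System.out.println(retryWaits(5, 2.0, 5));   // prints [5, 10, 20, 40]
    }
}
```

With N = 5 and five attempts the total wait is already 75 seconds, which shows how quickly the schedule grows and why a stopping point matters.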

The framework provides three ways to implement an exponential retry strategy:

The @ExponentialRetry annotation is the simplest approach, but you must set the retry
configuration options at compile time.

The RetryDecorator class allows you to set retry configuration at run time and change it as
needed.

The AsyncRetryingExecutor class allows you to set retry configuration at run time and
change it as needed. In addition, the framework calls a user-implemented AsyncRunnable.run
method to run each retry attempt.

All approaches support the following configuration options, where time values are
in seconds:

The initial retry wait time.

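The back-off coefficient, which is used to compute the retry intervals: each interval is
the previous interval multiplied by the coefficient. For example, with an initial wait time
of N seconds and a coefficient of 2, the intervals are N, 2N, 4N, and so on.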

The expiration time. Retry attempts stop when the total duration of the process exceeds
this value. The
default value is unlimited.

The exceptions that will trigger the retry process. By default, every exception triggers
the retry
process.

The exceptions that will not trigger a retry attempt. By default, no exceptions are
excluded.
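
To make the interplay of these options concrete, the following plain-Java sketch gathers them into one object and applies them to a failure. RetryOptionsSketch and its members are hypothetical and are not framework classes. The rule that an excluded exception wins over an included one is an assumption made for this illustration.

```java
import java.util.Collections;
import java.util.List;

/** Hypothetical holder for the options listed above; not a framework class. */
class RetryOptionsSketch {
    final long initialIntervalSeconds;
    final double backoffCoefficient;
    final long expirationSeconds;                         // Long.MAX_VALUE models "unlimited"
    final List<Class<? extends Throwable>> retryOn;       // empty models "every exception"
    final List<Class<? extends Throwable>> doNotRetryOn;  // empty models "no exclusions"

    RetryOptionsSketch(long initialIntervalSeconds, double backoffCoefficient,
                       long expirationSeconds,
                       List<Class<? extends Throwable>> retryOn,
                       List<Class<? extends Throwable>> doNotRetryOn) {
        this.initialIntervalSeconds = initialIntervalSeconds;
        this.backoffCoefficient = backoffCoefficient;
        this.expirationSeconds = expirationSeconds;
        this.retryOn = retryOn;
        this.doNotRetryOn = doNotRetryOn;
    }

    /** Decides whether a failure seen after elapsedSeconds should trigger another attempt. */
    boolean shouldRetry(Throwable failure, long elapsedSeconds) {
        if (elapsedSeconds > expirationSeconds) {
            return false;                     // total duration exceeded the expiration time
        }
        for (Class<? extends Throwable> excluded : doNotRetryOn) {
            if (excluded.isInstance(failure)) {
                return false;                 // assumption: exclusions win over inclusions
            }
        }
        if (retryOn.isEmpty()) {
            return true;                      // default: every exception triggers a retry
        }
        for (Class<? extends Throwable> included : retryOn) {
            if (included.isInstance(failure)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        RetryOptionsSketch options = new RetryOptionsSketch(
                5, 2.0, 60,
                Collections.singletonList(IllegalStateException.class),
                Collections.emptyList());
        System.out.println(options.shouldRetry(new IllegalStateException(), 10));    // prints true
        System.out.println(options.shouldRetry(new IllegalArgumentException(), 10)); // prints false
        System.out.println(options.shouldRetry(new IllegalStateException(), 61));    // prints false
    }
}
```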

The following sections describe the various ways that you can implement an exponential
retry strategy.

Exponential Retry with @ExponentialRetry

The simplest way to implement an exponential retry strategy for an activity is to
apply an
@ExponentialRetry annotation to the activity in the interface definition. If the activity fails,
the framework handles the retry process automatically, based on the specified option
values. The basic pattern
is:

Apply @ExponentialRetry to the appropriate activities and specify the retry
configuration.

If an annotated activity fails, the framework automatically retries the activity according
to the
configuration specified by the annotation's arguments.
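
The idea of declaring retry configuration on the activity itself can be illustrated with an ordinary Java annotation that is read by reflection. This is a sketch of the general mechanism only: @SketchRetry, its attribute names, and SketchActivities are hypothetical and should not be read as the signature of @ExponentialRetry.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;

class AnnotationRetrySketch {

    /** Hypothetical annotation; it only imitates the idea of @ExponentialRetry. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface SketchRetry {
        long initialIntervalSeconds();
        int maximumAttempts() default 3;
    }

    /** Hypothetical activities interface with one annotated activity. */
    interface SketchActivities {
        @SketchRetry(initialIntervalSeconds = 5, maximumAttempts = 5)
        void unreliableActivity();

        void plainActivity();
    }

    /** Reads the retry configuration the way a framework could, by reflection. */
    static String describe(String methodName) {
        try {
            Method method = SketchActivities.class.getMethod(methodName);
            SketchRetry retry = method.getAnnotation(SketchRetry.class);
            if (retry == null) {
                return methodName + ": no retry configured";
            }
            return methodName + ": first wait " + retry.initialIntervalSeconds()
                    + "s, up to " + retry.maximumAttempts() + " attempts";
        } catch (NoSuchMethodException e) {
            return methodName + ": no such activity";
        }
    }

    public static void main(String[] args) {
        // prints "unreliableActivity: first wait 5s, up to 5 attempts"
        System.out.println(describe("unreliableActivity"));
        // prints "plainActivity: no retry configured"
        System.out.println(describe("plainActivity"));
    }
}
```

Because annotation values must be compile-time constants, the configuration in this sketch cannot vary from one execution to the next, which is the same limitation that applies to the annotation approach in general.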

The ExponentialRetryAnnotationWorkflow workflow worker implements the exponential retry
strategy by using an @ExponentialRetry annotation. It uses an unreliableActivity
activity whose interface definition is implemented in ExponentialRetryAnnotationActivities, as
follows:

The workflow interface is implemented in RetryWorkflow and has one method,
process, which is the workflow's entry point. The workflow worker is implemented in
ExponentialRetryAnnotationWorkflowImpl, as follows:

If the activity fails by throwing IllegalStateException, the framework automatically runs the
retry strategy specified in ExponentialRetryAnnotationActivities.

Exponential Retry with the RetryDecorator Class

@ExponentialRetry is simple to use. However, the configuration is static and set at compile
time, so the framework uses the same retry strategy every time the activity fails.
You can implement a more
flexible exponential retry strategy by using the RetryDecorator class, which allows you to
specify the configuration at run time and change it as needed. The basic pattern is:

1. Create and configure an ExponentialRetryPolicy object that specifies the retry
configuration.

2. Create a RetryDecorator object and pass the ExponentialRetryPolicy object
from Step 1 to the constructor.

3. Apply the decorator object to the activity by passing the activity client's class
name to the
RetryDecorator object's decorate method.

4. Execute the activity.

If the activity fails, the framework retries the activity according to the
ExponentialRetryPolicy object's configuration. You can change the retry configuration as needed
by modifying this object.
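
The decorator idea itself, wrapping a call so that a separately configured policy object governs its retries, can be sketched in plain Java. The Policy and Decorator classes below are hypothetical stand-ins and do not reproduce the framework's ExponentialRetryPolicy or RetryDecorator APIs. Waits are recorded in a list instead of being slept so that the example finishes immediately.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;

class DecoratorRetrySketch {

    /** Hypothetical mutable policy; it is not the framework's ExponentialRetryPolicy. */
    static class Policy {
        long initialIntervalSeconds;
        double backoffCoefficient = 2.0;
        int maximumAttempts;

        Policy(long initialIntervalSeconds, int maximumAttempts) {
            this.initialIntervalSeconds = initialIntervalSeconds;
            this.maximumAttempts = maximumAttempts;
        }
    }

    /** Hypothetical decorator that adds retry behavior to any Callable. */
    static class Decorator {
        private final Policy policy;
        final List<Long> waitsSeconds = new ArrayList<>();   // waits are recorded, not slept

        Decorator(Policy policy) {
            this.policy = policy;
        }

        <T> Callable<T> decorate(Callable<T> activity) {
            return () -> {
                double wait = policy.initialIntervalSeconds;
                for (int attempt = 1; ; attempt++) {
                    try {
                        return activity.call();
                    } catch (Exception e) {
                        if (attempt >= policy.maximumAttempts) {
                            throw e;                         // policy exhausted: surface the failure
                        }
                        waitsSeconds.add((long) wait);
                        wait *= policy.backoffCoefficient;   // limit and coefficient are re-read each attempt
                    }
                }
            };
        }
    }

    /** Runs an activity that fails a given number of times and reports what happened. */
    static String run(Policy policy, int failuresBeforeSuccess) {
        Decorator decorator = new Decorator(policy);
        int[] calls = {0};
        Callable<String> activity = decorator.decorate(() -> {
            if (++calls[0] <= failuresBeforeSuccess) {
                throw new IllegalStateException("intentional failure");
            }
            return "done";
        });
        try {
            return activity.call() + " " + decorator.waitsSeconds;
        } catch (Exception e) {
            return "failed " + decorator.waitsSeconds;
        }
    }

    public static void main(String[] args) {
        Policy policy = new Policy(5, 5);
        System.out.println(run(policy, 2));   // prints "done [5, 10]"
        policy.maximumAttempts = 2;           // change the configuration at run time
        System.out.println(run(policy, 2));   // prints "failed [5]"
    }
}
```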

Note

The @ExponentialRetry annotation and the RetryDecorator class are mutually
exclusive. You can't use RetryDecorator to dynamically override a retry policy specified by an
@ExponentialRetry annotation.

The following workflow implementation shows how to use the RetryDecorator class to implement
an exponential retry strategy. It uses an unreliableActivity activity that doesn't have an
@ExponentialRetry annotation. The workflow interface is implemented in RetryWorkflow
and has one method, process, which is the workflow's entry point. The workflow worker is
implemented in DecoratorRetryWorkflowImpl, as follows:

process creates and configures an ExponentialRetryPolicy object, calling the object's
withMaximumAttempts method to set the maximum number of attempts
to 5. ExponentialRetryPolicy exposes other with methods that you can use
to specify other configuration options.

process creates a RetryDecorator object named retryDecorator
and passes the ExponentialRetryPolicy object from Step 1 to the constructor.

process applies the decorator to the activity by calling the
retryDecorator.decorate method and passing it the activity client's class name.

handleUnreliableActivity executes the activity.

If the activity fails, the framework retries it according to the configuration specified
in Step
1.

Note

Several of the ExponentialRetryPolicy class's with methods have a
corresponding set method that you can call to modify the corresponding configuration option
at any time: setBackoffCoefficient, setMaximumAttempts,
setMaximumRetryIntervalSeconds, and setMaximumRetryExpirationIntervalSeconds.

Exponential Retry with the AsyncRetryingExecutor Class

The RetryDecorator class provides more flexibility in configuring the retry process than
@ExponentialRetry, but the framework still runs the retry attempts automatically, based on the
ExponentialRetryPolicy object's current configuration. A more flexible approach is to use the
AsyncRetryingExecutor class. In addition to allowing you to configure the retry process at run
time, the framework calls a user-implemented AsyncRunnable.run method to run each retry attempt
instead of simply executing the activity.

The basic pattern is:

1. Create and configure an ExponentialRetryPolicy object to specify the retry
configuration.

2. Create an AsyncRetryingExecutor object, and pass it the
ExponentialRetryPolicy object and an instance of the workflow clock.

3. Implement an anonymous nested TryCatch or TryCatchFinally class.

4. Implement an anonymous AsyncRunnable class and override the run method to
implement custom code for running the activity.

5. Override doTry to call the AsyncRetryingExecutor object's
execute method and pass it the AsyncRunnable class from Step 4. The
AsyncRetryingExecutor object calls AsyncRunnable.run to run the activity.

6. If the activity fails, the AsyncRetryingExecutor object calls the
AsyncRunnable.run method again, according to the retry policy specified in Step 1.
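
The difference from the decorator approach is that user code runs on every attempt. The plain-Java sketch below models that shape. ThrowingRunnable, WaitingClock, and RetryingExecutor are hypothetical stand-ins for AsyncRunnable, the workflow clock, and AsyncRetryingExecutor, the back-off coefficient is fixed at 2 for brevity, and an ordinary try/catch plays the role of TryCatch.

```java
import java.util.ArrayList;
import java.util.List;

class RetryingExecutorSketch {

    /** Hypothetical stand-in for a user-implemented runnable whose run method may throw. */
    interface ThrowingRunnable {
        void run() throws Exception;
    }

    /** Hypothetical clock that can wait; a fake is injected so the sketch never sleeps. */
    interface WaitingClock {
        void waitSeconds(long seconds);
    }

    /** Hypothetical executor that calls the runnable once per attempt. */
    static class RetryingExecutor {
        private final long initialIntervalSeconds;
        private final int maximumAttempts;
        private final WaitingClock clock;

        RetryingExecutor(long initialIntervalSeconds, int maximumAttempts, WaitingClock clock) {
            this.initialIntervalSeconds = initialIntervalSeconds;
            this.maximumAttempts = maximumAttempts;
            this.clock = clock;
        }

        void execute(ThrowingRunnable attempt) throws Exception {
            long wait = initialIntervalSeconds;
            for (int n = 1; ; n++) {
                try {
                    attempt.run();            // user code runs on every attempt
                    return;
                } catch (Exception e) {
                    if (n >= maximumAttempts) {
                        throw e;              // out of attempts: let the caller handle it
                    }
                    clock.waitSeconds(wait);
                    wait *= 2;                // fixed coefficient of 2 in this sketch
                }
            }
        }
    }

    /** Runs an activity that fails a given number of times and returns an event log. */
    static String demo(int failuresBeforeSuccess, int maximumAttempts) {
        List<String> log = new ArrayList<>();
        RetryingExecutor executor = new RetryingExecutor(
                5, maximumAttempts, seconds -> log.add("wait " + seconds + "s"));
        int[] runs = {0};
        try {
            executor.execute(() -> {
                log.add("run " + (++runs[0]));   // custom code around each attempt
                if (runs[0] <= failuresBeforeSuccess) {
                    throw new IllegalStateException("intentional failure");
                }
            });
        } catch (Exception e) {
            log.add("gave up: " + e.getMessage());   // corresponds to doCatch
        }
        return String.join(", ", log);
    }

    public static void main(String[] args) {
        System.out.println(demo(2, 5));   // prints "run 1, wait 5s, run 2, wait 10s, run 3"
        System.out.println(demo(5, 2));   // prints "run 1, wait 5s, run 2, gave up: intentional failure"
    }
}
```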

The following workflow shows how to use the AsyncRetryingExecutor class to implement an
exponential retry strategy. It uses the same unreliableActivity activity as the
DecoratorRetryWorkflow workflow discussed earlier. The workflow interface is implemented in
RetryWorkflow and has one method, process, which is the workflow's entry point. The
workflow worker is implemented in AsyncExecutorRetryWorkflowImpl, as follows:

handleUnreliableActivity creates an AsyncRetryingExecutor object,
executor, and passes the ExponentialRetryPolicy object from Step 2 and an instance
of the workflow clock to the constructor.

handleUnreliableActivity implements an anonymous nested TryCatch class and
overrides the doTry and doCatch methods to run the retry attempts and handle any
exceptions.

doTry creates an anonymous AsyncRunnable class and overrides the
run method to implement custom code to execute unreliableActivity. For simplicity,
run just executes the activity, but you can implement more sophisticated approaches as
appropriate.

doTry calls executor.execute and passes it the AsyncRunnable
object. execute calls the AsyncRunnable object's run method to run
the activity.

If the activity fails, executor calls run again, according to the retryPolicy
object configuration.

Custom Retry Strategy

The most flexible approach to retrying failed activities is a custom strategy, which
recursively calls an
asynchronous method that runs the retry attempt, much like the retry-until-success
strategy. However, instead of
simply running the activity again, you implement custom logic that decides whether
and how to run each successive
retry attempt. The basic pattern is:

Create a Settable<T> status object, which is used to indicate whether the activity
failed.

Implement a nested TryCatch or TryCatchFinally class.

doTry executes the activity.

If the activity fails, doCatch sets the status object to indicate that the activity
failed.

Call an asynchronous failure handling method and pass it the status object. The method
defers execution
until TryCatch or TryCatchFinally completes.

The failure handling method decides whether to retry the activity, and if so, when.

The following workflow shows how to implement a custom retry strategy. It uses the
same
unreliableActivity activity as the DecoratorRetryWorkflow and
AsyncExecutorRetryWorkflow workflows. The workflow interface is implemented in
RetryWorkflow and has one method, process, which is the workflow's entry point. The
workflow worker is implemented in CustomLogicRetryWorkflowImpl, as follows:

callActivityWithRetry creates a Settable<Throwable> object named failure,
which is used to indicate whether the activity has failed. Settable<T> is derived from
Promise<T> and works much the same way, but you set a Settable<T>
object's value manually.

callActivityWithRetry implements an anonymous nested TryCatchFinally class to
handle any exceptions that are thrown by unreliableActivity. For more discussion of how to
handle exceptions thrown by asynchronous code, see AWS Flow Framework for Java Exceptions.

doTry executes unreliableActivity.

If unreliableActivity throws an exception, the framework calls doCatch and
passes it the exception object. doCatch sets failure to the exception object,
which indicates that the activity failed and puts the object in a ready state.

doFinally checks whether failure is ready, which will be true only if
failure was set by doCatch.

If failure is ready, doFinally does nothing.

If failure isn't ready, the activity completed and
doFinally sets failure to null.

callActivityWithRetry calls the asynchronous retryOnFailure method and passes
it failure. Because failure is a Settable<T> type, retryOnFailure
defers execution until failure is ready, which occurs after TryCatchFinally completes.

retryOnFailure gets the value from failure.

If failure is set to null, the retry attempt was successful. retryOnFailure does
nothing, which terminates the retry process.

If failure is set to an exception object and shouldRetry returns true,
retryOnFailure calls callActivityWithRetry to retry the activity.

shouldRetry implements custom logic to decide whether to retry a failed activity. For
simplicity, shouldRetry always returns true and retryOnFailure
executes the activity immediately, but you can implement more sophisticated logic
as needed.

Steps 2–8 repeat until unreliableActivity completes or shouldRetry decides
to stop the process.
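
The control flow of these steps can be modeled in plain Java. This is a sketch under stated assumptions, not the workflow implementation: CompletableFuture stands in for Settable<Throwable>, an ordinary try/catch/finally stands in for TryCatchFinally, and the calls are synchronous. Unlike the shouldRetry described above, which always returns true, the hypothetical shouldRetry here retries only IllegalStateException and stops after a maximum number of attempts, to show where custom logic fits.

```java
import java.util.concurrent.CompletableFuture;

class CustomRetrySketch {
    private final Runnable activity;
    private final int maxAttempts;
    int attempts = 0;

    CustomRetrySketch(Runnable activity, int maxAttempts) {
        this.activity = activity;
        this.maxAttempts = maxAttempts;
    }

    void callActivityWithRetry() {
        // CompletableFuture stands in for the Settable<Throwable> named failure.
        CompletableFuture<Throwable> failure = new CompletableFuture<>();
        try {
            attempts++;
            activity.run();                  // corresponds to doTry
        } catch (RuntimeException e) {
            failure.complete(e);             // corresponds to doCatch: record the failure only
        } finally {
            if (!failure.isDone()) {
                failure.complete(null);      // corresponds to doFinally: the activity completed
            }
        }
        // Runs once failure has a value, which is after the try/catch/finally block.
        failure.thenAccept(this::retryOnFailure);
    }

    void retryOnFailure(Throwable failureValue) {
        if (failureValue != null && shouldRetry(failureValue)) {
            callActivityWithRetry();
        }
    }

    /** Hypothetical custom logic: retry only IllegalStateException, within a limit. */
    boolean shouldRetry(Throwable e) {
        return e instanceof IllegalStateException && attempts < maxAttempts;
    }

    /** Convenience helper that reports how many attempts were made. */
    static int attemptsFor(Runnable activity, int maxAttempts) {
        CustomRetrySketch sketch = new CustomRetrySketch(activity, maxAttempts);
        sketch.callActivityWithRetry();
        return sketch.attempts;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        int attempts = attemptsFor(() -> {
            if (++calls[0] < 3) {
                throw new IllegalStateException("intentional failure");
            }
        }, 5);
        System.out.println(attempts);   // prints 3
    }
}
```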

Note

doCatch doesn't handle the retry process; it simply sets failure to indicate that the
activity failed. The retry process is handled by the asynchronous retryOnFailure method, which
defers execution until TryCatchFinally completes. The reason for this approach is that, if you
retry an activity in doCatch, you can't cancel it. Retrying the activity in retryOnFailure
allows you to execute cancellable activities.
