Transient errors

Some distributed database clusters make use of transient errors. A transient
error is a temporary error that is likely to disappear soon. By definition
it is safe for a client to ignore a transient error and retry the failed
operation on the same database server. The retry is free of side effects.
Clients are not forced to abort their work or to fail over to another database
server immediately. They may instead enter a retry loop and wait for the
error to disappear before giving up on the database server.
Transient errors can be seen, for example, when using MySQL Cluster, but they
are not tied to any specific clustering solution.

PECL/mysqlnd_ms can perform an automatic retry loop in
case of a transient error. This increases distribution transparency and thus
makes it easier to migrate an application running on a single database
server to run on a cluster of database servers without having to change
the source of the application.

The automatic retry loop will repeat the requested operation up to a
user-configurable number of times and pause between the attempts for a configurable
amount of time. If the error disappears during the loop, the application will
never see it. If not, the error is forwarded to the application for handling.

In the example below a duplicate key error is provoked to make the plugin
retry the failing query two times before the error is passed to the application.
Between the two attempts the plugin sleeps for 100 milliseconds.

Because the execution of the retry loop is transparent from a user's point of
view, the example checks the statistics provided by the plugin to learn
about it.
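
The behaviour described above can be modelled with a small simulation. The
Python sketch below is not the plugin's implementation and does not talk to a
database. ServerError, run_with_retries and insert_duplicate are hypothetical
names introduced for illustration, the counter name transient_error_retries
is an assumption about how the plugin labels this statistic, and usleep_retry
is treated as milliseconds, following the text.

```python
import time


class ServerError(Exception):
    """Hypothetical error type carrying a MySQL-style error code."""

    def __init__(self, code, message=""):
        super().__init__(message or f"error {code}")
        self.code = code


def run_with_retries(operation, transient_codes, max_retries, usleep_retry,
                     stats, sleep=time.sleep):
    """Run operation(); on a configured transient error, sleep and retry.

    usleep_retry is interpreted as milliseconds (assumption taken from the
    surrounding text).
    """
    retries = 0
    while True:
        try:
            return operation()
        except ServerError as err:
            # Forward the error if it is not configured as transient or
            # if the retry budget is used up.
            if err.code not in transient_codes or retries >= max_retries:
                raise
        retries += 1
        # Assumed statistic name; it counts retries, not first attempts.
        stats["transient_error_retries"] += 1
        sleep(usleep_retry / 1000.0)  # time.sleep expects seconds


def insert_duplicate():
    # Stand-in for an INSERT that always violates a unique key (error 1062).
    raise ServerError(1062, "Duplicate entry '1' for key 'PRIMARY'")


stats = {"transient_error_retries": 0}
pauses = []  # record the requested sleeps instead of actually waiting

try:
    run_with_retries(insert_duplicate, {1062}, max_retries=2,
                     usleep_retry=100, stats=stats, sleep=pauses.append)
    final_error = None
except ServerError as err:
    final_error = err.code  # the error finally reaches the application
```

In this model the operation is attempted three times in total: once
initially and twice as a retry, with a 100 millisecond pause before each
retry. The retry counter rises from 0 to 2, which is the only visible trace
of the loop apart from the delay.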

As the example shows, the plugin can be instructed to consider any error
transient, regardless of the database server's error semantics. The only error
that a stock MySQL server considers temporary has the error code 1297. When
configuring error codes other than 1297, make sure your configuration reflects
the semantics of your cluster's error codes.

The maximum time the plugin may sleep during the retry loop depends on the
function in question. The retry loop for query(), prepare() or execute()
will sleep for up to max_retries * usleep_retry milliseconds.

However, functions that control connection state are dispatched to all
connections. The retry loop settings are applied to every connection on
which the command is to be run. Thus, such a function may interrupt program
execution for longer than a function that is run on one server only. For
example, set_autocommit() is dispatched to all open connections and may sleep
for up to (max_retries * usleep_retry) * number_of_open_connections
milliseconds. Keep this in mind when setting long sleep times and large
retry numbers. With the default settings of max_retries=1, usleep_retry=100
and lazy_connections=1, it is unlikely that you will ever see a delay of
more than 1 second.
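
The two upper bounds given above are simple products and can be checked with
a few lines of arithmetic. The Python sketch below is illustrative only:
max_sleep_ms is a hypothetical helper, not part of the plugin, and the
connection counts are made-up inputs.

```python
def max_sleep_ms(max_retries, usleep_retry, open_connections=1):
    """Upper bound on time spent sleeping in the retry loop, in milliseconds.

    open_connections is 1 for functions that run on a single server and the
    number of open connections for functions dispatched to all of them.
    """
    return max_retries * usleep_retry * open_connections


# Defaults quoted in the text: max_retries=1, usleep_retry=100.
single_server = max_sleep_ms(1, 100)                          # query(), prepare(), execute()
three_connections = max_sleep_ms(1, 100, open_connections=3)  # e.g. set_autocommit()

# Hypothetical aggressive settings: 5 retries, 500 ms pause, 4 open connections.
aggressive = max_sleep_ms(5, 500, open_connections=4)

print(single_server, three_connections, aggressive)  # prints: 100 300 10000
```

With the defaults, the bound stays at or below 1000 milliseconds for up to
ten open connections, and it is reached only if every connection keeps
returning a transient error. Raising either setting scales the bound
linearly, so the hypothetical aggressive settings above could block a single
call for up to ten seconds.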