22.6.11 NDB Cluster Replication Conflict Resolution

When using a replication setup involving multiple masters
(including circular replication), it is possible that different
masters may try to update the same row on the slave with different
data. Conflict resolution in NDB Cluster Replication provides a
means of resolving such conflicts by permitting a user-defined
resolution column to be used to determine whether or not an update
on a given master should be applied on the slave.

Some types of conflict resolution supported by NDB Cluster
(NDB$OLD(), NDB$MAX(),
NDB$MAX_DELETE_WIN()) implement this
user-defined column as a “timestamp” column (although
its type cannot be TIMESTAMP, as
explained later in this section). These types of conflict
resolution are always applied on a row-by-row basis rather than
on a transactional basis. The epoch-based conflict resolution functions
NDB$EPOCH() and
NDB$EPOCH_TRANS() compare the order in which
epochs are replicated (and thus these functions are
transactional). Different methods can be used to compare
resolution column values on the slave when conflicts occur, as
explained later in this section; the method used can be set on a
per-table basis.

You should also keep in mind that it is the application's
responsibility to ensure that the resolution column is correctly
populated with relevant values, so that the resolution function
can make the appropriate choice when determining whether to apply
an update.

Requirements.
Preparations for conflict resolution must be made on both the
master and the slave. These tasks are described in the following
list:

On the master writing the binary logs, you must determine
which columns are sent (all columns or only those that have
been updated). This is done for the MySQL Server as a whole by
applying the mysqld startup option
--ndb-log-updated-only
(described later in this section) or on a per-table basis by
entries in the mysql.ndb_replication table
(see The ndb_replication system table).

Note

If you are replicating tables with very large columns (such
as TEXT or
BLOB columns),
--ndb-log-updated-only can
also be useful for reducing the size of the master and slave
binary logs and avoiding possible replication failures due
to exceeding
max_allowed_packet.

On the slave, you must determine which type of conflict
resolution to apply (“latest timestamp wins”,
“same timestamp wins”, “primary
wins”, “primary wins, complete
transaction”, or none). This is done using the
mysql.ndb_replication system table, on a
per-table basis (see
The ndb_replication system table).

NDB Cluster also supports read conflict detection, that is,
detecting conflicts between reads of a given row in one
cluster and updates or deletes of the same row in another
cluster. This requires exclusive read locks obtained by
setting
ndb_log_exclusive_reads equal
to 1 on the slave. All rows read by a conflicting read are
logged in the exceptions table. For more information, see
Read conflict detection and resolution.

When using the functions NDB$OLD(),
NDB$MAX(), and
NDB$MAX_DELETE_WIN() for timestamp-based
conflict resolution, we often refer to the column used for
determining updates as a “timestamp” column. However,
the data type of this column is never
TIMESTAMP; instead, its data type
should be INT
(INTEGER) or
BIGINT. The
“timestamp” column should also be
UNSIGNED and NOT NULL.

The NDB$EPOCH() and
NDB$EPOCH_TRANS() functions discussed later in
this section work by comparing the relative order of replication
epochs applied on a primary and secondary NDB Cluster, and do not
make use of timestamps.

Master column control.
We can see update operations in terms of “before”
and “after” images—that is, the states of the
table before and after the update is applied. Normally, when
updating a table with a primary key, the “before”
image is not of great interest; however, when we need to
determine on a per-update basis whether or not to use the
updated values on a replication slave, we need to make sure that
both images are written to the master's binary log. This is
done with the
--ndb-log-update-as-write option
for mysqld, as described later in this
section.

Important

Whether logging of complete rows or of updated columns only is
done is decided when the MySQL server is started, and cannot be
changed online; you must either restart
mysqld, or start a new
mysqld instance with different logging
options.

--ndb-log-update-as-write Option: Logging Changed Data as Updates

The setting of the MySQL Server's
--ndb-log-update-as-write option
determines whether logging is performed with or without the
“before” image. Because conflict resolution is done
in the MySQL Server's update handler, it is necessary to
control logging on the master such that updates are updates and
not writes; that is, such that updates are treated as changes in
existing rows rather than the writing of new rows (even though
these replace existing rows). This option is turned on by default;
in other words, updates are by default written as
write_row events in the binary log, rather than
as update_row events. To use conflict
resolution, this option must therefore be turned off
(--ndb-log-update-as-write=0), so that updates
are logged as updates, with both images.

Conflict resolution control.
Conflict resolution is usually enabled on the server where
conflicts can occur. Like logging method selection, it is
enabled by entries in the
mysql.ndb_replication table.

The ndb_replication system table.
To enable conflict resolution, it is necessary to create an
ndb_replication table in the
mysql system database on the master, the
slave, or both, depending on the conflict resolution type and
method to be employed. This table is used to control logging and
conflict resolution functions on a per-table basis, and has one
row per table involved in replication.
ndb_replication is created and filled with
control information on the server where the conflict is to be
resolved. In a simple master-slave setup where data can also be
changed locally on the slave, this is typically the slave. In a
more complex master-master (two-way) replication scheme, this is
usually all of the masters involved. Each row in
mysql.ndb_replication corresponds to a table
being replicated, and specifies how to log and resolve conflicts
(that is, which conflict resolution function, if any, to use)
for that table. The definition of the
mysql.ndb_replication table is shown here:
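The definition below follows the one used in recent NDB Cluster
releases; column widths may differ slightly between versions, so
check the documentation for your own release if in doubt:

CREATE TABLE mysql.ndb_replication  (
    db VARBINARY(63),
    table_name VARBINARY(63),
    server_id INT UNSIGNED,
    binlog_type INT UNSIGNED,
    conflict_fn VARBINARY(128),
    PRIMARY KEY (db, table_name, server_id)
)   ENGINE=NDB
PARTITION BY KEY(db, table_name);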

db.
The name of the database containing the table to be replicated.
You may employ either or both of the wildcards
_ and % as part of the
database name. Matching is similar to what is implemented for
the LIKE operator.

table_name.
The name of the table to be replicated. The table name may
include either or both of the wildcards _ and
%. Matching is similar to what is implemented
for the LIKE operator.

server_id.
The unique server ID of the MySQL instance (SQL node) where the
table resides.

binlog_type.
The type of binary logging to be employed. Normally this column
can be left as NBT_DEFAULT (0), in which case
the server's logging defaults are used.

conflict_fn.
The conflict resolution function to be applied; one of
NDB$OLD(), NDB$MAX(),
NDB$MAX_DELETE_WIN(), NDB$EPOCH(),
NDB$EPOCH_TRANS(), NDB$EPOCH2(), or
NDB$EPOCH2_TRANS().

NULL: Indicates that conflict resolution is
not to be used for the corresponding table.

These functions are described in the next few paragraphs.

NDB$OLD(column_name).
If the value of column_name is the
same on both the master and the slave, then the update is
applied; otherwise, the update is not applied on the slave and
an exception is written to the log. This is illustrated by the
following pseudocode:
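The names used here are hypothetical:
master_old_column_value is the value from the
master's "before" image, and
slave_current_column_value is the slave's
current value.

if (master_old_column_value == slave_current_column_value)
  apply_update();
else
  log_exception();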

This function can be used for “same value wins”
conflict resolution. This type of conflict resolution ensures that
updates are not applied on the slave from the wrong master.

Important

The column value from the master's “before”
image is used by this function.

NDB$MAX(column_name).
If the “timestamp” column value for a given row
coming from the master is higher than that on the slave, it is
applied; otherwise it is not applied on the slave. This is
illustrated by the following pseudocode:
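In the same hypothetical notation, where
master_new_column_value is the value from the
master's "after" image:

if (master_new_column_value > slave_current_column_value)
  apply_update();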

This function can be used for “greatest timestamp
wins” conflict resolution. This type of conflict resolution
ensures that, in the event of a conflict, the version of the row
that was most recently updated is the version that persists.

Important

The column value from the master's “after”
image is used by this function.

NDB$MAX_DELETE_WIN().
This is a variation on NDB$MAX(). Because no
timestamp is available for a delete operation, a delete using
NDB$MAX() is in fact processed as
NDB$OLD(). However, for some use cases, this
is not optimal. With NDB$MAX_DELETE_WIN(), if
the “timestamp” column value for a row being added or
updated by the master is higher than that on the slave, the
change is applied. However, delete operations are treated as
always having the higher value. This is illustrated in the
following pseudocode:
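Again using the hypothetical names introduced above:

if (master_new_column_value > slave_current_column_value
    || operation_type == "delete")
  apply_update();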

This function can be used for “greatest timestamp, delete
wins” conflict resolution. This type of conflict resolution
ensures that, in the event of a conflict, the version of the row
that was deleted or (otherwise) most recently updated is the
version that persists.

Note

As with NDB$MAX(), the column value from the
master's “after” image is the value used by
this function.

NDB$EPOCH() and NDB$EPOCH_TRANS().
The NDB$EPOCH() function tracks the order in
which replicated epochs are applied on a slave NDB Cluster
relative to changes originating on the slave. This relative
ordering is used to determine whether replicated changes are
concurrent with any changes that originate locally, and are
therefore potentially in conflict.

Most of what follows in the description of
NDB$EPOCH() also applies to
NDB$EPOCH_TRANS(). Any exceptions are noted in
the text.

NDB$EPOCH() is asymmetric, operating on one NDB
Cluster in a two-cluster circular replication configuration
(sometimes referred to as “active-active”
replication). We refer here to the cluster on which it operates as
the primary, and to the other as the secondary. The slave on the primary
is responsible for detecting and handling conflicts, while the
slave on the secondary is not involved in any conflict detection
or handling.

When the slave on the primary detects conflicts, it injects events
into its own binary log to compensate for these; this ensures that
the secondary NDB Cluster eventually realigns itself with the
primary and so keeps the primary and secondary from diverging.
This compensation and realignment mechanism requires that the
primary NDB Cluster always wins any conflicts with the
secondary—that is, that the primary's changes are
always used rather than those from the secondary in event of a
conflict. This “primary always wins” rule has the
following implications:

Operations that change data, once committed on the primary,
are fully persistent and will not be undone or rolled back by
conflict detection and resolution.

Data read from the primary is fully consistent. Any changes
committed on the primary (locally or from the slave) will not
be reverted later.

Operations that change data on the secondary may later be
reverted if the primary determines that they are in conflict.

Individual rows read on the secondary are self-consistent at
all times, each row always reflecting either a state committed
by the secondary, or one committed by the primary.

Sets of rows read on the secondary may not necessarily be
consistent at a given single point in time. For
NDB$EPOCH_TRANS(), this is a transient
state; for NDB$EPOCH(), it can be a
persistent state.

Assuming a period of sufficient length without any conflicts,
all data on the secondary NDB Cluster (eventually) becomes
consistent with the primary's data.

NDB$EPOCH() and
NDB$EPOCH_TRANS() do not require any user
schema modifications, or application changes to provide conflict
detection. However, careful thought must be given to the schema
used, and the access patterns used, to verify that the complete
system behaves within specified limits.

Each of the NDB$EPOCH() and
NDB$EPOCH_TRANS() functions can take an
optional parameter; this is the number of bits to use to represent
the lower 32 bits of the epoch, and should be set to no less than
CEIL(LOG2(TimeBetweenGlobalCheckpoints /
TimeBetweenEpochs)).

For the default values of these configuration parameters (2000 and
100 milliseconds, respectively), this gives a value of 5 bits, so
the default value (6) should be sufficient, unless other values
are used for
TimeBetweenGlobalCheckpoints,
TimeBetweenEpochs, or
both. A value that is too small can result in false positives,
while one that is too large could lead to excessive wasted space
in the database.

Both NDB$EPOCH() and
NDB$EPOCH_TRANS() insert entries for
conflicting rows into the relevant exceptions tables, provided
that these tables have been defined according to the same
exceptions table schema rules as described elsewhere in this
section (see NDB$OLD(column_name)).
You need to create any exceptions table before creating the table
with which it is to be used.

As with the other conflict detection functions discussed in this
section, NDB$EPOCH() and
NDB$EPOCH_TRANS() are activated by including
relevant entries in the mysql.ndb_replication
table (see The ndb_replication system table).
The roles of the primary and secondary NDB Clusters in this
scenario are fully determined by
mysql.ndb_replication table entries.

Because the conflict detection algorithms employed by
NDB$EPOCH() and
NDB$EPOCH_TRANS() are asymmetric, you must use
different values for the primary slave's and secondary
slave's server_id entries.

A conflict between DELETE operations alone is
not sufficient to trigger conflict detection when using
NDB$EPOCH() or
NDB$EPOCH_TRANS(); the relative placement of
such operations within epochs does not matter. (Bug #18459944)

Conflict detection status variables.
Several status variables can be used to monitor conflict
detection. You can see how many rows have been found in conflict
by NDB$EPOCH() since this slave was last
restarted by checking the current value of the
Ndb_conflict_fn_epoch system
status variable.

Limitations on NDB$EPOCH().
The following limitations currently apply when using
NDB$EPOCH() to perform conflict detection:

Conflicts are detected using NDB Cluster epoch boundaries,
with granularity proportional to
TimeBetweenEpochs
(default: 100 milliseconds). The minimum conflict window is
the minimum time during which concurrent updates to the same
data on both clusters always report a conflict. This is always
a nonzero length of time, and is roughly proportional to
2 * (latency + queueing +
TimeBetweenEpochs). This implies that—assuming
the default for
TimeBetweenEpochs and
ignoring any latency between clusters (as well as any queuing
delays)—the minimum conflict window size is
approximately 200 milliseconds. This minimum window should be
considered when looking at expected application
“race” patterns.

Additional storage is required for tables using the
NDB$EPOCH() and
NDB$EPOCH_TRANS() functions; from 1 to 32
bits extra space per row is required, depending on the value
passed to the function.

Conflicts between delete operations may result in divergence
between the primary and secondary. When a row is deleted on
both clusters concurrently, the conflict can be detected, but
is not recorded, since the row is deleted. This means that
further conflicts during the propagation of any subsequent
realignment operations will not be detected, which can lead to
divergence.

Deletes should be externally serialized, or routed to one
cluster only. Alternatively, a separate row should be updated
transactionally with such deletes and any inserts that follow
them, so that conflicts can be tracked across row deletes.
This may require changes in applications.

Only two NDB Clusters in a circular
“active-active” configuration are currently
supported when using NDB$EPOCH() or
NDB$EPOCH_TRANS() for conflict detection.

Tables having BLOB or
TEXT columns are not currently
supported with NDB$EPOCH() or
NDB$EPOCH_TRANS().

NDB$EPOCH_TRANS(). NDB$EPOCH_TRANS() extends the
NDB$EPOCH() function. Conflicts are detected
and handled in the same way using the “primary wins
all” rule (see
NDB$EPOCH() and NDB$EPOCH_TRANS()) but with
the extra condition that any other rows updated in the same
transaction in which the conflict occurred are also regarded as
being in conflict. In other words, where
NDB$EPOCH() realigns individual conflicting
rows on the secondary, NDB$EPOCH_TRANS()
realigns conflicting transactions.

In addition, any transactions which are detectably dependent on a
conflicting transaction are also regarded as being in conflict,
these dependencies being determined by the contents of the
secondary cluster's binary log. Since the binary log contains
only data modification operations (inserts, updates, and deletes),
only overlapping data modifications are used to determine
dependencies between transactions.

NDB$EPOCH_TRANS() is subject to the same
conditions and limitations as NDB$EPOCH(), and
in addition requires that Version 2 binary log row events are used
(--log-bin-use-v1-row-events equal
to 0), which adds a storage overhead of 2 bytes per event in the
binary log. In addition, all transaction IDs must be recorded in
the secondary's binary log
(--ndb-log-transaction-id option),
which adds a further variable overhead (up to 13 bytes per row).

Status information.
A server status variable
Ndb_conflict_fn_max provides a
count of the number of times that a row was not applied on the
current SQL node due to “greatest timestamp wins”
conflict resolution since the last time that
mysqld was started.

The number of times that a row was not applied as the result of
“same timestamp wins” conflict resolution on a given
mysqld since the last time it was restarted is
given by the global status variable
Ndb_conflict_fn_old. In addition
to incrementing
Ndb_conflict_fn_old, the primary
key of the row that was not used is inserted into an
exceptions table, as
explained later in this section.

NDB$EPOCH2().
The NDB$EPOCH2() function is similar to
NDB$EPOCH(), except that
NDB$EPOCH2() provides for delete-delete
handling with a circular replication
(“master-master”) topology. In this scenario,
primary and secondary roles are assigned to the two masters by
setting the
ndb_slave_conflict_role system
variable to the appropriate value on each master (usually one
each of PRIMARY,
SECONDARY). When this is done, modifications
made by the secondary are reflected by the primary back to the
secondary, which then conditionally applies them.

NDB$EPOCH2_TRANS(). NDB$EPOCH2_TRANS() extends the
NDB$EPOCH2() function. Conflicts are detected
and handled in the same way, with primary and secondary roles
assigned to the replicating clusters, but with the extra condition
that any other rows updated in the same transaction in which the
conflict occurred are also regarded as being in conflict. That
is, NDB$EPOCH2() realigns individual
conflicting rows on the secondary, while
NDB$EPOCH2_TRANS() realigns conflicting
transactions.

NDB$EPOCH() and
NDB$EPOCH_TRANS() use metadata that is
specified per row, per last modified epoch, to determine on the
primary whether an incoming replicated row change from the
secondary is concurrent with a locally committed change;
concurrent changes are regarded as conflicting, with subsequent
exceptions table updates and realignment of the secondary. A
problem arises when a row is deleted on the primary, since there
is then no longer any last-modified epoch available to determine
whether any replicated operations conflict with it, which means
that conflicting delete operations are not detected. This can
result in divergence; an example is a delete on one cluster that
is concurrent with a delete and insert on the other. This is why
delete operations can be routed to only one cluster when using
NDB$EPOCH() and
NDB$EPOCH_TRANS().

NDB$EPOCH2() bypasses the issue just
described—storing information about deleted rows on the
PRIMARY—by ignoring any delete-delete conflict, and by
avoiding any potential resultant divergence as well. This is
accomplished by reflecting any operation successfully applied on
and replicated from the secondary back to the secondary. On its
return to the secondary, it can be used to reapply an operation on
the secondary which was deleted by an operation originating from
the primary.

When using NDB$EPOCH2(), you should keep in
mind that the secondary applies the delete from the primary,
removing the new row until it is restored by a reflected
operation. In theory, the subsequent insert or update on the
secondary conflicts with the delete from the primary, but in this
case, we choose to ignore this and allow the secondary to
“win”, in the interest of preventing divergence
between the clusters. In other words, after a delete, the primary
does not detect conflicts, and instead adopts the secondary's
following changes immediately. Because of this, the
secondary's state can revisit multiple previous committed
states as it progresses to a final (stable) state, and some of
these may be visible.

You should also be aware that reflecting all operations from the
secondary back to the primary increases the size of the
primary's binary log, as well as demands on bandwidth, CPU
usage, and disk I/O.

Application of reflected operations on the secondary depends on
the state of the target row on the secondary. Whether or not
reflected changes are applied on the secondary can be tracked by
checking the
Ndb_conflict_reflected_op_prepare_count
and
Ndb_conflict_reflected_op_discard_count
status variables. The number of changes applied is simply the
difference between these two values (note that
Ndb_conflict_reflected_op_prepare_count is
always greater than or equal to
Ndb_conflict_reflected_op_discard_count).

Events are applied if and only if both of the following conditions
are true:

The existence of the row—that is, whether or not it
exists—is in accordance with the type of event. For
delete and update operations, the row must already exist; for
insert operations, the row must not
exist.

The row was last modified by the primary. It is possible that
the modification was accomplished through the execution of a
reflected operation.

If either of these conditions is not met, the reflected operation
is discarded by the secondary.

Conflict resolution exceptions table.
To use the NDB$OLD() conflict resolution
function, it is also necessary to create an exceptions table
corresponding to each NDB table for
which this type of conflict resolution is to be employed. This
is also true when using NDB$EPOCH() or
NDB$EPOCH_TRANS(). The name of this table is
that of the table for which conflict resolution is to be
applied, with the string $EX appended. (For
example, if the name of the original table is
mytable, the name of the corresponding
exceptions table should be mytable$EX.)
The syntax for creating the exceptions table is as shown here:
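In outline, the statement has the following general form; square
brackets indicate optional elements, and placeholder names such as
original_table and
original_table_pk_columns stand for names taken
from the original table:

CREATE TABLE original_table$EX  (
    [NDB$]server_id INT UNSIGNED,
    [NDB$]master_server_id INT UNSIGNED,
    [NDB$]master_epoch BIGINT UNSIGNED,
    [NDB$]count INT UNSIGNED,

    original_table_pk_columns,

    [orig_table_column|orig_table_column$OLD|orig_table_column$NEW,]

    [additional_columns,]

    PRIMARY KEY([NDB$]server_id, [NDB$]master_server_id,
                [NDB$]master_epoch, [NDB$]count)
)   ENGINE=NDB;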

The first four columns are required. The names of the first four
columns and the columns matching the original table's primary
key columns are not critical; however, we suggest for reasons of
clarity and consistency, that you use the names shown here for the
server_id, master_server_id,
master_epoch, and count
columns, and that you use the same names as in the original table
for the columns matching those in the original table's
primary key.

If the exceptions table uses one or more of the optional columns
NDB$OP_TYPE, NDB$CFT_CAUSE,
or NDB$ORIG_TRANSID discussed later in this
section, then each of the required columns must also be named
using the prefix NDB$. If desired, you can use
the NDB$ prefix to name the required columns
even if you do not define any optional columns, but in this case,
all four of the required columns must be named using the prefix.

Following these columns, the columns making up the original
table's primary key should be copied in the order in which
they are used to define the primary key of the original table. The
data types for the columns duplicating the primary key columns of
the original table should be the same as (or larger than) those of
the original columns. A subset of the primary key columns may be
used.

Regardless of the NDB Cluster version employed, the exceptions
table must use the NDB storage
engine. (An example that uses NDB$OLD() with an
exceptions table is shown later in this section.)

Additional columns may optionally be defined following the copied
primary key columns, but not before any of them; any such extra
columns cannot be NOT NULL. NDB Cluster
supports three additional, predefined optional columns
NDB$OP_TYPE, NDB$CFT_CAUSE,
and NDB$ORIG_TRANSID, which are described in
the next few paragraphs.

NDB$OP_TYPE: This column can be used to obtain
the type of operation causing the conflict. If you use this
column, define it as shown here:
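Its definition is an ENUM covering the
operation types described in the next paragraph:

NDB$OP_TYPE ENUM('WRITE_ROW', 'UPDATE_ROW', 'DELETE_ROW',
                 'REFRESH_ROW', 'READ_ROW') NOT NULL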

The WRITE_ROW, UPDATE_ROW,
and DELETE_ROW operation types represent
user-initiated operations. REFRESH_ROW
operations are operations generated by conflict resolution in
compensating transactions sent back to the originating cluster
from the cluster that detected the conflict.
READ_ROW operations are user-initiated read
tracking operations defined with exclusive row locks.

NDB$CFT_CAUSE: You can define an optional
column NDB$CFT_CAUSE which provides the cause
of the registered conflict. This column, if used, is defined as
shown here:
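Its definition is an ENUM covering the conflict
causes described in the next paragraph:

NDB$CFT_CAUSE ENUM('ROW_DOES_NOT_EXIST', 'ROW_ALREADY_EXISTS',
                   'DATA_IN_CONFLICT', 'TRANS_IN_CONFLICT') NOT NULL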

ROW_DOES_NOT_EXIST can be reported as the cause
for UPDATE_ROW and WRITE_ROW
operations; ROW_ALREADY_EXISTS can be reported
for WRITE_ROW events.
DATA_IN_CONFLICT is reported when a row-based
conflict function detects a conflict;
TRANS_IN_CONFLICT is reported when a
transactional conflict function rejects all of the operations
belonging to a complete transaction.

NDB$ORIG_TRANSID: The
NDB$ORIG_TRANSID column, if used, contains the
ID of the originating transaction. This column should be defined
as follows:

NDB$ORIG_TRANSID BIGINT UNSIGNED NOT NULL

NDB$ORIG_TRANSID is a 64-bit value generated by
NDB. This value can be used to correlate
multiple exceptions table entries belonging to the same
conflicting transaction from the same or different exceptions
tables.

Additional reference columns which are not part of the original
table's primary key can be named
colname$OLD or
colname$NEW.
colname$OLD
references old values in update and delete operations—that
is, operations containing DELETE_ROW events.
colname$NEW can be
used to reference new values in insert and update
operations—in other words, operations using
WRITE_ROW events, UPDATE_ROW
events, or both types of events. Where a conflicting operation
does not supply a value for a given non-primary-key reference
column, the exceptions table row contains either
NULL, or a defined default value for that
column.

Important

The mysql.ndb_replication table is read when
a data table is set up for replication, so the row corresponding
to a table to be replicated must be inserted into
mysql.ndb_replication before the table to be
replicated is created.

Inserting a 0 into the
server_id column indicates that all
SQL nodes accessing this table should use conflict resolution.
If you want to use conflict resolution on a specific
mysqld only, use the actual server ID.

Inserting NULL into the
binlog_type column has the same effect as
inserting 0 (NBT_DEFAULT); the server
default is used.

Possible values for the binlog_type column
are shown earlier in this section. The value
'NDB$OLD(mycol)' should be inserted into
the conflict_fn column.
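For example, to enable “same value wins” resolution on a
hypothetical column mycol of table
test.t2 for all SQL nodes, using the
server's default binary logging behavior, a row such as the
following could be inserted:

INSERT INTO mysql.ndb_replication
    VALUES ('test', 't2', 0, NULL, 'NDB$OLD(mycol)');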

Create an appropriate exceptions table for
test.t2. The table creation statement shown
here includes all required columns; any additional columns
must be declared following these columns, and before the
definition of the table's primary key.
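Since the definition of t2 itself is not given
here, assume that it has an integer primary key column
a; a minimal exceptions table could then be
created like this:

CREATE TABLE test.t2$EX  (
    server_id INT UNSIGNED,
    master_server_id INT UNSIGNED,
    master_epoch BIGINT UNSIGNED,
    count INT UNSIGNED,
    a INT NOT NULL,
    PRIMARY KEY(server_id, master_server_id, master_epoch, count)
)   ENGINE=NDB;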

We can include additional columns for information about the
type, cause, and originating transaction ID for a given
conflict. We are also not required to supply matching columns
for all primary key columns in the original table. This means
you can create the exceptions table like this:
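Continuing with the hypothetical primary key column
a, an exceptions table using the
NDB$ prefix and the optional columns might look
like this:

CREATE TABLE test.t2$EX  (
    NDB$server_id INT UNSIGNED,
    NDB$master_server_id INT UNSIGNED,
    NDB$master_epoch BIGINT UNSIGNED,
    NDB$count INT UNSIGNED,
    a INT NOT NULL,
    NDB$OP_TYPE ENUM('WRITE_ROW', 'UPDATE_ROW', 'DELETE_ROW',
                     'REFRESH_ROW', 'READ_ROW') NOT NULL,
    NDB$CFT_CAUSE ENUM('ROW_DOES_NOT_EXIST', 'ROW_ALREADY_EXISTS',
                       'DATA_IN_CONFLICT', 'TRANS_IN_CONFLICT') NOT NULL,
    NDB$ORIG_TRANSID BIGINT UNSIGNED NOT NULL,
    PRIMARY KEY(NDB$server_id, NDB$master_server_id,
                NDB$master_epoch, NDB$count)
)   ENGINE=NDB;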

The NDB$ prefix is required for the four
required columns since we included at least one of the
columns NDB$OP_TYPE,
NDB$CFT_CAUSE, or
NDB$ORIG_TRANSID in the table definition.

Create the table test.t2 as shown
previously.

These steps must be followed for every table for which you wish to
perform conflict resolution using NDB$OLD().
For each such table, there must be a corresponding row in
mysql.ndb_replication, and there must be an
exceptions table in the same database as the table being
replicated.

Read conflict detection and resolution.
NDB Cluster also supports tracking of read operations, which
makes it possible in circular replication setups to manage
conflicts between reads of a given row in one cluster and
updates or deletes of the same row in another. This example uses
employee and department
tables to model a scenario in which an employee is moved from
one department to another on the master cluster (which we refer
to hereafter as cluster A) while the slave
cluster (hereafter B) updates the employee
count of the employee's former department in an interleaved
transaction.

We assume that we are already using an exceptions table that
includes the four required columns (and these are used for this
table's primary key), the optional columns for operation type
and cause, and the original table's primary key column,
created using the SQL statement shown here:
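A statement along these lines satisfies that description; the
exact type of the id column is an assumption:

CREATE TABLE employee$EX  (
    NDB$server_id INT UNSIGNED,
    NDB$master_server_id INT UNSIGNED,
    NDB$master_epoch BIGINT UNSIGNED,
    NDB$count INT UNSIGNED,
    NDB$OP_TYPE ENUM('WRITE_ROW', 'UPDATE_ROW', 'DELETE_ROW',
                     'REFRESH_ROW', 'READ_ROW') NOT NULL,
    NDB$CFT_CAUSE ENUM('ROW_DOES_NOT_EXIST', 'ROW_ALREADY_EXISTS',
                       'DATA_IN_CONFLICT', 'TRANS_IN_CONFLICT') NOT NULL,
    id INT NOT NULL,
    PRIMARY KEY(NDB$server_id, NDB$master_server_id,
                NDB$master_epoch, NDB$count)
)   ENGINE=NDB;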

Suppose there occur the two simultaneous transactions on the two
clusters. On cluster A, we create a new
department, then move employee number 999 into that department,
using the following SQL statements:
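A transaction of the following form performs these changes; the
values inserted for the new department row are assumptions:

BEGIN;
INSERT INTO department VALUES (4, 'New project', 1);
UPDATE employee SET dept = 4 WHERE id = 999;
COMMIT;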

At the same time, on cluster B, another
transaction reads from employee, as shown here:

BEGIN;
SELECT name FROM employee WHERE id = 999;
UPDATE department SET members = members - 1 WHERE id = 3;
COMMIT;

The conflicting transactions are not normally detected by the
conflict resolution mechanism, since the conflict is between a
read (SELECT) and an update operation. You can
circumvent this issue by executing
SET ndb_log_exclusive_reads = 1 on the slave
cluster. Acquiring exclusive
read locks in this way causes any rows read on the master to be
flagged as needing conflict resolution on the slave cluster. If we
enable exclusive reads in this way prior to the logging of these
transactions, the read on cluster B is
tracked and sent to cluster A for resolution;
the conflict on the employee row will be detected and the
transaction on cluster B is aborted.

The conflict is registered in the exceptions table (on cluster
A) as a READ_ROW operation
(see Conflict resolution exceptions table
for a description of operation types).

Any existing rows found in the read operation are flagged. This
means that multiple rows resulting from the same conflict may be
logged in the exception table, as shown by examining the effects a
conflict between an update on cluster A and a
read of multiple rows on cluster B from the
same table in simultaneous transactions. The transaction executed
on cluster A is shown here:
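A transaction of the following general form performs such an
update; the exact statement is an assumption, and what matters is
that it modifies the rows read on cluster B:

BEGIN;
UPDATE employee SET dept = 4 WHERE dept = 3;
COMMIT;

At the same time, the following transaction, with exclusive read
logging enabled, runs on cluster B: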

SET ndb_log_exclusive_reads = 1; # Must be set if not already enabled
...
BEGIN;
SELECT COUNT(*) INTO @count FROM employee WHERE dept = 3 FOR UPDATE;
UPDATE department SET members = @count WHERE id = 3;
COMMIT;

In this case, all three rows matching the WHERE
condition in the second transaction's
SELECT are read, and are thus flagged in the
exceptions table.

Read tracking is performed on the basis of existing rows only. A
read based on a given condition tracks conflicts only for rows
that are found, and not for any rows that are
inserted in an interleaved transaction. This is similar to the way
exclusive row locking is performed in a single instance of NDB
Cluster.