Converting a transaction's implicit lock to an explicit lock is costly. It impacts
both the lock_sys_t::mutex and the trx_sys_t::mutex. The cost comes from the
check that determines whether a transaction that inserted a new record is still
active. If it is active, an explicit record lock must be created for that
transaction. The current design does the following:
  Acquire the lock_sys_t::mutex
    Acquire the trx_sys_t::mutex
    Scan the trx_sys_t::rw_trx_list for trx_id_t (only RW transactions can insert)
    Release the trx_sys_t::mutex
    Return handle if transaction found
    if handle found then
      do an implicit to explicit record conversion
    endif
  Release the lock_sys_t::mutex
The pseudocode above shows that the time spent holding the lock_sys_t::mutex
grows in proportion to the length of the trx_sys_t::rw_trx_list. This causes a
sharp drop in TPS at higher concurrency, e.g., 1K RW threads in Sysbench.
The solution is:
  Acquire the trx_sys_t::mutex
    Scan the trx_sys_t::rw_trx_list for trx_id_t (only RW transactions can insert)
    if handle found then
      Acquire the trx_t::mutex
      Increment trx_t::n_ref_count
      Release the trx_t::mutex
    endif
  Release the trx_sys_t::mutex
  Return handle if transaction found
  if handle found then
    Acquire the lock_sys_t::mutex
    do an implicit to explicit record conversion
    Release the lock_sys_t::mutex
    Acquire the trx_t::mutex
    Decrement trx_t::n_ref_count
    Release the trx_t::mutex
  endif
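The revised sequence above can be sketched in C++ roughly as follows. This is a
minimal illustration, not the InnoDB implementation: Trx, TrxSys, LockSys,
find_and_pin(), unpin() and convert_impl_to_expl() are hypothetical stand-ins,
std::mutex stands in for the InnoDB mutexes, and the actual lock creation is
reduced to a counter.

```cpp
#include <cassert>
#include <cstdint>
#include <list>
#include <mutex>

// Hypothetical stand-in for trx_t.
struct Trx {
    std::uint64_t id;
    std::mutex mutex;               // stands in for trx_t::mutex
    unsigned long n_ref_count = 0;  // protected by mutex
    explicit Trx(std::uint64_t trx_id) : id(trx_id) {}
};

// Hypothetical stand-in for trx_sys_t.
struct TrxSys {
    std::mutex mutex;               // stands in for trx_sys_t::mutex
    std::list<Trx*> rw_trx_list;    // active read-write transactions
};

// Hypothetical stand-in for lock_sys_t.
struct LockSys {
    std::mutex mutex;                   // stands in for lock_sys_t::mutex
    unsigned long n_explicit_locks = 0; // placeholder for real lock creation
};

// Scan the RW list under the trx_sys mutex and pin the transaction if found.
// The lock_sys mutex is not touched here.
Trx* find_and_pin(TrxSys& trx_sys, std::uint64_t trx_id) {
    std::lock_guard<std::mutex> sys_guard(trx_sys.mutex);
    for (Trx* trx : trx_sys.rw_trx_list) {
        if (trx->id == trx_id) {
            std::lock_guard<std::mutex> trx_guard(trx->mutex);
            ++trx->n_ref_count;
            return trx;
        }
    }
    return nullptr;
}

// Drop a reference taken by find_and_pin().
void unpin(Trx* trx) {
    std::lock_guard<std::mutex> trx_guard(trx->mutex);
    assert(trx->n_ref_count > 0);
    --trx->n_ref_count;
}

// Take the lock_sys mutex only when an active transaction was found.
bool convert_impl_to_expl(TrxSys& trx_sys, LockSys& lock_sys,
                          std::uint64_t trx_id) {
    Trx* trx = find_and_pin(trx_sys, trx_id);
    if (trx == nullptr) {
        return false;  // not active: nothing to convert
    }
    {
        std::lock_guard<std::mutex> lock_guard(lock_sys.mutex);
        ++lock_sys.n_explicit_locks;  // the conversion would happen here
    }
    unpin(trx);
    return true;
}
```

In this sketch the list scan never runs under the lock_sys mutex, so the time
that mutex is held no longer depends on the length of the RW list.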
During commit we do the following check:
  Acquire the trx_t::mutex
  if trx_t::n_ref_count > 0
    while (trx_t::n_ref_count > 0)
      Release the trx_t::mutex
      sleep/delay
      Acquire the trx_t::mutex
    end while
  endif
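The commit-side wait can be sketched in C++ as below. This is an illustrative
sketch under assumptions, not the InnoDB code: Trx, release_reference(),
short_sleep() and wait_until_unreferenced() are hypothetical names, the delay is
passed in as a callable so the loop can be exercised deterministically, and
unlike the pseudocode the helper releases the mutex when it returns.

```cpp
#include <cassert>
#include <chrono>
#include <mutex>
#include <thread>

// Hypothetical stand-in for trx_t.
struct Trx {
    std::mutex mutex;               // stands in for trx_t::mutex
    unsigned long n_ref_count = 0;  // protected by mutex
};

// Drop one reference while holding the transaction mutex.
void release_reference(Trx& trx) {
    std::lock_guard<std::mutex> guard(trx.mutex);
    assert(trx.n_ref_count > 0);
    --trx.n_ref_count;
}

// A possible delay function; the duration is an arbitrary assumption.
inline void short_sleep() {
    std::this_thread::sleep_for(std::chrono::microseconds(10));
}

// Wait until no other thread holds a reference to the transaction.
// The mutex is released around each delay so that referencing threads
// can acquire it and decrement the count.
// Returns the number of delays that were needed.
template <typename Delay>
unsigned long wait_until_unreferenced(Trx& trx, Delay delay) {
    unsigned long n_delays = 0;
    std::unique_lock<std::mutex> guard(trx.mutex);
    while (trx.n_ref_count > 0) {
        guard.unlock();
        delay();
        ++n_delays;
        guard.lock();
    }
    return n_delays;  // guard releases the mutex on return
}
```

A committing thread would call wait_until_unreferenced(trx, short_sleep) before
releasing its locks. A transaction that is not referenced returns immediately
without any delay.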

When converting an implicit lock to an explicit lock we first acquire the
lock_sys_t::mutex and then traverse over the trx_sys_t::rw_trx_list to check if
the transaction is active. We also acquire the trx_sys_t::mutex before we traverse
the list to maintain correctness. As the list grows this puts a lot of pressure on
the lock_sys_t::mutex. The lock_sys_t::mutex is needed to ensure that if an active
transaction instance is found it is not committed or rolled back while we are
holding a pointer to it. During commit/rollback we will acquire the
lock_sys_t::mutex before we release any locks.
The fix is to use a reference count on the trx_t instance. When we find a
transaction in the trx_sys_t::rw_trx_list we will acquire the trx_t::mutex
and increment the reference count.
In lock_rec_convert_impl_to_expl() we will first check and increment the reference
count of the transaction if found. Only if the transaction is found do we need to
acquire the lock_sys_t::mutex. Later after converting the implicit lock to an
explicit lock we decrement the reference count.
In lock_trx_release_locks(), after changing the trx_t state to
TRX_STATE_COMMITTED_IN_MEMORY, we will wait for this reference count to become
zero before we release any locks.
Use the same technique in row_vers_impl_x_locked_low(): modify it so that it
returns the trx instance and not the trx_id. This should avoid excessive scanning
of the rw_trx_list.

Add a reference count field to trx_t:
ulint	n_ref_count;	/*!< Count of references, protected
			by trx_t::mutex. We can't release the
			locks nor commit the transaction until
			this reference count is 0. We can change
			the state to COMMITTED_IN_MEMORY to
			signify that it is no longer
			"active". */
Add 3 new functions that use this field:
/**
Increase the reference count. If the transaction is in state
TRX_STATE_COMMITTED_IN_MEMORY then the transaction is considered
committed and the reference count is not incremented.
@param trx Transaction that is being referenced
@param do_ref_count Increment the reference iff this is true
@return transaction instance if it is not committed */
UNIV_INLINE
trx_t*
trx_reference(
	trx_t*	trx,
	bool	do_ref_count);
/**
Release the transaction. Decrease the reference count.
@param trx Transaction that is being released */
UNIV_INLINE
void
trx_release_reference(
	trx_t*	trx);
/**
Check if the transaction is being referenced. */
#define trx_is_referenced(t) ((t)->n_ref_count > 0)
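One plausible shape for the bodies of these functions is sketched below. It is
only an illustration of the documented contract: the types in namespace sketch
are simplified stand-ins (the real trx_t has more states and fields), std::mutex
replaces trx_t::mutex, returning a null pointer for a committed transaction is an
assumption, and trx_is_referenced is written as a function that takes the mutex
instead of the macro above.

```cpp
#include <cassert>
#include <mutex>

namespace sketch {

// Simplified stand-in for the transaction state; only two states are modeled.
enum trx_state_t { TRX_STATE_ACTIVE, TRX_STATE_COMMITTED_IN_MEMORY };

// Simplified stand-in for trx_t.
struct trx_t {
    std::mutex mutex;                      // stands in for trx_t::mutex
    trx_state_t state = TRX_STATE_ACTIVE;
    unsigned long n_ref_count = 0;         // protected by mutex
};

// Return the transaction if it is not committed, optionally taking a
// reference. A committed transaction yields nullptr (assumed here) and
// its count is left untouched.
inline trx_t* trx_reference(trx_t* trx, bool do_ref_count) {
    std::lock_guard<std::mutex> guard(trx->mutex);
    if (trx->state == TRX_STATE_COMMITTED_IN_MEMORY) {
        return nullptr;
    }
    if (do_ref_count) {
        ++trx->n_ref_count;
    }
    return trx;
}

// Drop a reference previously taken with trx_reference(trx, true).
inline void trx_release_reference(trx_t* trx) {
    std::lock_guard<std::mutex> guard(trx->mutex);
    assert(trx->n_ref_count > 0);
    --trx->n_ref_count;
}

// Function form of the check, reading the count under the mutex.
inline bool trx_is_referenced(trx_t* trx) {
    std::lock_guard<std::mutex> guard(trx->mutex);
    return trx->n_ref_count > 0;
}

}  // namespace sketch
```

Checking the state and incrementing the count under the same mutex means a
caller either gets a pinned, uncommitted transaction or gets nothing.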
When a running transaction needs to do an implicit to explicit record lock
conversion for another transaction, it first has to check whether the other
transaction that owns the implicit record lock is still active. This check
requires traversal of the trx_sys_t::rw_trx_list while holding the
trx_sys_t::mutex. A transaction cannot commit while this mutex is being held.
During the traversal if an active transaction instance is found for the
corresponding trx_id_t that owns the implicit record lock, acquire the
trx_t::mutex then increment the trx_t::n_ref_count. This will guarantee that the
transaction cannot be committed until the trx_t::n_ref_count drops to zero. The
decrement of the trx_t::n_ref_count is done while holding the trx_t::mutex too.
Before we release a read-write transaction's locks we check the reference count
and do a busy wait for the trx_t::n_ref_count to drop to 0. The assumption behind
the busy wait is that the implicit to explicit conversion is a short operation and
that the busy wait should be sufficient.