Mutexes in C, and compatibility with C++

Background

The current C1x draft (WG14 N1425) has a single mutex type
(mtx_t), the properties of which are specified when
each instance is initialized. On the other hand, the C++0x draft
(WG21 N3000) has 4 distinct mutex types
(std::mutex, std::timed_mutex, std::recursive_mutex
and std::recursive_timed_mutex), each with predefined
properties.

This discrepancy was highlighted by Lawrence Crowl in his paper
on C and C++ compatibility in the thread library (WG14 N1423, WG21
N2985), and has since led to much discussion on the C and C++
compatibility reflector, and a series of informal
teleconferences. This paper has come out of these discussions.

Issues

There are two primary issues at stake here:

Sharing of mutexes between C and C++ code, and

Performance of mutexes in C code.

I'll look at each of these in turn.

Compatibility with C++

The usual way for both C and C++ code to operate on a single
type whilst retaining C++ idioms for C++ code is to have the C++
member functions map to C global functions that take the object
address as the first parameter. However, this presupposes that
there is a single common type that shares the same set of
operations.

For the mutexes as currently provided in the two draft
standards this is not the case. Though it is possible to implement
the C++ mutex types in terms of the C mutex type, or indeed to
implement the C mutex type in terms of the C++ types, the
disparate types, and the disparate set of operations that can be
applied means that there is no one-to-one mapping that can be
made.

If the C mtx_t type were to be mapped to the
C++ std::mutex type, then C++ code would not be able
to take advantage of the timed-wait operations on that mutex, even
if the mutex was initialized in C to allow such operations by
passing the mtx_timed flag
to mtx_init. Conversely, the
C++ std::mutex type always
supports try_lock operations, which would only be
supported by the C mtx_t type if
the mtx_try flag was passed
to mtx_init.

Similar problems occur if you map the C mtx_t type
to the C++ std::recursive_timed_mutex type: though
the C++ code can now access every operation theoretically
supported by the C type, this may be a lie in practice if
the mtx_t was not initialized with the full set of
flags.

Such a lie is more of a problem in C++ code than in C code,
since the distinct mutex types guarantee to support the specified
operations, whereas a C programmer must be aware that
the mtx_t type doesn't necessarily support all the
operations, and so must ensure that the necessary flags have
been supplied.
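As a rough illustration of the burden this places on the C programmer, the following sketch models a mutex whose capabilities are fixed by flags at initialization. The demo_* names and flag values are hypothetical stand-ins for the draft's mtx_* flags, not the real library, and no actual locking is performed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the draft's mtx_plain, mtx_recursive,
   mtx_timed and mtx_try flags; the values are arbitrary. */
enum demo_mtx_flags {
    DEMO_MTX_PLAIN     = 0,
    DEMO_MTX_RECURSIVE = 1 << 0,
    DEMO_MTX_TIMED     = 1 << 1,
    DEMO_MTX_TRY       = 1 << 2
};

/* Only the capability flags are modelled here. */
typedef struct {
    int flags;   /* capabilities chosen when the instance was initialized */
} demo_mtx;

static void demo_mtx_init(demo_mtx *m, int flags)
{
    assert(m != NULL);
    m->flags = flags;
}

/* True only if the instance was initialized with every flag in `needed`. */
static bool demo_mtx_supports(const demo_mtx *m, int needed)
{
    return (m->flags & needed) == needed;
}
```

The type alone says nothing about what is allowed: two demo_mtx objects are indistinguishable to the compiler, so code that receives one must either check it with something like demo_mtx_supports or trust whoever initialized it.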

Condition Variables

The compatibility of mutex types doesn't just affect mutexes;
it affects condition variables too. On the C++
side, condition_variable::wait() requires that
a unique_lock<mutex> be passed, whereas on
the C side a mtx_t* must be passed
to cnd_wait. Combined with the requirement that all
calls to wait() pass the same mutex, this is
clearly a problem if a single condition variable is to be shared
between C and C++ code. If the single mtx_t is
replaced with four distinct mutex types for compatibility with
C++ then this problem goes away:
the cnd_wait function can just take
a mutex_t* for symmetry with the C++ code.
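The requirement that every waiter pass the same mutex can be sketched in C as follows. Since the draft API is not widely available, this sketch assumes POSIX threads as a stand-in: pthread_cond_wait plays the role of cnd_wait and pthread_mutex_t the role of the plain mutex type. The demo_* names are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical shared state: one mutex, one condition variable, one predicate. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  ready_cv;
    bool            ready;
} demo_channel;

static void *demo_signaller(void *arg)
{
    demo_channel *ch = arg;
    pthread_mutex_lock(&ch->lock);
    ch->ready = true;                      /* change the predicate under the mutex */
    pthread_cond_signal(&ch->ready_cv);
    pthread_mutex_unlock(&ch->lock);
    return NULL;
}

/* Starts a signalling thread and waits for it; returns true once the flag is seen. */
static bool demo_wait_for_ready(void)
{
    demo_channel ch;
    pthread_t signaller;
    bool seen;
    int rc;

    ch.ready = false;
    if (pthread_mutex_init(&ch.lock, NULL) != 0)
        return false;
    if (pthread_cond_init(&ch.ready_cv, NULL) != 0) {
        pthread_mutex_destroy(&ch.lock);
        return false;
    }
    if (pthread_create(&signaller, NULL, demo_signaller, &ch) != 0) {
        pthread_cond_destroy(&ch.ready_cv);
        pthread_mutex_destroy(&ch.lock);
        return false;
    }

    pthread_mutex_lock(&ch.lock);
    while (!ch.ready)                               /* loop guards against spurious wakeups */
        pthread_cond_wait(&ch.ready_cv, &ch.lock);  /* every waiter passes this same mutex */
    seen = ch.ready;
    pthread_mutex_unlock(&ch.lock);

    rc = pthread_join(signaller, NULL);
    assert(rc == 0);
    (void)rc;
    pthread_cond_destroy(&ch.ready_cv);
    pthread_mutex_destroy(&ch.lock);
    return seen;
}
```

The wait call is handed the address of one specific mutex object, so C and C++ code sharing the condition variable must agree on the type of that object.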

Performance of C code

Though this paper arose out of the desire for compatibility
between C and C++, I think that the performance of C code that
uses mutexes is potentially a more important issue. This
performance issue is one of the arguments that led to the
provision of distinct mutex types in the C++ standard, and the
issues are just as important for C as for C++.

The key issue is this: on some platforms adding support for
locks with timeouts and/or support for recursive locking is
expensive both in terms of the size of the mutex object and in
terms of the performance of operations upon that object. There is
thus considerable advantage to be gained from having distinct
types: you only pay the cost of the additional operations if you
need them.

The C draft goes some way to acknowledging this fact by allowing
for mutexes with distinct properties which are specified when each
instance is initialized. However, this limits the scope for such
performance gains.

In the first instance, it means that either
the mtx_t type must be large enough to encompass all
the data required for the largest mutex variant, or there must be
some amount of dynamic storage allocated in such cases (thus
adding to the cost of using the mutex).

Secondly, each operation for which the code varies between the
mutex variants must now perform some kind of test and branch to
check which piece of code to use: does this mutex support timed
operations (which therefore requires the use of the slow code with
support for such operations), or does it not (so we can use the
fast code without such support)? Does this mutex support recursive
operations (so we must call the slow code that maintains the
requisite tracking information), or does it not (so we can omit
the code to handle this tracking information)? Whether this is
coded as a test-and-branch or a call through a function pointer,
there is still an overhead. The same applies for each possible
feature set — on several platforms the support of locks with
timeouts requires a more complex implementation, so even the plain
locking code must change. This is apparent in the Boost
implementation of boost::timed_mutex (see the source
code for POSIX platforms
at https://svn.boost.org/trac/boost/browser/branches/release/boost/thread/pthread/mutex.hpp).
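The two costs described above, object size and the per-operation test and branch, can be sketched with a toy spin lock built on C11 atomics. This is only an illustration under stated assumptions: the demo_* names are hypothetical, thread identity is passed in by the caller as a non-zero number, and a real implementation would block rather than spin.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Dedicated plain mutex: nothing but the lock word. */
typedef struct { atomic_flag locked; } demo_plain_mutex;

/* Run-time configured mutex: carries the fields every variant might need. */
typedef struct {
    atomic_flag           locked;
    bool                  recursive;  /* chosen at initialization */
    _Atomic unsigned long owner;      /* 0 means "no owner"; used only if recursive */
    unsigned              depth;      /* touched only by the owning thread */
} demo_flex_mutex;

static void demo_plain_init(demo_plain_mutex *m) { atomic_flag_clear(&m->locked); }

static void demo_plain_lock(demo_plain_mutex *m)
{
    while (atomic_flag_test_and_set_explicit(&m->locked, memory_order_acquire))
        ;  /* spin */
}

static void demo_plain_unlock(demo_plain_mutex *m)
{
    atomic_flag_clear_explicit(&m->locked, memory_order_release);
}

static void demo_flex_init(demo_flex_mutex *m, bool recursive)
{
    atomic_flag_clear(&m->locked);
    m->recursive = recursive;
    atomic_init(&m->owner, 0UL);
    m->depth = 0;
}

/* `self` is a caller-supplied non-zero thread identifier (an assumption of this sketch). */
static void demo_flex_lock(demo_flex_mutex *m, unsigned long self)
{
    if (m->recursive &&                                   /* the extra test and branch */
        atomic_load_explicit(&m->owner, memory_order_relaxed) == self) {
        ++m->depth;                                       /* already ours: just count */
        return;
    }
    while (atomic_flag_test_and_set_explicit(&m->locked, memory_order_acquire))
        ;  /* spin */
    if (m->recursive) {
        atomic_store_explicit(&m->owner, self, memory_order_relaxed);
        m->depth = 1;
    }
}

static void demo_flex_unlock(demo_flex_mutex *m)
{
    if (m->recursive) {
        assert(m->depth > 0);
        if (--m->depth > 0)
            return;                                       /* still held recursively */
        atomic_store_explicit(&m->owner, 0UL, memory_order_relaxed);
    }
    atomic_flag_clear_explicit(&m->locked, memory_order_release);
}
```

Even when a demo_flex_mutex is initialized as non-recursive, every lock and unlock still evaluates the recursive test, and every instance still reserves space for owner and depth. The dedicated demo_plain_mutex pays for neither.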

Real-world performance

As a comparison, I have implemented both the mtx_t
from the current C1x working draft and my proposed C++-compatible
mutex types for Linux. On a simple lock-unlock of an uncontended
mutex the performance benefit of mutex_t
over mtx_t is clearly noticeable: on my system,
100000000 lock/unlock pairs take 4.18s for mutex_t,
4.97s for mtx_t and 5.56s
for pthread_mutex_t (implying 41.8ns vs 49.7ns vs
55.6ns per lock/unlock pair). The separate mutex_t thus
takes roughly 25% less time than pthread_mutex_t and
roughly 16% less time than mtx_t (which differs only by a
test and branch), which is not to be sniffed at.
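The per-pair figures are simply the total run time divided by the number of iterations. The following sketch shows that conversion, together with one way an uncontended timing loop could be written for pthread_mutex_t. It is not the benchmark used for the numbers above: the demo_* names are hypothetical, and the use of clock_gettime with CLOCK_MONOTONIC is an assumption.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* for clock_gettime and CLOCK_MONOTONIC */
#endif
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <time.h>

/* Convert a total run time into nanoseconds per lock/unlock pair. */
static double demo_ns_per_pair(double total_seconds, unsigned long pairs)
{
    assert(pairs > 0);
    return total_seconds * 1e9 / (double)pairs;
}

/* Time `pairs` uncontended lock/unlock pairs on a pthread_mutex_t.
   Returns the elapsed time in seconds, or -1.0 if setup fails. */
static double demo_time_pthread_mutex(unsigned long pairs)
{
    pthread_mutex_t m;
    struct timespec start, end;

    if (pthread_mutex_init(&m, NULL) != 0)
        return -1.0;
    if (clock_gettime(CLOCK_MONOTONIC, &start) != 0) {
        pthread_mutex_destroy(&m);
        return -1.0;
    }
    for (unsigned long i = 0; i < pairs; ++i) {
        pthread_mutex_lock(&m);      /* never contended: only one thread runs */
        pthread_mutex_unlock(&m);
    }
    if (clock_gettime(CLOCK_MONOTONIC, &end) != 0) {
        pthread_mutex_destroy(&m);
        return -1.0;
    }
    pthread_mutex_destroy(&m);

    return (double)(end.tv_sec - start.tv_sec)
         + (double)(end.tv_nsec - start.tv_nsec) / 1e9;
}
```

For example, demo_ns_per_pair(4.18, 100000000UL) gives approximately 41.8, matching the mutex_t figure above. The other mutex types would each need their own loop of the same shape.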

My tests also show the timings for recursive mutex locking,
with two locks followed by two unlocks for each cycle. In this case,
my recursive_mutex_t implementation yielded 5.31s for
the 100000000 cycles, whereas pthread_mutex_t took 7.14s
and my mtx_t took 5.94s. Again we see the cost of the
extra test-and-branch to optimize the non-recursive case
of mtx_t, in that it is slower than a
plain recursive_mutex_t, but it is still faster than the
use of pthread_mutex_t.

Note that my tests have focused on the uncontended case: in the
contended case, the cost of a blocking call will tend to mask
such small differences in performance. All tests were run on
Kubuntu 9.10 x64 with an Intel Core 2 Duo 1.83GHz CPU.

Proposed wording

The following edits are based on the C1x draft in WG14 paper
N1425.

Remove the type mtx_t from 7.24p3, and replace it with
the
types mutex_t, recursive_mutex_t, timed_mutex_t
and recursive_timed_mutex_t:

The types are

cnd_t

which is an object type that holds an identifier for a condition variable;

thrd_t

which is an object type that holds an identifier for a thread;

tss_t

which is an object type that holds an identifier for a thread-specific
storage pointer;

mutex_t

which is an object type that holds an identifier for
a plain mutex;

recursive_mutex_t

which is an object type that holds an identifier for
a recursive mutex;

timed_mutex_t

which is an object type that holds an identifier for
a mutex that supports locks with timeouts;

recursive_timed_mutex_t

which is an object type that holds an identifier for
a recursive mutex that supports locks with timeouts;

tss_dtor_t

which is the function pointer type void (*)(void*), used for a destructor for a
thread-specific storage pointer;

thrd_start_t

which is the function pointer type int (*)(void*) that is passed to thrd_create
to create a new thread;

once_flag

which is an object type that holds a flag for use by call_once; and

xtime

which is a structure type that holds a time specified in seconds and nanoseconds. The
structure shall contain at least the following members, in any order.

time_t sec;
long nsec;

Remove the mutex type enumeration constants mtx_plain, mtx_recursive, mtx_timed and mtx_try from 7.24p4:

The enumeration constants are

mtx_plain

which is passed to mtx_init to create a mutex object that supports neither timeout nor
test and return;

mtx_recursive

which is passed to mtx_init to create a mutex object that supports recursive locking;

mtx_timed

which is passed to mtx_init to create a mutex object that supports timeout;

mtx_try

which is passed to mtx_init to create a mutex object that
supports test and return;

thrd_timeout

which is returned by a timed wait function to indicate that the time specified in the call
was reached without acquiring the requested resource;

thrd_success

which is returned by a function to indicate that the requested operation succeeded;

thrd_busy

which is returned by a function to indicate that the requested operation failed because a
resource requested by a test and return function is already in use;

thrd_error

which is returned by a function to indicate that the requested operation failed; and

thrd_nomem

which is returned by a function to indicate that the requested operation failed because it
was unable to allocate memory.

Change the references to mtx_t in 7.24.3.5 and
7.24.3.6 to mutex_t:

7.24.3.5 The cnd_timedwait function

Synopsis

int cnd_timedwait(cnd_t *cond, mutex_t *mtx, const xtime *xt);

7.24.3.6 The cnd_wait function

Synopsis

int cnd_wait(cnd_t *cond, mutex_t *mtx);

Replace the entirety of 7.24.4 with the following:

7.24.4 Mutex functions

7.24.4.1 The mutex_destroy function

Synopsis

void mutex_destroy(mutex_t *mtx);

Description

The mutex_destroy function releases any resources used by the mutex pointed to by
mtx. No threads can be blocked waiting for the mutex pointed to by mtx.

Returns

The mutex_destroy function returns no
value.

7.24.4.2 The mutex_init function

Synopsis

int mutex_init(mutex_t *mtx);

Description

The mutex_init function creates a mutex object. If
the mutex_init function succeeds, the mutex pointed
to by mtx is initialized to a valid mutex that is
distinct from all other mutexes in the program.

Returns

The mutex_init function
returns thrd_success on success,
or thrd_error if the request could not be
honored.

7.24.4.3 The mutex_lock function

Synopsis

int mutex_lock(mutex_t *mtx);

Description

The mutex_lock function blocks until it locks the mutex pointed to by mtx. The mutex shall not be locked by the calling thread. Prior calls to mutex_unlock
on the same mutex shall synchronize with this operation.

Returns

The mutex_lock function
returns thrd_success on success,
or thrd_busy if the resource requested is already in use,
or thrd_error if the request could not be
honored.

7.24.4.4 The mutex_try_lock function

Synopsis

int mutex_try_lock(mutex_t *mtx);

Description

The mutex_try_lock function endeavors to lock the mutex pointed to by mtx. If the
mutex is already locked, the function returns without blocking. Prior calls to
mutex_unlock on the same mutex shall synchronize with this operation.

Returns

The mutex_try_lock function returns thrd_success on success, or thrd_busy if
the resource requested is already in use, or thrd_error if the request could not be
honored.

7.24.4.5 The mutex_unlock function

Synopsis

int mutex_unlock(mutex_t *mtx);

Description

The mutex_unlock function unlocks the mutex pointed to by mtx. The mutex pointed to
by mtx shall be locked by the calling thread.

Returns

The mutex_unlock function returns thrd_success on success or thrd_error if
the request could not be honored.

7.24.4.6 The recursive_mutex_destroy function

Synopsis

void recursive_mutex_destroy(recursive_mutex_t *mtx);

Description

The recursive_mutex_destroy function releases any resources used by the mutex pointed to by
mtx. No threads can be blocked waiting for the mutex pointed to by mtx.

Returns

The recursive_mutex_destroy function returns no
value.

7.24.4.7 The recursive_mutex_init function

Synopsis

int recursive_mutex_init(recursive_mutex_t *mtx);

Description

The recursive_mutex_init function creates a mutex object. If
the recursive_mutex_init function succeeds, the mutex pointed
to by mtx is initialized to a valid mutex that is
distinct from all other mutexes in the program.

Returns

The recursive_mutex_init function
returns thrd_success on success,
or thrd_error if the request could not be
honored.

7.24.4.8 The recursive_mutex_lock function

Synopsis

int recursive_mutex_lock(recursive_mutex_t *mtx);

Description

The recursive_mutex_lock function blocks until it
locks the mutex pointed to by mtx. If the mutex is
already locked by the calling thread then the call shall return
without blocking. Prior calls to recursive_mutex_unlock
on the same mutex shall synchronize with this operation.

Returns

The recursive_mutex_lock function
returns thrd_success on success,
or thrd_busy if the resource requested is already in use,
or thrd_error if the request could not be
honored.

7.24.4.9 The recursive_mutex_try_lock function

Synopsis

int recursive_mutex_try_lock(recursive_mutex_t *mtx);

Description

The recursive_mutex_try_lock function endeavors to
lock the mutex pointed to by mtx. If the mutex is already
locked, the function returns without blocking. Prior calls to
recursive_mutex_unlock on the same mutex shall
synchronize with this operation.

Returns

The recursive_mutex_try_lock function
returns thrd_success on success,
or thrd_busy if the resource requested is already in use,
or thrd_error if the request could not be
honored.

7.24.4.10 The recursive_mutex_unlock function

Synopsis

int recursive_mutex_unlock(recursive_mutex_t *mtx);

Description

The recursive_mutex_unlock function unlocks the
mutex pointed to by mtx. The mutex pointed to
by mtx shall be locked by the calling thread.

Returns

The recursive_mutex_unlock function returns thrd_success on success or thrd_error if
the request could not be honored.

7.24.4.11 The timed_mutex_destroy function

Synopsis

void timed_mutex_destroy(timed_mutex_t *mtx);

Description

The timed_mutex_destroy function releases any resources used by the mutex pointed to by
mtx. No threads can be blocked waiting for the mutex pointed to by mtx.

Returns

The timed_mutex_destroy function returns no
value.

7.24.4.12 The timed_mutex_init function

Synopsis

int timed_mutex_init(timed_mutex_t *mtx);

Description

The timed_mutex_init function creates a mutex object. If
the timed_mutex_init function succeeds, the mutex pointed
to by mtx is initialized to a valid mutex that is
distinct from all other mutexes in the program.

Returns

The timed_mutex_init function
returns thrd_success on success,
or thrd_error if the request could not be
honored.

7.24.4.13 The timed_mutex_lock function

Synopsis

int timed_mutex_lock(timed_mutex_t *mtx);

Description

The timed_mutex_lock function blocks until it locks the mutex pointed to by mtx. The mutex shall not be locked by the calling thread. Prior calls to timed_mutex_unlock
on the same mutex shall synchronize with this operation.

Returns

The timed_mutex_lock function
returns thrd_success on success,
or thrd_busy if the resource requested is already in use,
or thrd_error if the request could not be
honored.

7.24.4.14 The timed_mutex_try_lock function

Synopsis

int timed_mutex_try_lock(timed_mutex_t *mtx);

Description

The timed_mutex_try_lock function endeavors to lock the mutex pointed to by mtx. If the
mutex is already locked, the function returns without blocking. Prior calls to
timed_mutex_unlock on the same mutex shall synchronize with this operation.

Returns

The timed_mutex_try_lock function returns thrd_success on success, or thrd_busy if
the resource requested is already in use, or thrd_error if the request could not be
honored.

7.24.4.15 The timed_mutex_try_lock_until function

Synopsis

int timed_mutex_try_lock_until(timed_mutex_t *mtx, const xtime *xt);

Description

The timed_mutex_try_lock_until function endeavors to
block until it locks the mutex pointed to by mtx or
until the point in time specified by the xtime
object xt has passed. If the mutex is already locked,
the function returns without blocking. Prior calls to
timed_mutex_unlock on the same mutex shall
synchronize with this operation.

Returns

The timed_mutex_try_lock_until function
returns thrd_success on success,
or thrd_busy if the resource requested is already in
use, or thrd_timeout if the point in time specified
was reached without acquiring the requested resource,
or thrd_error if the request could not be
honored.

7.24.4.16 The timed_mutex_unlock function

Synopsis

int timed_mutex_unlock(timed_mutex_t *mtx);

Description

The timed_mutex_unlock function unlocks the mutex pointed to by mtx. The mutex pointed to
by mtx shall be locked by the calling thread.

Returns

The timed_mutex_unlock function returns thrd_success on success or thrd_error if
the request could not be honored.

7.24.4.17 The recursive_timed_mutex_destroy function

Synopsis

void recursive_timed_mutex_destroy(recursive_timed_mutex_t *mtx);

Description

The recursive_timed_mutex_destroy function releases any resources used by the mutex pointed to by
mtx. No threads can be blocked waiting for the mutex pointed to by mtx.

Returns

The recursive_timed_mutex_destroy function returns no
value.

7.24.4.18 The recursive_timed_mutex_init function

Synopsis

int recursive_timed_mutex_init(recursive_timed_mutex_t *mtx);

Description

The recursive_timed_mutex_init function creates a mutex object. If
the recursive_timed_mutex_init function succeeds, the mutex pointed
to by mtx is initialized to a valid mutex that is
distinct from all other mutexes in the program.

Returns

The recursive_timed_mutex_init function
returns thrd_success on success,
or thrd_error if the request could not be
honored.

7.24.4.19 The recursive_timed_mutex_lock function

Synopsis

int recursive_timed_mutex_lock(recursive_timed_mutex_t *mtx);

Description

The recursive_timed_mutex_lock function blocks until it
locks the mutex pointed to by mtx. If the mutex is
already locked by the calling thread then the call shall return
without blocking. Prior calls to recursive_timed_mutex_unlock
on the same mutex shall synchronize with this operation.

Returns

The recursive_timed_mutex_lock function
returns thrd_success on success,
or thrd_busy if the resource requested is already in use,
or thrd_error if the request could not be
honored.

7.24.4.20 The recursive_timed_mutex_try_lock function

Synopsis

int recursive_timed_mutex_try_lock(recursive_timed_mutex_t *mtx);

Description

The recursive_timed_mutex_try_lock function endeavors to
lock the mutex pointed to by mtx. If the mutex is already
locked, the function returns without blocking. Prior calls to
recursive_timed_mutex_unlock on the same mutex shall
synchronize with this operation.

Returns

The recursive_timed_mutex_try_lock function
returns thrd_success on success,
or thrd_busy if the resource requested is already in use,
or thrd_error if the request could not be
honored.

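7.24.4.21 The recursive_timed_mutex_try_lock_until function

Synopsis

int recursive_timed_mutex_try_lock_until(recursive_timed_mutex_t *mtx, const xtime *xt);

Description

The recursive_timed_mutex_try_lock_until function
endeavors to block until it locks the mutex pointed to
by mtx or until the point in time specified by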
the xtime object xt has passed. If the
mutex is already locked, the function returns without
blocking. Prior calls to
recursive_timed_mutex_unlock on the same mutex shall
synchronize with this operation.

Returns

The recursive_timed_mutex_try_lock_until function
returns thrd_success on success,
or thrd_busy if the resource requested is already in
use, or thrd_timeout if the point in time specified
was reached without acquiring the requested resource,
or thrd_error if the request could not be
honored.

7.24.4.22 The recursive_timed_mutex_unlock function

Synopsis

int recursive_timed_mutex_unlock(recursive_timed_mutex_t *mtx);

Description

The recursive_timed_mutex_unlock function unlocks the
mutex pointed to by mtx. The mutex pointed to
by mtx shall be locked by the calling thread.

Returns

The recursive_timed_mutex_unlock function returns thrd_success on success or thrd_error if
the request could not be honored.
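As an informal illustration, separate from the wording above, the plain mutex functions of 7.24.4.1 to 7.24.4.5 might be provided on a POSIX platform roughly as follows. This is a sketch under assumptions: it takes pthread_mutex_t as the underlying type, the numeric values of the thrd_* result codes are arbitrary, and it is not the implementation measured earlier in this paper.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Result codes named in the draft; the values chosen here are arbitrary. */
enum { thrd_success, thrd_busy, thrd_error };

/* Assumption: a plain mutex_t is a thin wrapper around pthread_mutex_t. */
typedef struct { pthread_mutex_t impl; } mutex_t;

static int mutex_init(mutex_t *mtx)
{
    return pthread_mutex_init(&mtx->impl, NULL) == 0 ? thrd_success : thrd_error;
}

static int mutex_lock(mutex_t *mtx)
{
    return pthread_mutex_lock(&mtx->impl) == 0 ? thrd_success : thrd_error;
}

static int mutex_try_lock(mutex_t *mtx)
{
    int rc = pthread_mutex_trylock(&mtx->impl);
    if (rc == 0)
        return thrd_success;
    return rc == EBUSY ? thrd_busy : thrd_error;   /* EBUSY: already locked */
}

static int mutex_unlock(mutex_t *mtx)
{
    return pthread_mutex_unlock(&mtx->impl) == 0 ? thrd_success : thrd_error;
}

static void mutex_destroy(mutex_t *mtx)
{
    int rc = pthread_mutex_destroy(&mtx->impl);
    assert(rc == 0);   /* the mutex must be unlocked with no waiters */
    (void)rc;
}
```

Because each function takes the object address as its first parameter and the type has a fixed set of operations, a C++ std::mutex could in principle forward its member functions directly to these calls with no flag checks on either side.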