The Concurrency TS introduced two atomic smart pointer classes,
atomic_shared_ptr and atomic_weak_ptr, as a superior
alternative to the atomic access API for shared_ptr in the C++
standard. This paper highlights several issues with that specification that
should be resolved before merging its contents into a future C++ standard.

The C++ standard provides an API to access and manipulate specific
shared_ptr objects atomically, i.e., without introducing data races
when the same object is manipulated from multiple threads without further
synchronization. This API is fragile and error-prone, as shared_ptr
objects manipulated through this API are indistinguishable from other
shared_ptr objects, yet subject to the restriction that they may be
manipulated/accessed only through this API. In particular, you cannot
dereference such a shared_ptr without first loading it into another
shared_ptr object, and then dereferencing through the second object.
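The restriction described above can be illustrated with a minimal sketch. It uses the free-function overloads of std::atomic_load and std::atomic_store for shared_ptr that <memory> has provided since C++11 (later standards deprecate them, so a compiler may warn). The names g_config, publish_config, read_config, and check_publish_and_read are hypothetical and chosen only for this illustration.

```cpp
#include <cassert>
#include <memory>

// A shared_ptr that, by convention only, is touched exclusively through
// the atomic access API. Nothing in its type records that convention.
std::shared_ptr<int> g_config = std::make_shared<int>(1);

// Publish a new value; may run concurrently with read_config().
void publish_config(int value) {
    std::atomic_store(&g_config, std::make_shared<int>(value));
}

// Load into a second shared_ptr first, then dereference that copy.
// Writing *g_config here would compile, but would race with a
// concurrent publish_config().
int read_config() {
    std::shared_ptr<int> local = std::atomic_load(&g_config);
    return *local;
}

// Single-threaded check of the round trip.
void check_publish_and_read() {
    publish_config(42);
    assert(read_config() == 42);
}
```

The type of g_config gives a reader no hint that direct dereference is off limits, which is the fragility this paper is concerned with.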

The Concurrency TS addresses this fragility with a class template to implicitly
wrap a shared_ptr and guarantee to access it only through the atomic
access API. It provides a similar wrapper for weak_ptr, although the
standard does not provide a corresponding atomic access API for
weak_ptrs.

The atomic wrapper classes are placed in the <atomic> header,
which is part of the free-standing library, while the smart pointer classes it
wraps, and the corresponding atomic access API, are located in the
<memory> header, which is not free-standing.

The atomic wrapper classes are designed to look like the named type aliases for
specific instantiations of the atomic template, but are distinct class
templates in their own right. This risks confusing users. In addition, the
specification is written in terms of the primary atomic template, which may
lead to subtle wording issues where the name of the template is involved. It
also means that the smart pointer wrapper classes are not compatible with other
functions in the <atomic> header that act on the atomic
template, such as (C compatible) free-function APIs that correspond to member
functions.
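As a small illustration of the free-function APIs mentioned above, the following sketch calls std::atomic_fetch_add and std::atomic_load on a std::atomic<int>. Both are function templates whose first parameter is a pointer to a specialization of the atomic template, so a distinct class template that merely looks like an alias would not match them. The function names bump and check_bump are hypothetical.

```cpp
#include <atomic>
#include <cassert>

// Increment and read back through the C-compatible free functions,
// which deduce T from a std::atomic<T>* argument.
int bump(std::atomic<int>& counter) {
    std::atomic_fetch_add(&counter, 1);
    return std::atomic_load(&counter);
}

void check_bump() {
    std::atomic<int> counter(41);
    assert(bump(counter) == 42);
}
```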

The whole feature of wrapping smart pointers in an atomic type relies on
contents of the <memory> header, including (as currently
specified) the atomic access API to be invoked. Rather than create a
dependency between the free-standing <atomic> header and the
hosted <memory> header, we should simply move the whole
smart-pointer specification into the <memory> header.

The smart pointer wrapper types in the Concurrency TS are deliberately named to
look like a named alias of the atomic template. However, as they are
not in fact such an alias, they are not usable in many APIs from the
<atomic> header that are truly specified in terms of the
atomic template. This is a recipe for confusion.

Rather than rename the atomic wrapper templates, this paper suggests revisiting
the reason we chose to make such class templates in the first place, rather
than more naturally providing partial specializations for the atomic
class template. In particular, it appears that the reason we deferred on the
original proposal (N4058) to specialize the atomic template was concern about
the constraint that T must be trivially copyable for atomic<T>. However,
this is fundamentally a constraint on what the library is committing to support
when provided an otherwise unknown type by a user, rather than a constraint on
what the library itself can support with more knowledge of a specific type. In
particular, it is not clear how to specify constraints for the primary template
to support non-trivial types, while it is clear how to specify known special
cases such as smart pointers. Forcing the library to use a different interface
will produce a substandard user experience, for no apparent gain.

This paper proposes restoring the original proposal's idea for
atomic<smart-pointer> and supplying named alias
templates, as per the atomic integral classes. We note in passing that there
is no similar alias-template for:

template <typename T>
using atomic_pointer = atomic<T*>;

This would probably be a sought-for addition once the atomic wrappers have
their specific names. However, it is not proposed in the initial draft of this
paper.

The atomic access API in clause 23 is fragile and easy to misuse. It overloads
the atomic API for atomic objects in the <atomic> header with
identical names and signatures for manipulating non-atomic objects. There is
no other precedent for this (existing) API. It should be deprecated in favor
of the new atomic smart pointer support. However, the atomic types in the
Concurrency TS are specified in terms of that same API.

However, looking at the specification for the API in C++17, we see that it
should be simple to rewrite the specification for the atomic pointer without
reference to this API.

First, consumers of this API can access the smart pointer object only through
this API. This would potentially be giving up an interoperability feature, but
there is no public access to the wrapped smart pointer, so no other code can
interact (directly) with our wrapped objects using this API. No generality is
given up.

Secondly, each API documents a requirement that it is not called with null
pointers, and this is not a useful precondition when replaced with a
member-function based API, which can guarantee such preconditions are always
satisfied without documenting them.

The important reason these functions are documented separately to those in
clause 32 is to highlight that smart pointer objects have the same value when
both the held pointer values are the same, and the shared control blocks are
the same. This is not the same as the simple equality operator on these types,
which compares only the held pointer value, and in fact, weak_ptr does
not even offer an operator== to test. Similarly, the exchange
semantics are specified as-if calling the member swap function.
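The difference between operator== and the equivalence used by the atomic operations can be sketched as follows. The helpers share_ownership and equivalent are hypothetical names for this illustration, not part of any proposed wording; they are built on shared_ptr::owner_before, which orders by ownership rather than by stored pointer.

```cpp
#include <cassert>
#include <memory>

// Hypothetical helper: true when neither pointer orders before the other
// by ownership, i.e. they share ownership or are both empty.
template <typename T>
bool share_ownership(const std::shared_ptr<T>& a, const std::shared_ptr<T>& b) {
    return !a.owner_before(b) && !b.owner_before(a);
}

// Hypothetical helper: same stored pointer and same ownership.
template <typename T>
bool equivalent(const std::shared_ptr<T>& a, const std::shared_ptr<T>& b) {
    return a.get() == b.get() && share_ownership(a, b);
}

void check_equivalence() {
    std::shared_ptr<int> a = std::make_shared<int>(1);
    std::shared_ptr<int> b = a;  // plain copy: same pointer, same owner
    // Aliasing constructor: stores a's pointer but owns a different object.
    std::shared_ptr<int> c(std::make_shared<int>(2), a.get());

    assert(a == b && equivalent(a, b));
    assert(a == c && !equivalent(a, c));  // operator== cannot tell them apart
}
```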

Finally, there is no such atomic access API for weak_ptr at all, and
it is simpler to directly specify the operations of the wrapper template than
to provide a second fragile API to use by reference.

There are a small number of minor nits in the specification that should also be
cleaned up.

The Concurrency TS seems to assume that it is sufficient to state that a smart
pointer is empty, assuming that means it is also null. However, through the
aliasing constructor, shared_ptr objects can still hold valid pointers
while empty, i.e., when not owning an object.
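A short sketch of the empty-but-not-null case, using the aliasing constructor with an empty owner. The names g_value, make_empty_non_null, and check_empty_non_null are hypothetical.

```cpp
#include <cassert>
#include <memory>

int g_value = 7;  // hypothetical object with static storage duration

// Build a shared_ptr that owns nothing but still stores &g_value.
std::shared_ptr<int> make_empty_non_null() {
    return std::shared_ptr<int>(std::shared_ptr<int>(), &g_value);
}

void check_empty_non_null() {
    std::shared_ptr<int> p = make_empty_non_null();
    assert(p.use_count() == 0);   // empty: no ownership
    assert(p.get() == &g_value);  // yet the stored pointer is not null
    assert(*p == 7);              // and it can be dereferenced
}
```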

The Concurrency TS also assumes there is a "valid-but-unspecified"
moved-from state for shared and weak pointers, but the specification for these
types is precise in all cases, and objects are left empty after a successful
move operation. It further gives the guarantee that lvalue references are not
accessed again after the atomic step, but for no obvious reason does not give
the same guarantee in the case of rvalue references. Given the state must be
reset to empty (by shared/weak-pointer semantics) as part of the atomic update,
there is no reason to not give the same guarantee for all reference types.
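The precise moved-from state of the non-atomic smart pointers can be checked directly. The sketch below is only an illustration of that existing guarantee; the function name moved_from_is_empty is hypothetical.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Returns true when moving out of a shared_ptr and a weak_ptr leaves
// both sources empty, while the destinations take over their state.
bool moved_from_is_empty() {
    std::shared_ptr<int> owner = std::make_shared<int>(5);
    std::weak_ptr<int> watcher = owner;

    std::shared_ptr<int> new_owner = std::move(owner);
    std::weak_ptr<int> new_watcher = std::move(watcher);

    return owner.get() == nullptr && owner.use_count() == 0
        && watcher.expired() && watcher.use_count() == 0
        && new_owner.use_count() == 1 && !new_watcher.expired();
}

void check_moved_from() {
    assert(moved_from_is_empty());
}
```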

The Concurrency TS specification is missing the value_type type alias,
and the static constant member is_always_lock_free.

The Concurrency TS types do not return a value from operator=, unlike
the specification for atomic that they are defined in terms of. It
would be very easy to non-atomically return the supplied argument after atomically
storing the value, matching the primary atomic template, but the author
prefers to keep the specification as close to the TS as possible, despite his own
preference for preserving the primary template interfaces as much as possible.

The primary atomic template has a volatile-qualified overload
for every member function. This allows for the idiom that objects that may be
modified outside the current thread are marked as volatile to raise
their visibility. The original proposal, as adopted by the TS and repeated in
this paper, is to not support that for atomic smart pointers. Without casting
prejudice on the idiom, non-atomic smart pointers do not have the overloaded
constructors taking volatile references, so the idiom is not supported
by the underlying types. This is one way that fundamental types typically
differ from user-defined types, and the clause 32 specializations of
atomic are dealing only with fundamental types that transparently
handle the volatile qualifier.
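The asymmetry described above can be sketched with type traits. The static assertions below reflect the author's reading of the shared_ptr constructor set and are offered as an illustration rather than as normative statements; the function names load_through_volatile and check_volatile_load are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <memory>
#include <type_traits>

// A fundamental type can be copied from a volatile object.
static_assert(std::is_constructible<int, volatile int&>::value,
              "int can be read through a volatile reference");

// shared_ptr has no constructor that accepts a volatile reference.
static_assert(!std::is_constructible<std::shared_ptr<int>,
                                     volatile std::shared_ptr<int>&>::value,
              "shared_ptr cannot be copied from a volatile object");

// The primary atomic template offers volatile-qualified members.
int load_through_volatile(volatile std::atomic<int>& counter) {
    return counter.load();
}

void check_volatile_load() {
    std::atomic<int> counter(3);
    assert(load_through_volatile(counter) == 3);
}
```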

The TS specification splits compare_exchange methods taking a non-atomic
argument by value into two overloads for lvalues and rvalues, allowing for
the most efficient argument passing for smart pointers. It does not do the
same for other members that would similarly benefit, such as the by-value
constructor and the assignment operator. This paper conservatively does not
propose such a split either, while observing the original specification is
in terms of a free-function API that is similarly specified to take
shared_ptr objects by value, so there would be no benefit to making
that split in the original specification. While the split seems beneficial
once freed from that specification, the author has no implementation
experience to confirm that it would be implementable or valuable.

As noted above, the partial specializations for atomic smart pointers could
better respect the primary template interface, such as by (non-atomically)
returning a value from the assignment operator.

Make the following changes to the specified working paper. Note that as the
proposed wording is not yet present in the C++ Working Paper, and there are no
outstanding issues filed against these clauses of the Concurrency TS, we do not
feel bound by the existing stable names for the clauses, and so propose new
stable names appropriate for landing in the current C++ Working Paper.

Update the synopsis for the <memory> header as follows. Note
that the addition of reinterpret_pointer_cast is an editorial
drive-by; the function is already part of C++17.

23.10.2 Header <memory> synopsis [memory.syn]

The header <memory> defines several types and function templates
that describe properties of pointers and pointer-like types, manage memory for
containers and other template types, destroy objects, and construct multiple
objects in uninitialized memory buffers (23.10.3–23.10.10). The header also
defines the templates unique_ptr, shared_ptr, weak_ptr,
and various function templates that operate on objects of these types (23.11).

The library shall provide partial specializations of the atomic template for
shared-ownership smart pointers. The behavior of all operations is as specified
in §32.6 [atomics.types.generic], unless specified otherwise. The template
parameter T of these partial specializations may be an incomplete type.

All changes to an atomic smart pointer in this subclause, and all associated
use_count increments, are guaranteed to be performed atomically.
Associated use_count decrements shall be sequenced after the atomic
operation, but are not required to be part of it. Any associated deletion and
deallocation are sequenced after the atomic update step and shall not be part
of the atomic operation. [ Note: If the atomic operation uses locks,
locks acquired by the implementation shall be held when any use_count
adjustments are performed, and shall not be held when any destruction or
deallocation resulting from this is performed. — end note ]
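One way a lock-based implementation could satisfy the note above is sketched here. This is a minimal illustration under the assumption of a single mutex per object; locked_shared_ptr and check_locked_store are hypothetical names and do not represent the proposed interface.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <utility>

// Hypothetical lock-based stand-in for an atomic shared_ptr.
template <typename T>
class locked_shared_ptr {
public:
    explicit locked_shared_ptr(std::shared_ptr<T> initial = {})
        : value_(std::move(initial)) {}

    std::shared_ptr<T> load() const {
        std::lock_guard<std::mutex> guard(mutex_);
        return value_;  // use_count increment happens while the lock is held
    }

    void store(std::shared_ptr<T> desired) {
        {
            std::lock_guard<std::mutex> guard(mutex_);
            value_.swap(desired);  // the update step; no counts change here
        }
        // 'desired' now holds the old value. Its use_count decrement, and
        // any resulting deletion, happen when it is destroyed, after the
        // lock has been released.
    }

private:
    mutable std::mutex mutex_;
    std::shared_ptr<T> value_;
};

void check_locked_store() {
    locked_shared_ptr<int> box(std::make_shared<int>(1));
    std::weak_ptr<int> old_value = box.load();
    box.store(std::make_shared<int>(2));
    assert(old_value.expired());  // old object destroyed by the store
    assert(*box.load() == 2);
}
```

Swapping under the lock and letting the by-value parameter release the old value afterwards keeps user-defined deleters from running while the lock is held.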

For the purpose of compare_exchange operations, two weak_ptr objects are
equivalent if they are both empty, or if they share ownership and store the same
pointer value.

Effects: Initializes the object with the value desired.
Initialization is not an atomic operation (4.7). [ Note: It is possible
to have an access to an atomic object A race with its construction,
for example by communicating the address of the just-constructed object
A to another thread via memory_order_relaxed operations on a
suitable atomic pointer variable, and then immediately accessing A in
the receiving thread. This results in undefined behavior. — end note ]

Effects: If p is equivalent to expected, assigns
desired to p and has synchronization semantics corresponding
to the value of success, otherwise assigns p to
expected and has synchronization semantics corresponding to the value
of failure.

Returns: true if p was equivalent to expected,
false otherwise.

Remarks: Two shared_ptr objects are equivalent if they store
the same pointer value and either share ownership, or both are empty. The weak
form may fail spuriously. See 32.6.1.

If the operation returns true, expected is not accessed after
the atomic update and the operation is an atomic read-modify-write operation
(4.7) on the memory pointed to by this. Otherwise, the operation is an atomic
load operation on that memory, and expected is updated with the
existing value read from the atomic object in the attempted atomic
update. The use_count update corresponding to the write to
expected is part of the atomic operation. The write to
expected itself is not required to be part of the atomic operation.

If the desired value is passed by rvalue reference and the operation
returns true, then desired is empty after the atomic update
step.
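The compare-exchange effects above can be sketched with a mutex standing in for the atomic step. This is an illustration only, assuming an external lock and a plain shared_ptr as the stored value; compare_exchange_sketch and check_compare_exchange are hypothetical names, and no memory-order arguments are modeled.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <utility>

template <typename T>
bool compare_exchange_sketch(std::mutex& lock, std::shared_ptr<T>& p,
                             std::shared_ptr<T>& expected,
                             std::shared_ptr<T> desired) {
    // Declared before the guard so it is destroyed after the lock is released.
    std::shared_ptr<T> retired;
    std::lock_guard<std::mutex> guard(lock);

    // Equivalent: same stored pointer, and shared ownership or both empty.
    const bool share_ownership =
        !p.owner_before(expected) && !expected.owner_before(p);
    if (p.get() == expected.get() && share_ownership) {
        p.swap(desired);               // p takes the desired value
        retired = std::move(desired);  // old value of p, released after unlock
        return true;
    }
    retired = p;             // use_count increment while the lock is held
    retired.swap(expected);  // expected receives the value read from p
    return false;
}

void check_compare_exchange() {
    std::mutex lock;
    auto first = std::make_shared<int>(1);
    auto second = std::make_shared<int>(2);
    std::shared_ptr<int> p = first;
    std::shared_ptr<int> expected = second;  // deliberately stale
    assert(!compare_exchange_sketch(lock, p, expected, second));
    assert(expected == first);               // refreshed on failure
    assert(compare_exchange_sketch(lock, p, expected, second));
    assert(p == second);
}
```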

Effects: Equivalent to: return compare_exchange_weak(expected, desired, order, fail_order);
where fail_order is the same as order except that a value of
memory_order_acq_rel shall be replaced by the value
memory_order_acquire and a value of memory_order_release
shall be replaced by the value memory_order_relaxed.

Effects: Equivalent to: return compare_exchange_weak(expected, std::move(desired), order, fail_order);
where fail_order is the same as order except that a value of
memory_order_acq_rel shall be replaced by the value
memory_order_acquire and a value of memory_order_release
shall be replaced by the value memory_order_relaxed.

Effects: Equivalent to: return compare_exchange_strong(expected, desired, order, fail_order);
where fail_order is the same as order except that a value of
memory_order_acq_rel shall be replaced by the value
memory_order_acquire and a value of memory_order_release
shall be replaced by the value memory_order_relaxed.

Effects: Equivalent to: return compare_exchange_strong(expected, std::move(desired), order, fail_order);
where fail_order is the same as order except that a value of
memory_order_acq_rel shall be replaced by the value
memory_order_acquire and a value of memory_order_release
shall be replaced by the value memory_order_relaxed.
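The derivation of fail_order used by the single-order overloads above can be written as a small function. The names fail_order_for, replace_if_zero, and check_fail_order are hypothetical; std::atomic<int> is used only so that the sketch compiles without atomic smart pointer support.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical helper: derive the failure ordering from a single 'order'
// argument: acq_rel becomes acquire, release becomes relaxed, and every
// other value is passed through unchanged.
constexpr std::memory_order fail_order_for(std::memory_order order) {
    return order == std::memory_order_acq_rel ? std::memory_order_acquire
         : order == std::memory_order_release ? std::memory_order_relaxed
         : order;
}

static_assert(fail_order_for(std::memory_order_acq_rel) ==
                  std::memory_order_acquire, "acq_rel maps to acquire");
static_assert(fail_order_for(std::memory_order_release) ==
                  std::memory_order_relaxed, "release maps to relaxed");
static_assert(fail_order_for(std::memory_order_seq_cst) ==
                  std::memory_order_seq_cst, "seq_cst is unchanged");

// A single-order call forwarding to the two-order form.
bool replace_if_zero(std::atomic<int>& value, int desired,
                     std::memory_order order) {
    int expected = 0;
    return value.compare_exchange_strong(expected, desired, order,
                                         fail_order_for(order));
}

void check_fail_order() {
    std::atomic<int> value(0);
    assert(replace_if_zero(value, 5, std::memory_order_release));
    assert(value.load() == 5);
}
```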

Effects: Initializes the object with the value desired.
Initialization is not an atomic operation (4.7). [ Note: It is possible
to have an access to an atomic object A race with its construction,
for example by communicating the address of the just-constructed object
A to another thread via memory_order_relaxed operations on a
suitable atomic pointer variable, and then immediately accessing A in
the receiving thread. This results in undefined behavior. — end note ]

Effects: If p is equivalent to expected, assigns
desired to p and has synchronization semantics corresponding
to the value of success, otherwise assigns p to
expected and has synchronization semantics corresponding to the value
of failure.

Returns: true if p was equivalent to expected,
false otherwise.

Remarks: Two weak_ptr objects are equivalent if they store
the same pointer value and either share ownership, or both are empty. The weak
form may fail spuriously. See 32.6.1.

If the operation returns true, expected is not accessed after
the atomic update and the operation is an atomic read-modify-write operation
(4.7) on the memory pointed to by this. Otherwise, the operation is an atomic
load operation on that memory, and expected is updated with the
existing value read from the atomic object in the attempted atomic
update. The use_count update corresponding to the write to
expected is part of the atomic operation. The write to
expected itself is not required to be part of the atomic operation.

If the desired value is passed by rvalue reference and the operation
returns true, then desired is empty after the atomic update
step.

Effects: Equivalent to: return compare_exchange_weak(expected, desired, order, fail_order);
where fail_order is the same as order except that a value of
memory_order_acq_rel shall be replaced by the value
memory_order_acquire and a value of memory_order_release
shall be replaced by the value memory_order_relaxed.

Effects: Equivalent to: return compare_exchange_weak(expected, std::move(desired), order, fail_order);
where fail_order is the same as order except that a value of
memory_order_acq_rel shall be replaced by the value
memory_order_acquire and a value of memory_order_release
shall be replaced by the value memory_order_relaxed.

Effects: Equivalent to: return compare_exchange_strong(expected, desired, order, fail_order);
where fail_order is the same as order except that a value of
memory_order_acq_rel shall be replaced by the value
memory_order_acquire and a value of memory_order_release
shall be replaced by the value memory_order_relaxed.

Effects: Equivalent to: return compare_exchange_strong(expected, std::move(desired), order, fail_order);
where fail_order is the same as order except that a value of
memory_order_acq_rel shall be replaced by the value
memory_order_acquire and a value of memory_order_release
shall be replaced by the value memory_order_relaxed.

Move the old atomic support for shared pointers into Annex D:

23.11.2.6 shared_ptr atomic access [util.smartptr.shared.atomic]

Concurrent access to a shared_ptr object from multiple threads does
not introduce a data race if the access is done exclusively via the functions
in this section and the instance is passed as their first argument.

The meaning of the arguments of type memory_order is explained in 32.4.

Requires: p shall not be null and v shall not be null. The failure argument
shall not be memory_order_release nor memory_order_acq_rel.

Effects: If *p is equivalent to *v, assigns w to *p and has synchronization
semantics corresponding to the value of success, otherwise assigns *p to *v and
has synchronization semantics corresponding to the value of failure.

Returns: true if *p was equivalent to *v, false otherwise.

Throws: Nothing.

Remarks: Two shared_ptr objects are equivalent if they store the same pointer
value and share ownership. The weak form may fail spuriously. See 32.6.1.

Concurrent access to a shared_ptr object from multiple threads does
not introduce a data race if the access is done exclusively via the functions
in this section and the instance is passed as their first argument.

The meaning of the arguments of type memory_order is explained in 32.4.

Requires: p shall not be null and v shall not be null. The failure argument
shall not be memory_order_release nor memory_order_acq_rel.

Effects: If *p is equivalent to *v, assigns w to *p and has synchronization
semantics corresponding to the value of success, otherwise assigns *p to *v and
has synchronization semantics corresponding to the value of failure.

Returns: true if *p was equivalent to *v, false otherwise.

Throws: Nothing.

Remarks: Two shared_ptr objects are equivalent if they store the same pointer
value and share ownership. The weak form may fail spuriously. See 32.6.1.

Thanks to Herb Sutter, not only for the original proposal that was adopted for
the TS, but especially for being available on short notice to advise on the
history and rationale of the feature for this paper. Thanks to Stephan T.
Lavavej for a detailed review of an early draft of this paper that improved it
in many ways.