Improving shared_ptr for C++0x

While shared_ptr has already proven its utility time and again, we
have observed frequent requests for enhancements in several legitimate and key areas:

Ability to control its internal allocations;

Aliasing support;

A factory function that eliminates the need for an explicit new expression;

Reduced reference counting overhead when using shared_ptr in "mostly
unique" ownership situations such as containers;

A variant of shared_ptr that is atomic, that is, safe to be
manipulated from multiple threads without synchronization;

A way to reclaim cyclic shared_ptr structures;

A variant of shared_ptr that is confined to a single thread and uses
unsynchronized reference count updates.

This document proposes additions to the C++0x standard to address the first six.
Some of the proposed additions are essentially a subset of those presented in N1851
by Kliatchko and Rocha. We refer the interested reader to that paper for a more
extensive rationale.

This proposal makes use of variadic templates and rvalue references. These C++0x
features markedly improve the usability and effectiveness of several of the proposed
additions.

The default behavior of shared_ptr is to allocate its control block
using new. This precludes its use in contexts where uncontrolled dynamic
allocations are not allowed, such as embedded environments or the implementation
of ::operator new. The proposed addition allows the user to supply
an allocator that shared_ptr will use.
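As a minimal sketch of how the proposed three-argument constructor might be used, the following passes a user-supplied allocator next to the deleter. The names counting_allocator and count_internal_allocations are hypothetical and exist only for this illustration; the sketch assumes a C++11 library that already implements the constructor.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>

// Hypothetical allocator (not part of the proposal): it forwards to
// ::operator new and counts how many allocations go through it.
template<class T> struct counting_allocator
{
    typedef T value_type;

    int * allocations; // counter owned by the caller

    explicit counting_allocator( int * c ): allocations( c ) {}

    // Converting constructor, needed so the library can rebind the allocator
    template<class U> counting_allocator( counting_allocator<U> const & r ):
        allocations( r.allocations ) {}

    T * allocate( std::size_t n )
    {
        ++*allocations;
        return static_cast<T*>( ::operator new( n * sizeof( T ) ) );
    }

    void deallocate( T * p, std::size_t )
    {
        ::operator delete( p );
    }
};

template<class T, class U>
bool operator==( counting_allocator<T> const & a, counting_allocator<U> const & b )
{
    return a.allocations == b.allocations;
}

template<class T, class U>
bool operator!=( counting_allocator<T> const & a, counting_allocator<U> const & b )
{
    return !( a == b );
}

// Returns how many internal allocations shared_ptr made through the allocator
inline int count_internal_allocations()
{
    int n = 0;
    {
        // The object itself is still created with new; only shared_ptr's
        // internal bookkeeping is allocated via the supplied allocator.
        std::shared_ptr<int> p( new int( 42 ), std::default_delete<int>(),
                                counting_allocator<int>( &n ) );
        assert( *p == 42 );
    }
    return n;
}
```

The exact number of internal allocations is up to the implementation; the sketch only shows that they are routed through the supplied allocator rather than through a plain new.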

Impact:

This feature affects the interface of shared_ptr, allowing its broader
use, and is therefore strongly recommended to be added to the C++0x standard.

Proposed text:

Add to shared_ptr [util.smartptr.shared] the following constructor:

template<class Y, class D, class A> shared_ptr( Y * p, D d, A a );

and the following member function:

template<class Y, class D, class A> void reset( Y * p, D d, A a );

Change the section:

template<class Y, class D> shared_ptr( Y * p, D d );

Requires: p shall be convertible to T*. D
shall be CopyConstructible. The copy constructor and destructor of
D shall not throw exceptions. The expression d(p) shall
be well-formed, shall have well defined behavior, and shall not throw exceptions.

Effects: Constructs a shared_ptr object that owns
the pointer p and the deleter d.

to:

template<class Y, class D> shared_ptr( Y * p, D d );

template<class Y, class D, class A> shared_ptr( Y * p, D d, A a );

Requires: p shall be convertible to T*. D
shall be CopyConstructible. The copy constructor and destructor of
D shall not throw. The expression d(p) shall be well-formed,
shall have well defined behavior, and shall not throw. A shall be an
allocator [allocator.requirements]. The copy constructor and destructor
of A shall not throw.

Effects: Constructs a shared_ptr object that owns
the pointer p and the deleter d. The second constructor
shall use a copy of a to allocate memory for internal use.

Add the following to [util.smartptr.shared.mod]:

template<class Y, class D, class A> void reset( Y * p, D d, A a );

Effects: Equivalent to shared_ptr( p, d, a ).swap( *this ).

Implementability:

This feature has been added to boost::shared_ptr and will be part of
Boost 1.35. See:

Advanced users often require the ability to create a shared_ptr instance
p that shares ownership with another (master) shared_ptr q but points to an object that is not a base of *q.
*p may be a member or an element of *q, for example. This
section proposes an additional constructor that can be used for this purpose.

An interesting side effect of this increase of expressive power is that now the
*_pointer_cast functions can be implemented in user code. The make_shared
factory function presented later in this document can also be implemented using
only the public interface of shared_ptr via the aliasing constructor.
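The member case described above can be sketched as follows. Person and name_of are hypothetical names used only for illustration; the constructor call itself follows the signature proposed in this section.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical aggregate used only for this illustration
struct Person
{
    std::string name;
    int age;
};

// Returns a shared_ptr that stores a pointer to q->name but shares
// ownership with q, so the Person stays alive while the result exists.
inline std::shared_ptr<std::string> name_of( std::shared_ptr<Person> const & q )
{
    return std::shared_ptr<std::string>( q, &q->name );
}
```

The returned pointer keeps the whole Person alive, so the member remains valid even after every shared_ptr<Person> to it has been reset.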

Impact:

This feature affects the interface of shared_ptr in a way that increases
its expressive power and is therefore strongly recommended to be added to the C++0x
standard.

Proposed text:

Add to shared_ptr [util.smartptr.shared] the following constructor:

template<class Y> shared_ptr( shared_ptr<Y> const & r, T * p );

Add the following to [util.smartptr.shared.const]:

template<class Y> shared_ptr( shared_ptr<Y> const & r, T * p );

Effects: Constructs a shared_ptr instance that stores p
and shares ownership with r.

Postconditions: get() == p && use_count() == r.use_count().

Throws: nothing.

[Note: To avoid the possibility of a dangling pointer, the user of this
constructor must ensure that p remains valid at least until the ownership
group of r is destroyed. --end note.]

[Note: This constructor allows creation of an empty shared_ptr
instance with a non-NULL stored pointer. --end note.]

Implementability:

This feature has been added to boost::shared_ptr and will be part of
Boost 1.35. See:

Consistent use of shared_ptr can eliminate the need to use an explicit
delete, but it currently provides no support for avoiding an explicit
new. There have been repeated requests from users for a factory function
that creates an object of a given type and returns a shared_ptr to
it. Besides convenience and style, such a function is also exception safe and considerably
faster because it can use a single allocation for the object and its corresponding
control block, eliminating a significant portion of shared_ptr's construction
overhead. This function eliminates one of the major efficiency complaints about
shared_ptr.

This section proposes a family of overloaded function templates, make_shared<T>
and allocate_shared<T>, to address this need. make_shared
uses the global operator new to allocate memory, whereas allocate_shared
uses a user-supplied allocator, allowing finer control consistent with
section II of this document.
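A minimal usage sketch of the two factory functions, assuming a library that provides them with variadic forwarding as in C++11. Widget, make_widget and allocate_widget are hypothetical names introduced for illustration, and std::allocator stands in for any user-supplied allocator.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical type used only for this illustration
struct Widget
{
    std::string name;
    int size;

    Widget( std::string n, int s ): name( std::move( n ) ), size( s ) {}
};

// No explicit new: the arguments are forwarded to Widget's constructor
inline std::shared_ptr<Widget> make_widget()
{
    return std::make_shared<Widget>( "gear", 3 );
}

// Same, but the memory is obtained from a copy of the supplied allocator
inline std::shared_ptr<Widget> allocate_widget()
{
    return std::allocate_shared<Widget>( std::allocator<Widget>(), "cog", 5 );
}
```

In both cases the returned shared_ptr is the sole owner of the new object, so get() is non-null and use_count() is 1.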

Impact:

This feature does not affect the interface of shared_ptr. It is possible
to implement in a non-intrusive way using only the public interface, as long
as aliasing support is present. Access to implementation details can
eliminate between 5 and 8 bytes of storage overhead on a typical 32-bit platform.

The addition is a strong candidate for the C++0x standard, but can be relegated
to a technical report.

Proposed text:

In the following, the ... notation denotes a variable number of arguments. Once
variadic templates are voted into the WP, the intent is to replace this notation
with the appropriate variadic template syntax, for consistency with
N2192.

Requires: The expression new( pv ) T() or new( pv ) T( std::forward<A1>(a1),
..., std::forward<An>(an) ), where pv is a void*
pointing to storage suitable to hold an object of type T, shall be
well-formed and shall have well defined behavior. A shall be an allocator
[allocator.requirements]. The copy constructor and destructor of A
shall not throw.

Effects: Allocates memory suitable for an object of type T
and constructs an object in it via the placement new expression new( pv ) T()
or new( pv ) T( std::forward<A1>(a1), ..., std::forward<An>(an) ).
allocate_shared uses a copy of a to allocate memory.

Returns: A shared_ptr instance that stores and owns the address
of the newly constructed object of type T.

Postconditions: get() != 0 && use_count() == 1.

Throws: bad_alloc, or an implementation-defined exception
when a resource other than memory could not be obtained.

Exception safety: If an exception is thrown, has no effect.

[Note: These functions will typically allocate more memory than sizeof(T)
to allow for internal bookkeeping structures such as the reference counts. --end
note]

Implementability:

This implementation uses variadic templates and rvalue references. When these features
are not available, it falls back on a family of overloaded function templates taking
arguments by const reference.

Users often express concerns over the cost of copying a shared_ptr
in situations where the source of the copy is no longer needed. To address this
use case,
N1851 proposes a separate smart pointer, managed_ptr, that
is convertible from and to shared_ptr and enforces unique ownership.

The current proposal does not take this approach. Instead, we propose that move
constructors and move assignment operators be added to shared_ptr.
This allows a shared_ptr to be as efficient as an auto_ptr
or the proposed unique_ptr when the source of the copy or assignment
is a temporary or no longer needed. Move-aware standard containers will automatically
take advantage of this optimization. As an example of the consequences, reallocating
a vector< shared_ptr<T> > will no longer entail any reference
count updates.
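The effect of the proposed move operations can be sketched as follows, assuming a library with move-aware shared_ptr and containers. The function names are hypothetical. The sketch only observes the resulting use counts; it does not measure the reference count traffic itself.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Moving transfers ownership: the source becomes empty and the
// use count of the target stays at 1.
inline long use_count_after_move()
{
    std::shared_ptr<int> a( new int( 7 ) );
    std::shared_ptr<int> b( std::move( a ) );

    assert( !a );      // the moved-from pointer is empty
    assert( *b == 7 );

    return b.use_count();
}

// Growing a vector relocates its elements; with move support the
// elements are moved rather than copied.
inline long use_count_after_vector_growth()
{
    std::vector< std::shared_ptr<int> > v;

    for( int i = 0; i < 100; ++i )
    {
        v.push_back( std::make_shared<int>( i ) ); // forces several reallocations
    }

    return v[ 0 ].use_count();
}
```

Both functions return 1: ownership is transferred from the source to the target instead of being shared and then released.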

Impact:

This feature affects the interface of shared_ptr in a way that reduces
its copy overhead and is in line with the rvalue recommendations for the standard
library presented in
N1859-N1862.
We believe that it is a strong candidate for addition to the C++0x standard.

Implementability:

The level of thread safety offered by shared_ptr is the default for
the standard library: const operations on the same instance can be performed concurrently
by multiple threads, but updates require exclusive access. Some scenarios require
stronger guarantees; there is considerable demand (on the Boost mailing lists and
otherwise) for an atomic shared_ptr, one that can withstand
concurrent updates from two threads without synchronization.

Instead of providing a separate atomic smart pointer, we propose additions to
shared_ptr's interface to address this use case. A typical example scenario
— a "lock-free" reader/writer pattern — that takes advantage of this
functionality is outlined below:
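A minimal sketch of such a reader/writer pair for the single writer case. It assumes the free functions std::atomic_load and std::atomic_store that C++11 provides for shared_ptr in <memory> (deprecated in C++20 in favor of std::atomic<std::shared_ptr<T>>); the text above does not name them, so their use here is an assumption. State, reader and writer are hypothetical names.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical shared state used only for this illustration
struct State
{
    int value;

    explicit State( int v ): value( v ) {}
};

// Reader: takes an atomic snapshot of the shared pointer. The snapshot keeps
// the State alive even if a writer replaces *ps concurrently.
inline int reader( std::shared_ptr<State> const * ps )
{
    std::shared_ptr<State const> p = std::atomic_load( ps );
    return p ? p->value : -1;
}

// Writer (single writer case): builds the new State outside of any
// synchronization, then publishes it with one atomic store.
inline void writer( std::shared_ptr<State> * ps, int v )
{
    std::shared_ptr<State> p( new State( v ) );
    std::atomic_store( ps, std::move( p ) );
}
```

The sketch relies entirely on the library for synchronization and contains no spinlock of its own; readers never observe a partially updated State because each State object is immutable once published.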

Note that the code carefully avoids destroying a non-empty shared_ptr
instance while holding the spinlock, as this may lead to user code being called
and present an opportunity for a deadlock. It also attempts to limit the spinlock
scope to the minimum necessary to prevent contention.

The inability of shared_ptr to handle cyclic structures has long been
cited as one of its important downsides. The proposed addition lets users
opt in to the reclamation of cyclic structures by implementing a support
interface for their nodes. It imposes no overhead on existing non-cyclic
uses that do not implement the interface.

Impact:

This feature does not affect the interface of shared_ptr, but requires
support from the implementation in a similar way to enable_shared_from_this.
We believe that it is therefore a candidate for addition to the C++0x standard.

Proposed text:

Add to the synopsis of <memory> the following:

class enable_sp_collect;
void sp_collect();

Add the following two sections after [util.smartptr.enab]:

Class enable_sp_collect

A class that derives from enable_sp_collect via public and unambiguous
inheritance is called collectible. The implementation tracks all collectible
objects owned by shared_ptr instances and can reclaim cycles of unreachable
collectible objects.

Effects: Appends all shared_ptr instances to collectible
objects that *this owns to v.

virtual void sp_reset() = 0;

Effects: Resets or destroys all shared_ptr instances to collectible
objects that *this owns.

Throws: nothing.

[Note: This function may be called while the collector is holding a lock.
Performing anything else besides resetting the shared_ptr instances
is strongly discouraged as it may lead to deadlocks. --end note]

sp_collect

namespace std {
void sp_collect();
}

Effects: For every collectible object owned by a shared_ptr,
calls sp_enumerate. Performs reachability analysis on the returned
data and identifies the unreachable collectible objects. Retains temporary shared_ptr
references to all unreachable objects. For every unreachable object, calls sp_reset.
Drops the temporary references, causing the destruction of the unreachable objects.

[Note: This function is allowed to be called in a separate thread. --end
note]

[Note: The reachability analysis may be performed using Christopher's algorithm.
Using the instance graph, compute the use counts the objects would have if there
were no external shared_ptr instances. Compare the computed use counts
with the actual use counts. For every mismatch, mark the object as reachable from
the outside. Propagate reachability throughout the graph. --end note]
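The reachability analysis outlined in the note can be sketched on a plain snapshot of the instance graph. Node and find_unreachable are hypothetical names; a real collector would build the snapshot from the sp_enumerate results, which is not shown here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical snapshot of one collectible object
struct Node
{
    long use_count;               // actual use count of the object
    std::vector<std::size_t> out; // indices of the objects it owns via shared_ptr
};

// Returns a flag per node: true if the node is unreachable from the outside
inline std::vector<bool> find_unreachable( std::vector<Node> const & g )
{
    std::size_t const n = g.size();

    // Step 1: use counts the objects would have if there were no external
    // shared_ptr instances, that is, counting only edges within the graph.
    std::vector<long> internal( n, 0 );

    for( std::size_t i = 0; i < n; ++i )
    {
        for( std::size_t k = 0; k < g[ i ].out.size(); ++k )
        {
            ++internal[ g[ i ].out[ k ] ];
        }
    }

    // Step 2: a mismatch means an external reference exists, so the object
    // is reachable from the outside.
    std::vector<bool> reachable( n, false );
    std::vector<std::size_t> work;

    for( std::size_t i = 0; i < n; ++i )
    {
        if( g[ i ].use_count != internal[ i ] )
        {
            reachable[ i ] = true;
            work.push_back( i );
        }
    }

    // Step 3: propagate reachability throughout the graph.
    while( !work.empty() )
    {
        std::size_t const i = work.back();
        work.pop_back();

        for( std::size_t k = 0; k < g[ i ].out.size(); ++k )
        {
            std::size_t const j = g[ i ].out[ k ];

            if( !reachable[ j ] )
            {
                reachable[ j ] = true;
                work.push_back( j );
            }
        }
    }

    std::vector<bool> unreachable( n );

    for( std::size_t i = 0; i < n; ++i )
    {
        unreachable[ i ] = !reachable[ i ];
    }

    return unreachable;
}
```

For example, two objects that own only each other each have an actual use count of 1 that matches the computed count, so both are reported as unreachable, whereas an object held by an external shared_ptr and everything it owns are reported as reachable.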