STL Complexity Specifications

STL container, algorithm, and concept specifications include asymptotic
complexity specifications. For example, iterator operations are
required to take constant time; that is, the time required by an
iterator operation should be no more than a fixed constant, independent
of the size of the container to which the iterator refers.

Clearly, a program will still function if one of its components ignores the
complexity specifications. Nonetheless, these specifications are an
important part of the interface between STL components and code that
uses them. If they are ignored, the performance of the resulting
program will often render it useless.

As an example, consider the STL vector container. Ignoring the complexity
specification, it is possible to implement vector using the
same underlying data structure as list, i.e. as a doubly linked
list. But for a vector of length 10,000, this would probably
slow down an average computation of v[i] by something like a
factor of 5,000. For a program that requires many vector
accesses, such as a typical numerical computation, this is likely to
change an execution time of minutes to days.
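The factor of roughly 5,000 can be checked with a small sketch. It assumes that indexing into a linked list starts from the head and follows one link per position, so reaching index i costs i link traversals, while vector indexing costs one step. The helper names average_list_steps and list_at are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <list>

// Hypothetical helper: average number of link traversals needed to reach
// a uniformly chosen index in a list of length n, walking from the head.
// Reaching index i costs i traversals under this model.
double average_list_steps(std::size_t n) {
    double total = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        total += static_cast<double>(i);
    }
    return total / static_cast<double>(n);
}

// Hypothetical helper: "v[i]" emulated on a list. std::next advances the
// iterator one link at a time, so this is linear in i, not constant.
int list_at(const std::list<int>& l, std::size_t i) {
    assert(i < l.size());
    return *std::next(l.begin(), static_cast<std::ptrdiff_t>(i));
}
```

For a length of 10,000, average_list_steps yields 4999.5 traversals per access, against a single step for a real vector.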

This does not preclude the use of STL algorithms in conjunction with
containers or iterators that do not meet the standard complexity
specifications. This is occasionally quite useful, especially if the
code is either not performance critical, or other requirements on the
container make the performance specifications unrealizable. But this
has two potential problems. First, the algorithm may no longer be the
right one, or even a reasonable one, for the problem. A different
algorithm may be better tailored to actual relative costs of the
container operations. Second, the algorithm is, of course, unlikely to
satisfy its specified complexity constraint.

The complexity specifications in STL are, of necessity, an
oversimplification. A full specification would describe exactly how the
running time of an operation varies with that of the operations it
invokes. The result would be rather unmanageable for the user, who
would have to keep track of large amounts of irrelevant detail. It
would also be overly constraining on the implementor, since overall
improvements on the existing algorithms may not satisfy such detailed
constraints.

Concept specifications (e.g. Forward Iterator or Container) specify
complexity requirements that should be met by all instances of the
concept. This is the minimum behavior required by operations (e.g.
sort) parameterized with respect to the concept. Any specific instance
(e.g. vector) is likely to perform better in at least some cases.

It is difficult to specify precisely when an algorithm satisfies a
performance constraint. Does copying a vector on a 16-bit
embedded processor take constant time? After all, the size of the
vector is limited to some value less than 65,536. Thus the number of
memory operations involved in the copy operation is certainly bounded
by a constant. It is even conceivable that the worst-case vector
copy time on this processor may be less than the worst-case time for a
single memory access on a machine with paged virtual memory.
Nonetheless, it would be intuitively wrong to describe a vector
copy or a list traversal as being a constant time operation.
Even on this machine, a vector implemented as a list is
unlikely to yield satisfactory performance. (Of course, the same is
true of an implementation that looped for a second on every vector
access, although that would clearly run in constant time. The point
here is to communicate the proper intent between implementor and user,
not to guard against malicious or silly implementations.)

Fundamentally, it is difficult to define the notion of asymptotic
algorithm complexity precisely for real computer hardware instead of an
abstract machine model. Thus we settle for the following guidelines:

For an algorithm A to have running time O(f(n)),
there must be a corresponding algorithm A' that is correct
on machines with arbitrarily long pointer and size_t
types, such that A and A' perform essentially the
same sequence of operations on the actual hardware. (In simple cases A
and A' will be the same. In other cases A may have
been simplified with the knowledge that addresses are bounded.) For
inputs of sufficiently large size n, A' must take at
most time Cf(n), where C is a constant, independent
of both n and the address size. (Pointer, size_t,
and ptrdiff_t operations are presumed to take constant
time independent of their size.)

All container or iterator complexity specifications refer to amortized
complexity. An individual operation may take longer than specified. But
any sufficiently long sequence of operations on the same container or
iterator will take at most as long as the corresponding sum of the
specified operation costs.
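As a rough illustration of amortized cost, the sketch below counts how often vector's push_back changes the capacity. Each such call is linear in the current size, yet it happens rarely enough that the whole sequence stays cheap. The function name count_reallocations is hypothetical, and the exact count depends on the library's growth policy, so no particular value should be relied on.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: performs n push_back calls and counts how many of
// them changed the capacity, i.e. triggered a reallocation whose cost is
// linear in the current size rather than constant.
std::size_t count_reallocations(std::size_t n) {
    std::vector<int> v;
    std::size_t reallocations = 0;
    std::size_t last_capacity = v.capacity();
    for (std::size_t i = 0; i < n; ++i) {
        v.push_back(static_cast<int>(i));
        if (v.capacity() != last_capacity) {
            ++reallocations;
            last_capacity = v.capacity();
        }
    }
    assert(v.size() == n);
    return reallocations;
}
```

On implementations that grow the capacity geometrically, the count grows only logarithmically with n, which is why the expensive calls do not dominate the total cost of the sequence.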

Algorithms specify either worst-case or average-case performance, and
identify which. Unless otherwise stated, averages assume that container
elements are chosen from a finite type with more possible values than
the size of the container, and that container elements are
independently and uniformly distributed.

A complexity specification for an operation f assumes that
operations invoked by f require at most the specified
runtime. But algorithms generally remain appropriate if the invoked
operations are no more than a logarithmic factor slower than specified
in the expected case.

If operations are more expensive than assumed by a function F
in the current STL, then F will slow down at most in
proportion to the added cost. Any future operations that fail to
satisfy this property will make that explicit.

To make this precise, assume F is specified to use time f(m)
for input of size m. F uses operations G_k, with specified running
times g_k(n) on input size n. If F is used in a context in which
each G_k is slower than expected by at most a factor h(n), then F
slows down by at most a factor h(m). This holds because none of the
current algorithms ever apply the operations G_k to inputs
significantly larger than m.
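The proportional slowdown can be sketched with an abstract cost model rather than real timings. In the example below, F is std::sort and the only operation G that is charged is the comparison; the slowdown factor h is taken to be a constant for simplicity. ChargedLess and sort_cost are hypothetical names, and the input data is an arbitrary fixed sequence.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical cost model: each comparison is charged unit_cost abstract
// time units, so unit_cost plays the role of the slowdown factor h.
struct ChargedLess {
    std::size_t unit_cost;
    std::size_t* total;  // shared counter; std::sort may copy the functor
    bool operator()(int a, int b) const {
        *total += unit_cost;
        return a < b;
    }
};

// Sorts a fixed m-element input and returns the total charged cost of
// all comparisons made by std::sort.
std::size_t sort_cost(std::size_t m, std::size_t unit_cost) {
    assert(unit_cost > 0);
    std::vector<int> v(m);
    for (std::size_t i = 0; i < m; ++i) {
        v[i] = static_cast<int>((i * 7919u) % 1009u);  // arbitrary data
    }
    std::size_t total = 0;
    std::sort(v.begin(), v.end(), ChargedLess{unit_cost, &total});
    return total;
}
```

Because the input and the comparison results are identical in both runs, sort_cost(m, h) equals h times sort_cost(m, 1): the algorithm slows down exactly in proportion to the added cost of the operation it invokes.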