Using Boost Libraries

Some hints and notes regarding the use of the Boost libraries in Lumiera.

Notable Features

Some of the Boost libraries are especially worth mentioning. You should familiarise yourself with
those features, as we use them heavily throughout our code base. As it stands, the C and C++ languages
are quite lacking and outdated compared with today’s programming standards. To fill some of these gaps
and to compensate for compiler deficiencies, some members of the C++ committee and other very
knowledgeable programmers created a set of C++ libraries known as Boost. Several of the most
valuable additions were proposed for, and subsequently included in, the C++11 standard.

C++11 Features

In Lumiera, we rely heavily on some features that were proposed and matured in the Boost libraries
and have meanwhile been included in the current language standard. These features are thus now provided
through the standard library accompanying the compiler. For the sake of completeness, we mention
them here. In the past we used the Boost implementation of these facilities, but we no longer need
Boost for this purpose.

memory

The <memory> header (formerly <boost/smart_ptr.hpp> or <tr1/memory>) defines a family of smart pointers
to serve several needs of basic memory management. In almost all cases, they are superior to the deprecated
std::auto_ptr. When these templates are carefully combined with the RAII pattern, most concerns about memory
management, clean-up and error handling simply go away. Please make sure, though, that you understand how to
avoid circular references and that you consider the implications of parallelism.
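The following minimal sketch illustrates the circular reference issue mentioned above. The types Track and Clip and the function addClip are hypothetical and serve only as illustration. The owning direction uses shared_ptr, while the back-reference uses weak_ptr, so that no ownership cycle is formed.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

struct Track;   // hypothetical container type, for illustration only

// Hypothetical element type holding a back-reference to its container.
struct Clip {
    std::string name;
    std::weak_ptr<Track> owner;   // non-owning: does not keep the Track alive

    explicit Clip(std::string n) : name(std::move(n)) {}
};

struct Track {
    std::vector<std::shared_ptr<Clip>> clips;   // owning references
};

// Creates a clip, attaches it to the track and returns a shared handle.
std::shared_ptr<Clip> addClip(const std::shared_ptr<Track>& track, std::string name) {
    auto clip = std::make_shared<Clip>(std::move(name));
    clip->owner = track;            // weak back-reference, no ownership cycle
    track->clips.push_back(clip);   // the track shares ownership of the clip
    return clip;
}
```

Had Clip held a shared_ptr<Track> instead, the track and its clips would keep each other alive and never be released. With the weak back-reference, owner.lock() yields an empty pointer once the track is gone.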

functional

The std::function template adds generic functor objects to C++. In combination with the bind function
(which binds or ties an existing function invocation into a functor object), this allows you to “erase” (hide)
the difference between functions, function pointers and member functions at your interfaces and thus enables
building all sorts of closures, signals (generic callbacks) and notification services. Picking up these
concepts might be mind-bending at first, but it is certainly worth the effort in terms of programmer
productivity.
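As a minimal sketch of this type erasure, the hypothetical Notifier below stores its listeners as std::function objects. The names Notifier, Logger, countFreeCall and wireUp are invented for illustration. A free function, a bound member function and a lambda are all registered through the same interface.

```cpp
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical notification service: listeners are stored as type-erased functors.
class Notifier {
    std::vector<std::function<void(const std::string&)>> listeners_;
public:
    void subscribe(std::function<void(const std::string&)> listener) {
        listeners_.push_back(std::move(listener));
    }
    void emit(const std::string& message) const {
        for (const auto& listener : listeners_) listener(message);
    }
};

// Hypothetical receiver with a member function to be bound.
struct Logger {
    std::vector<std::string> lines;
    void record(const std::string& msg) { lines.push_back(msg); }
};

int freeCalls = 0;   // counts invocations of the free function below
void countFreeCall(const std::string&) { ++freeCalls; }

// Registers three different kinds of callables behind the same interface.
void wireUp(Notifier& notifier, Logger& logger) {
    // plain function pointer
    notifier.subscribe(&countFreeCall);
    // member function bound to an object
    notifier.subscribe(std::bind(&Logger::record, &logger, std::placeholders::_1));
    // lambda closure capturing the logger by reference
    notifier.subscribe([&logger](const std::string& m) { logger.record("lambda: " + m); });
}
```

The Notifier interface never mentions Logger or any concrete callable type. Note that the logger must outlive the notifier here, since it is captured by pointer and by reference.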

hashtables and hash functions

The unordered_* collection types fill a painful omission in the STL. To work properly, these collection
implementations need a way to calculate a hash value for every key (or for every entry, in the case of the
set container). The hash function to use can be given as an additional template parameter; there are also
some conventions for picking a hash function automatically. Currently (2014), we have two options for the
hash function implementation: std::hash or boost::hash.
(→ read more here…)
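The two ways of supplying a hash function can be sketched as follows, using std::hash only. The key type AssetKey and the functor IdOnlyHash are hypothetical. The mixing step is a widely used formula and is meant as an illustration, not as a recommendation for production use.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>

// Hypothetical key type, for illustration only.
struct AssetKey {
    std::string category;
    int id;

    bool operator==(const AssetKey& other) const {
        return id == other.id && category == other.category;
    }
};

// Option 1: specialise std::hash, which unordered containers pick up automatically.
namespace std {
    template<>
    struct hash<AssetKey> {
        std::size_t operator()(const AssetKey& key) const noexcept {
            std::size_t h = std::hash<std::string>()(key.category);
            // mix in the second field (a commonly used combining formula)
            h ^= std::hash<int>()(key.id) + 0x9e3779b9 + (h << 6) + (h >> 2);
            return h;
        }
    };
}

// Option 2: pass a hash functor explicitly as an additional template parameter.
struct IdOnlyHash {
    std::size_t operator()(const AssetKey& key) const noexcept {
        return std::hash<int>()(key.id);
    }
};

using AssetNames     = std::unordered_map<AssetKey, std::string>;              // uses std::hash<AssetKey>
using AssetNamesById = std::unordered_map<AssetKey, std::string, IdOnlyHash>;  // explicit functor
```

In both cases the key type also needs an equality comparison, because the container must tell apart keys that happen to fall into the same bucket.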

STATIC_ASSERT

A helper to check and enforce conditions regarding types at compile time (the static_assert keyword,
formerly the BOOST_STATIC_ASSERT macro). When an assertion fails, a compilation error is provoked, which
should at least give a clue towards the real problem guarded by the static assertion. It is good practice
to place an extended source code comment near the static assertion statement to help solve the actual issue.
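A minimal sketch of this practice, assuming a hypothetical pair of helpers packIntoSlot and unpackFromSlot that copy a small value byte-wise into a 64-bit slot. The comment next to the assertions explains what to do when they fire.

```cpp
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical helper: stores the raw bytes of a small value in a 64-bit slot.
template<typename T>
std::uint64_t packIntoSlot(T value) {
    // The payload is copied byte-wise into a fixed-size slot.
    // If one of these assertions fires, the type is either too large or has
    // non-trivial copy semantics; store a pointer or a handle instead.
    static_assert(sizeof(T) <= sizeof(std::uint64_t),
                  "payload type does not fit into a 64-bit slot");
    static_assert(std::is_trivially_copyable<T>::value,
                  "payload type must be trivially copyable");

    std::uint64_t slot = 0;
    std::memcpy(&slot, &value, sizeof(T));
    return slot;
}

// Hypothetical counterpart: retrieves the value stored by packIntoSlot.
template<typename T>
T unpackFromSlot(std::uint64_t slot) {
    static_assert(sizeof(T) <= sizeof(std::uint64_t),
                  "payload type does not fit into a 64-bit slot");
    T value;
    std::memcpy(&value, &slot, sizeof(T));
    return value;
}
```

Calling packIntoSlot with, say, a std::string would be rejected at compile time with the given message instead of causing undefined behaviour at runtime.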

Relevant Boost extensions

operators

The boost::operators library helps to build families of types/objects with consistent
algebraic properties. In particular, it eases building equality comparable, totally ordered,
additive and multiplicative entities: you are only required to provide some basic operators,
and the library will define all sorts of additional operations to take care of the logical
consequences, removing the need to write lots of boilerplate code.

lexical_cast

Converts numeric values to strings and back without much ado, as you’d expect from any decent language.

boost::format

String formatting with printf-style directives: values are interpolated into a template string for
formatted output, but in a typesafe way, using defined conversion operators and without the dangers of
the plain-C printf family of functions. But beware: boost::format is implemented on top of
the C++ output stream operations (<< and manipulators), which in turn are typically implemented based
on printf. You can expect it to be 5 to 10 times slower than the latter, and it has
quite some compilation overhead and size impact (→ see our own formatting front-end to reduce this overhead).

metaprogramming library

A very elaborate, and sometimes mind-bending, library and framework. While heavily used within
Boost to build the more advanced features, it seems too complicated and esoteric for general-purpose
and everyday use. Code written using the MPL tends to be very abstract and almost unreadable for
people without a mathematical background. In Lumiera, we try to avoid using the MPL directly. Instead, we
provide some metaprogramming helpers of our own (type lists and tuples), written in a simple LISP style,
which, hopefully, should be decipherable without having to learn an abstract academic
terminology and framework.

variant and any

These libraries provide a nice option for building data structures able to hold a mixture of
multiple types, especially types not directly related to each other. boost::variant is a
typesafe union record, while boost::any is able to hold a value of just any type you provide
at runtime, still with some degree of type safety when retrieving the stored values.
Both libraries are compellingly simple to use, yet add some overhead in terms of size,
runtime, and compile time.

regular expressions

Boost provides a full-blown regular expression library, supporting roughly the feature set of
Perl regular expressions. Usage and handling are somewhat brittle, though, when compared
with Perl, Python, Java, etc.
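To give an impression of the handling, here is a minimal sketch that parses a timecode of the form hh:mm:ss:ff. Note the assumption: it uses std::regex from the standard library, whose interface was derived from the Boost library; with Boost, the corresponding names (regex, smatch, regex_match) live in namespace boost. The function parseTimecode is hypothetical.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Hypothetical parser: returns the four numeric fields of "hh:mm:ss:ff",
// or an empty vector if the text does not match as a whole.
std::vector<int> parseTimecode(const std::string& text) {
    static const std::regex pattern(R"((\d\d):(\d\d):(\d\d):(\d\d))");

    std::vector<int> fields;
    std::smatch match;
    if (std::regex_match(text, match, pattern)) {
        // match[0] is the complete match, the capture groups start at index 1
        for (std::size_t i = 1; i < match.size(); ++i)
            fields.push_back(std::stoi(match[i].str()));
    }
    return fields;
}
```

Compared with the scripting languages mentioned above, the separate match object, the explicit sub-match indexing and the manual conversion of the captured text add noticeable ceremony.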

program-options and filesystem

Like the regular expression library, these two libraries just supply a familiar programming model for their tasks
(parsing the command line and navigating the filesystem), one which can be considered a quasi-standard today
and is available in pretty much the same style in Java, Python, Ruby, Perl and others.

Negative Impact

Most Boost libraries are header-only, and all of them make heavy use of template-related features of C++.
Thus, every inclusion of a Boost library might lead to increased compilation times. We pay that penalty
once per compilation unit (not per header). Still, using a Boost library within a header that is frequently
included throughout the code base can dangerously amplify that effect.

debug mode

Usually, when developing, we compile our code without optimisation and with full debugging information
switched on. Unfortunately, C++ templates were never designed to serve as a functional metaprogramming language
to start with, but that’s exactly what we’re (ab)using them for. The Boost libraries take this to quite
an extreme. As a result, lots and lots of debugging information is added to the object files,
mentioning each and every intermediary type created in the course of expanding the metaprogramming
facilities. Even seemingly simple things may result in object files several megabytes in size.

Fortunately, all of this overhead gets removed when stripping your executable and libraries (or when
compiling without debug information). So this is solely an issue relevant for the developers, as it increases
compilation, linking and startup times.

runtime overhead and template bloat

The core Boost libraries (the not-so-experimental ones) have a reputation for being of very high
quality. They’re written by experts with a deep understanding of the language, of the usual
implementations and of the performance implications. Mostly, those quite elaborate metaprogramming
techniques were chosen precisely to minimise runtime overhead or code size.

Since each instantiation of a template constitutes a completely new class, carelessly written
template code can lead to heavily bloated executables. Every instantiated function and every
class with virtual methods (i.e. with a VTable) adds to the weight. But this negative effect can
be balanced by the ability to reduce inline code. In my own experience, I’d be much
more concerned about my own code adding template bloat than about the Boost
libraries (those people know very well what they’re doing…)

some practical guidelines

Facilities like boost::format, boost::variant, boost::any, boost::lambda and the more elaborate
metaprogramming stuff add considerable weight. It is good advice to confine those features to the implementation
level: use them within individual translation units (*.cpp) where this makes sense, but don’t express
general interfaces in terms of those library constructs.

Actually, for the most relevant of these, namely boost::format and boost::variant, we have either
created a lightweight front-end or our own simplified implementation in the support library, leading
to a significant reduction in overall code size.

Beyond those somewhat problematic entities, there used to be several incredibly useful tools from the
Boost libraries which create only moderate overhead, nothing to be really concerned about. Fortunately,
most of these have meanwhile been adopted into the official standard library and can thus be used without
creating a dependency on Boost.

The functional tools (function, bind and friends), the hash functions, lexical_cast and the
regular expressions create a moderate overhead. They are probably fine in general-purpose code, but you should
be aware that there is a price tag, about the same as with many other STL features.

The smart pointers shared_ptr, weak_ptr, intrusive_ptr and unique_ptr are really indispensable and can be used
liberally everywhere. The same holds for enable_if and the type_traits. The impact of those features on
compilation times and code size is negligible, and the runtime overhead is zero compared to performing
the same work manually. (Obviously, a ref-counting pointer like shared_ptr typically has the size of two raw
pointers and incurs an overhead on each creation, copy and disposal of a variable, but that’s beside the point
I’m trying to make here, and generally unavoidable.)
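As a small illustration of enable_if and the type_traits, the following sketch selects between two overloads at compile time. The function name classify is hypothetical. The selection leaves no trace at runtime beyond the call to the chosen overload.

```cpp
#include <string>
#include <type_traits>

// Participates in overload resolution only for arithmetic types (int, double, ...).
template<typename T>
typename std::enable_if<std::is_arithmetic<T>::value, std::string>::type
classify(T) {
    return "number";
}

// Participates in overload resolution for everything else.
template<typename T>
typename std::enable_if<!std::is_arithmetic<T>::value, std::string>::type
classify(const T&) {
    return "other";
}
```

Because the two conditions are mutually exclusive, exactly one overload is viable for any given argument type, and the decision is made entirely by the compiler.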