Introduction

We all know how painful memory management in C++ is, don't we? Of course, there are solutions like reference counting with smart pointers, but the programmer must take great care not to introduce cyclic references. Programmers who use garbage-collected languages laugh at us, don't they?

After the articles I wrote in previous years about C++ and garbage collection (see here and here), the problem of memory management came up again in one of my personal projects. I revisited my code, found the bugs in it, and redesigned it; the result is in the attached file.

LibGC is a thread-based garbage-collection library with the following features:

Library Design

The library is designed to be easy to use, especially for people who are used to special pointer classes for managing memory.

The library provides thread-based garbage collection. What does that mean? It means that each thread can have its own collector. The reason for this decision is two-fold:

I did not want to introduce dependencies on thread libraries, and

if an application has many threads, then all these threads will compete for the collector, thus slowing the application down.

I think it is better that each thread has its own collector; threads can exchange data with synchronized queues.

The library, as it is provided, only supports one thread, the main thread. If you want to provide thread support, you have to define a global function that returns a different collector for each thread, using thread local storage. It is not too difficult, and I will write an example for that later.
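As a rough sketch of that idea, assuming a C++11 compiler with thread_local: the names collector_stub and current_collector below are hypothetical stand-ins, not part of LibGC, and the real hook the library expects may look different.

```cpp
#include <cassert>
#include <thread>

// Hypothetical stand-in for the library's collector class; the real
// garbage_collector has a much richer interface.
struct collector_stub {
    int allocations = 0;  // per-thread bookkeeping, for illustration only
};

// Hypothetical global accessor: each thread that calls it gets its own
// instance, created lazily on first use in that thread.
collector_stub& current_collector() {
    thread_local collector_stub instance;
    return instance;
}

// Returns true if a second thread sees a collector distinct from ours.
bool collectors_are_per_thread() {
    collector_stub* mine = &current_collector();
    bool distinct = false;
    std::thread worker([mine, &distinct] {
        // Both instances are alive at this point, so the addresses differ.
        distinct = (&current_collector() != mine);
    });
    worker.join();
    return distinct;
}
```

Because every thread owns its instance, no locking is needed inside the accessor; data still has to cross threads through some synchronized channel, as noted above.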

The main classes of the library are:

garbage_collector - the class that does garbage collection and maintains the garbage collection context.

Using the Library

The library does not contain source files; it only contains header files. Unzip it in a directory of your choice, then include the file libgc.hpp in your project. The zip file contains the code documentation.

Declaring Garbage-Collected Classes

You have to use the class member_ptr<T> in order to declare garbage-collected pointers. These members must be initialized with an owner object, so that the collector has a precise picture of an object's layout. You can also initialize members:

Declaring Garbage-Collected Pointers

Global and stack pointers to garbage-collected objects must be declared with the class ptr<T>. This class automatically registers itself with the current garbage collector on construction, and unregisters itself on destruction, so that the collector knows where the pointers are. Registration uses a singly linked list as a stack, so it does not take much code. Here is an example:

int main() {
    ptr<Foo> foo1 = new Foo;
}
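The registration scheme described above might look roughly like the following. This is a minimal sketch, not LibGC's actual code: root_node, root_ptr, g_roots and root_count are illustrative names, and a single global list stands in for the per-collector list.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical intrusive list node embedded in every root pointer.
struct root_node {
    root_node* next;
};

// Head of the root list (one global here for brevity).
static root_node* g_roots = nullptr;

template <class T>
class root_ptr : private root_node {
public:
    explicit root_ptr(T* p = nullptr) : m_value(p) {
        next = g_roots;  // push: O(1), no allocation
        g_roots = this;
    }
    ~root_ptr() {
        g_roots = next;  // pop: assumes strict LIFO destruction order
    }
    root_ptr(const root_ptr&) = delete;
    root_ptr& operator=(const root_ptr&) = delete;
    T* get() const { return m_value; }

private:
    T* m_value;
};

// Counts the roots currently registered.
std::size_t root_count() {
    std::size_t n = 0;
    for (root_node* r = g_roots; r != nullptr; r = r->next) ++n;
    return n;
}

// Registers two roots in nested scopes and reports how many were visible.
std::size_t roots_inside_nested_scope() {
    int x = 1, y = 2;
    root_ptr<int> a(&x);
    std::size_t seen = 0;
    {
        root_ptr<int> b(&y);
        seen = root_count();  // both a and b are registered here
    }
    return seen;
}
```

The pop in the destructor is only correct when roots die in reverse order of construction, which holds for ordinary stack variables.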

Weak Pointers

Sometimes, it is necessary to have pointers that do not cause objects to be kept around, and that are nullified automatically when the pointed object is collected. These pointers are called weak. The library provides both member and root weak pointers. Usage is exactly the same as strong pointers:

When garbage collection happens, all root and member weak pointers that point to the collected objects are set to NULL.
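One way to picture this step, as a sketch under the assumption that the collector keeps a list of weak pointer locations and that objects carry a mark bit (the names object and clear_dead_weak_slots are hypothetical, not LibGC's):

```cpp
#include <cassert>
#include <vector>

// Hypothetical object header: only the mark bit matters here.
struct object {
    bool marked = false;
};

// Clears every weak slot whose target did not survive marking. Must run
// after the mark phase and before unmarked objects are destroyed.
void clear_dead_weak_slots(const std::vector<object**>& weak_slots) {
    for (object** slot : weak_slots) {
        if (*slot != nullptr && !(*slot)->marked) {
            *slot = nullptr;
        }
    }
}

// One reachable and one unreachable object, each seen by a weak pointer.
bool only_dead_target_is_cleared() {
    object alive, dead;
    alive.marked = true;  // pretend the mark phase reached it
    object* weak_to_alive = &alive;
    object* weak_to_dead = &dead;
    clear_dead_weak_slots({&weak_to_alive, &weak_to_dead});
    return weak_to_alive == &alive && weak_to_dead == nullptr;
}
```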

Arrays

The library also provides the capability of declaring garbage-collected arrays of garbage-collected objects. The declaration is very simple:

int main() {
ptr<array<Foo> > array1 = new array<Foo>[10];
}

You can then handle the array as a normal C++ array:

array1[0] = new Foo;
array1[0]->action();

Weak arrays are the same as arrays, but they contain weak pointers instead of strong pointers.

Customization

The functionality described above is not hard-coded into the garbage_collector class. In fact, that class only provides the absolute minimum interface for handling finalization and marking. The actual algorithms for finalization and marking are provided by user-defined classes, and the library's own classes are coded in the same way. See the documentation of that class for how to call the methods malloc, free, mark, and mark_weak in order to customize certain aspects of the library.

If you want super-fast scanning of member pointers, for example, without using special classes, then you can provide your own mark routine:

External Classes

The library provides a wrapper class for those classes that are not garbage-collected. For example, you can make an STL list garbage-collected, like this:

new GC<std::list<int> >;

The Garbage Collection Algorithm

The algorithm used internally is the mark & sweep algorithm: first the root set is scanned for live objects, then objects that are unreachable are finalized and destroyed. Upon exit, all remaining objects are finalized and destroyed.
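For readers unfamiliar with mark & sweep, here is a minimal, self-contained sketch of the general technique. It is not LibGC's implementation: node and tiny_heap are hypothetical types, finalization is reduced to a plain delete, and the root set is an explicit vector.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical heap object: a mark bit plus outgoing references.
struct node {
    bool marked = false;
    std::vector<node*> children;
};

struct tiny_heap {
    std::vector<node*> objects;  // every allocated object
    std::vector<node*> roots;    // the root set

    node* allocate() {
        objects.push_back(new node);
        return objects.back();
    }

    // Mark phase: flag everything reachable from n (iterative, cycle-safe).
    static void mark(node* n) {
        std::vector<node*> pending{n};
        while (!pending.empty()) {
            node* cur = pending.back();
            pending.pop_back();
            if (cur == nullptr || cur->marked) continue;
            cur->marked = true;
            for (node* c : cur->children) pending.push_back(c);
        }
    }

    // Sweep phase: destroy unmarked objects; returns how many were freed.
    std::size_t collect() {
        for (node* r : roots) mark(r);
        std::size_t freed = 0;
        std::vector<node*> survivors;
        for (node* o : objects) {
            if (o->marked) {
                o->marked = false;  // reset for the next cycle
                survivors.push_back(o);
            } else {
                delete o;
                ++freed;
            }
        }
        objects.swap(survivors);
        return freed;
    }

    // On exit, everything that remains is destroyed.
    ~tiny_heap() {
        for (node* o : objects) delete o;
    }
};

// Builds one rooted object plus an unreachable two-node cycle.
std::size_t collect_unreachable_cycle() {
    tiny_heap heap;
    node* live = heap.allocate();
    heap.roots.push_back(live);
    node* a = heap.allocate();
    node* b = heap.allocate();
    a->children.push_back(b);
    b->children.push_back(a);
    return heap.collect();
}
```

The cycle between a and b is reclaimed because neither is reachable from a root, which is exactly the case that plain reference counting cannot handle.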

Garbage collection happens automatically from inside the allocation function, when the number of allocated bytes exceeds the collector's limit. The method garbage_collector::limit() can be used to retrieve the number of bytes set as limit, and the method garbage_collector::set_limit(size_t size) can be used to set the limit. The default value for the limit is 64 MB.

The number of bytes currently allocated is returned by the method garbage_collector::alloc_size().
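The trigger logic can be illustrated with a toy collector that tracks only byte counts. The method names limit(), set_limit() and alloc_size() mirror the ones mentioned above; everything else (counting_collector, on_allocate, collections, and the choice to collect before the new bytes are added) is an assumption made for this sketch.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical collector that only counts bytes.
class counting_collector {
public:
    std::size_t limit() const { return m_limit; }
    void set_limit(std::size_t bytes) { m_limit = bytes; }
    std::size_t alloc_size() const { return m_allocated; }
    std::size_t collections() const { return m_collections; }

    // Called from the allocation path: collect first if this request
    // would push the total past the limit.
    void on_allocate(std::size_t bytes) {
        if (m_allocated + bytes > m_limit) collect();
        m_allocated += bytes;
    }

private:
    void collect() {
        ++m_collections;
        m_allocated = 0;  // stand-in: pretend everything was garbage
    }

    std::size_t m_limit = 64u * 1024u * 1024u;  // 64 MB default, as in the text
    std::size_t m_allocated = 0;
    std::size_t m_collections = 0;
};

// Lowers the limit and counts how often three allocations trigger a cycle.
std::size_t collections_after_three_allocations() {
    counting_collector gc;
    gc.set_limit(100);
    gc.on_allocate(60);  // total 60
    gc.on_allocate(60);  // 120 > 100: collect, then total 60
    gc.on_allocate(30);  // total 90
    return gc.collections();
}
```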

Semantics

The library follows the C++ semantics closely; be careful to initialize pointers (global/stack, members, arrays) to meaningful values before using them, otherwise your application will most certainly crash. The rationale is that, in C++, you do not pay for what you do not use, so this principle is followed by this library too.

You can always delete objects and arrays explicitly, using operator delete.

Performance

What I described above is, of course, not the fastest possible collector; I consider it, though, more than enough to solve my memory management problems. Maybe that is true for you too. If not, there are other, industrial-strength collectors around, like this one.

Restrictions

Of course, there are some restrictions:

You must never set a pointer to point to an object that is a member (as a value) of another object. Pointers shall point only to heap-allocated objects.

You should never allocate garbage-collected objects on the stack.

Conclusion

Not many words...only happy coding!

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.

Sorry for the delayed reply, I had some obligations to fulfill (family etc).

If people are interested in it, then why not?

But I think that the library has some bugs, or I'd rather say, some cases not covered; for example, when returning a root pointer as a result of a function.

Temporary objects returned by functions are destroyed after the lvalue is constructed, as you already know. So, the following code, in file garbage_collector.hpp, line 382, does not work:

void pop_ptr() {
    m_ptrs = m_ptrs->m_next;
}

If you apply the above to the following code:

ptr<Foo> getFoo() { return new Foo; }
int main() {
    ptr<Foo> foo = getFoo();
}

The variable 'foo' will be constructed before the return value of getFoo() is destroyed; thus, a root pointer cannot simply be popped from a singly linked list as if the list were a stack.

There are solutions to this:

a) root pointers may become nodes in a doubly linked list (perhaps using boost::intrusive).
b) the singly linked list of root pointers is searched, from the most recent to the oldest pointer, for the pointer to be removed. Usually, the returned ptr is just below the topmost element of the root pointer stack.
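Option b) could be sketched as follows. This is only an illustration with hypothetical names (root_node, root_list); it is not a patch against garbage_collector.hpp.

```cpp
#include <cassert>

// Hypothetical intrusive node for a root pointer.
struct root_node {
    root_node* next = nullptr;
};

struct root_list {
    root_node* head = nullptr;

    void push(root_node* n) {
        n->next = head;
        head = n;
    }

    // Unlinks n wherever it sits, searching from the newest root.
    // Returns how many nodes were inspected, or -1 if n is not registered.
    int remove(root_node* n) {
        int steps = 0;
        for (root_node** link = &head; *link != nullptr; link = &(*link)->next) {
            ++steps;
            if (*link == n) {
                *link = n->next;
                n->next = nullptr;
                return steps;
            }
        }
        return -1;
    }
};

// Replays the getFoo() scenario: the temporary registers first but dies first.
int steps_to_remove_returned_temporary() {
    root_list roots;
    root_node temporary, foo;
    roots.push(&temporary);  // return value of getFoo()
    roots.push(&foo);        // 'foo' is constructed while it is still alive
    int steps = roots.remove(&temporary);
    return (roots.head == &foo && foo.next == nullptr) ? steps : -1;
}
```

In the common case the node to remove is the first or second one, so the search stays cheap while keeping the list singly linked.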

This collector works in reverse: instead of starting from the roots, the search is started backwards, from the objects, towards the roots. If a root is found, then the whole object tree is reachable, otherwise objects in the tree are deleted.

With a little work, this collector may become a more viable solution than the collector of this article. The work required is to make it lazy, i.e. to not scan the object graph for reachable objects until required.

Furthermore, there is also the collector I have posted many moons ago, in boost vault, which does the following: it puts the pointers and allocated blocks in unsorted containers, and only sorts them (by address) when a collection is required. Then, the collection of pointers and blocks is traversed at the same time, and root pointers are those that are located outside of blocks.
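The sort-then-traverse idea can be sketched like this, under the assumptions that blocks do not overlap and that each pointer is identified by the address at which the pointer object itself is stored. The names block, find_roots and example_roots are hypothetical; this is not the code that was posted to the vault.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical record of one allocated block: [begin, begin + size).
struct block {
    std::uintptr_t begin;
    std::size_t size;
};

// 'pointers' holds the addresses where pointer objects live (not their
// targets). A pointer stored outside every block is a root.
std::vector<std::uintptr_t> find_roots(std::vector<std::uintptr_t> pointers,
                                       std::vector<block> blocks) {
    // Sorting is deferred until a collection is actually needed.
    std::sort(pointers.begin(), pointers.end());
    std::sort(blocks.begin(), blocks.end(),
              [](const block& a, const block& b) { return a.begin < b.begin; });

    std::vector<std::uintptr_t> roots;
    std::size_t b = 0;
    for (std::uintptr_t p : pointers) {
        // Advance past blocks that end at or before p.
        while (b < blocks.size() && blocks[b].begin + blocks[b].size <= p) ++b;
        bool inside = b < blocks.size() && blocks[b].begin <= p;
        if (!inside) roots.push_back(p);
    }
    return roots;
}

// Two blocks, four pointers: the ones at 90 and 500 lie outside both blocks.
std::vector<std::uintptr_t> example_roots() {
    std::vector<std::uintptr_t> pointers = {500, 120, 1010, 90};
    std::vector<block> blocks = {{1000, 32}, {100, 50}};
    return find_roots(pointers, blocks);
}
```

After the two sorts, both sequences are walked once in step, so the classification itself is linear in the number of pointers plus blocks.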

Finally, there was another collector that I had posted to the vault, but it is not there any more, that used bitfields: for each pointer or block, it allocated one bit, in a two-level structure of bitmaps. At the time of collection, this structure was traversed, and root pointers were located outside of blocks, as per the above algorithm.

Let me know if you're interested in any of these ideas, and we may cook something interesting, if you wish.