We examine the costs and benefits of a variety of copying garbage
collection (GC) mechanisms across multiple architectures and
programming languages. Our study covers low-level object
representation and copying issues as well as the mechanisms needed to
support more advanced techniques such as generational collection,
large object spaces, and type-segregated areas.

Our experiments are made possible by a novel performance analysis
tool, Oscar. Oscar allows us to capture snapshots of programming
language heaps that may then be used to replay garbage
collections. The replay program is self-contained and written in C,
which makes it easy to port to other architectures and to analyze with
standard performance analysis tools. Furthermore, it is possible to
study additional programming languages simply by instrumenting
existing implementations to capture heap snapshots.
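
To make the snapshot-and-replay idea concrete, the following is a
minimal sketch of replaying one copying collection over a captured
heap. It is not Oscar's actual code or snapshot format: the word-array
layout, the Space type, and the names forward and replay_collection
are all assumptions made for illustration. The sketch uses a standard
Cheney-style breadth-first copy.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical snapshot layout (an assumption, not Oscar's format):
 * the heap is an array of words. Each object is a header word holding
 * its field count (>= 0), followed by that many fields. A field is
 * either NIL or the word index of another object's header. */
#define NIL (-1L)

typedef struct {
    long  *words;     /* backing storage */
    size_t used;      /* words currently allocated */
    size_t capacity;  /* total words available */
} Space;

/* Copy the object at `ref` into to-space unless it was copied already.
 * A forwarded object has its header overwritten with -(new index) - 2,
 * so any header <= -2 encodes a forwarding address. */
long forward(Space *from, Space *to, long ref)
{
    if (ref == NIL)
        return NIL;

    long header = from->words[ref];
    if (header < 0)
        return -header - 2;               /* already copied */

    size_t size = (size_t)header + 1;     /* header plus fields */
    assert(to->used + size <= to->capacity);

    long new_ref = (long)to->used;
    memcpy(&to->words[to->used], &from->words[ref], size * sizeof(long));
    to->used += size;

    from->words[ref] = -new_ref - 2;      /* leave forwarding address */
    return new_ref;
}

/* Replay one collection: copy everything reachable from the roots and
 * return the number of live words. Roots are updated in place. */
size_t replay_collection(Space *from, Space *to, long *roots, size_t nroots)
{
    to->used = 0;
    for (size_t i = 0; i < nroots; i++)
        roots[i] = forward(from, to, roots[i]);

    /* Cheney scan: to-space itself serves as the work queue. */
    size_t scan = 0;
    while (scan < to->used) {
        long nfields = to->words[scan];
        for (long f = 1; f <= nfields; f++)
            to->words[scan + f] = forward(from, to, to->words[scan + f]);
        scan += (size_t)nfields + 1;
    }
    return to->used;
}
```

Because a replay of this kind depends only on the snapshot and the
root set, it can be compiled and timed on any platform without the
original language runtime, which is the property the paragraph above
relies on.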

In general, we found that careful implementation of GC mechanisms can
yield substantial gains: for a simple collector, we measured
improvements of as much as 95 percent. We then found that although
adding advanced features can incur a sizeable overhead (up to 15
percent), the net benefit is clearly positive, with additional gains
of up to 42 percent. We also found that results varied with the
platform and language. Machine characteristics such as cache
arrangement, instruction set (RISC/CISC), and register pool were
important. Across languages, average object size appeared to be the
most important factor.

The results of our experiments demonstrate the usefulness of a tool
like Oscar for studying GC performance. Without much overhead, we can
easily identify areas where programming language implementors could
collaborate with GC implementors to improve GC performance.