
Comparing C/C++ Compilers


Matthew compares nine popular C++ compilers in terms of their performance, features, and tools.

Oct03: Programmer's Toolchest

It's all about flexibility, portability, efficiency, and performance

Matthew is a consultant for Synesis Software, as well as author of the STLSoft libraries and the upcoming Imperfect C++ (Addison-Wesley, 2004). He can be contacted at matthew@synesis.com.au or http://stlsoft.org/.

Despite the advent of new programming languages and technologies, C++ is the workhorse for many developers, and is likely to remain so for a long time to come. The main reasons for C++'s prominence are its flexibility, portability, efficiency, and performance. Yes, even with the increase in processing power, software performance continues to be important, and C++ is a language that, when used correctly, provides superior performance in virtually any context.

In this article, I compare nine popular C++ compilers in terms of performance, features, and tools. The compilers are either exclusively Win32 or provide Win32 variants. I conducted all studies on a Windows XP Pro machine (single-processor, 2 GHz, 512 MB) with no other busy processes. The compilers I examine are: Borland C++, CodeWarrior, Digital Mars C++, GCC, Intel C++, Visual C++ 6.0, Visual C++ 7.0, Visual C++ 7.1, and Watcom C/C++.

As for bias, I confess to having soft spots for DigitalMars, Intel, and CodeWarrior, all of which have helped me in creating the STLSoft libraries (http://stlsoft.org/). Nevertheless, my day-to-day tool of choice is not one of these.

Compilation Time

In many situations, compilation time is not important. However, it is crucial on large systems or in development situations with frequent builds (such as Extreme Programming). When compiling/linking source, important factors include the number of inclusions, use of precompiled headers, complexity of code, aggressiveness of optimization (in both compilation and linking), and size of translation units. For this article, I considered these scenarios:

2. C2. A C file with a large number (500) of include files (compilation only; no optimizations).

3. C3. A C file with a large number (100) of nested include files, each of which is included by its prior file, and then by the main file, thereby testing the effects of multiple inclusions and include guards (compilation only; no optimizations).

5. whereis. A single complex C++ file with several template and operating-system library includes (compilation only; optimized for space). This tool provides powerful command-line searching and is included as a sample in the STLSoft libraries, exercising much STLSoft code.

I used Python scripts (available electronically; see "Resource Center," page 5) to generate the source files for scenarios 1-4. The source files are very large and not included with this article. The whereis source is available at http://stlsoft.org/. (You can get the most up-to-date binary from my company's web site, http://synesis.com.au/r_systools.html.) The source files for MMComBsc.dll contain too many proprietary goodies for me to include here, so you'll have to take my word for the figures reported.

I used ptime (http://synesis.com.au/r_systools.html) to get the results for scenarios 1-3 and 5 by executing each 15 times, discarding the two highest and one lowest results, and reporting an average of the rest. This reduces distortion from caching or startup. I executed scenarios 4, 6, and 7 using makefiles, timing the process via ptime. Table 1 presents the results.
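The averaging scheme described above can be sketched as follows; this is an illustrative reconstruction, not the actual script used for the article:

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Illustrative reconstruction of the averaging used for the timing
// figures: sort the samples, discard the two highest and the one
// lowest, and average the remainder.
double trimmed_mean(std::vector<double> samples)
{
    std::sort(samples.begin(), samples.end());
    double sum = 0.0;
    for (std::size_t i = 1; i + 2 < samples.size(); ++i)
    {
        sum += samples[i]; // skips index 0 and the top two samples
    }
    return sum / static_cast<double>(samples.size() - 3);
}
```

Discarding outliers before averaging is what keeps a single cold-cache first run, or an anomalously slow run, from distorting the reported figure.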

The "Did Not Compile" (DNC) notation for CodeWarrior in scenario C3 results from the compiler refusing to process the nested include depth of 100; tests showed that 30 was the limit. CodeWarrior help says, "To fix this error, study the logic behind your nested #includes. There's probably a way of dividing the large nested #includes into a series of smaller nests," which is probably true, but may not always be so. Watcom could not compile the whereis and MMComBsc scenarios because it doesn't support templates sufficiently.

There are some significant differences (up to two orders of magnitude in some cases) between performances. Borland comes off best, closely followed by VC++ 6, with Digital Mars and VC++ 7 about an equal third. CodeWarrior, GCC, and Intel are the sluggards of the group. (Naturally, it's not possible to create a single objective comparison criterion, even if you have an exhaustive set of scenarios. The way I've done it is to do three rankings. First, positions 1-9 are summed; the lowest value wins. Second, the first four positions are awarded 10, 7, 5, and 3 points; the highest value wins. Third, the first three positions are awarded 5, 3, and 1 points; the lowest value wins. Only when these rankings are in accord do I talk of "best," "second," and so on.)

VC++ and Watcom are streets ahead when precompilation is appropriate, that is, when most or all of the source is C++. VC++ 7 compiled the pch scenario 43 times faster than CodeWarrior! Also, VC++ 7.1 is slower than VC++ 7.0 in every test.

Speed of Generated Code

Next, I looked at the speed of generated code, restricting myself to these five scenarios:

1. Dhrystone. This benchmark (http://www.webopedia.com/TERM/D/Dhrystone.html) tests integer performance. Since it is CPU bound (that is, there is no I/O or resource allocation within the timed sections), it is a good test of pure compiled-code speed. The performance is measured as the number of Dhrystones per second (a bigger number is better).

2. Int2string. Converting integers to string form can be a costly business. Ten million integers (0 through 9,999,999) are converted to string form, and their string lengths summed (to prevent over-optimization). The two approaches I used employ different mechanisms for conversion:

· The compiler library's sprintf(). This performance reflects the difference in the efficiency of the compilers' libraries. (Intel uses VC++ 7.0 libraries.)

· STLSoft's integer_to_string<> template.
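The sprintf() variant of this measurement can be sketched as below; the function name and the 100-element iteration count in the usage note are illustrative (the article uses ten million):

```cpp
#include <cstdio>
#include <cstddef>

// Sketch of the sprintf() variant of the Int2string scenario: convert
// each integer to its decimal string form and sum the resulting
// string lengths so the conversions cannot be optimized away.
std::size_t sum_conversion_lengths(int count)
{
    char        buffer[21]; // ample for any 64-bit decimal value
    std::size_t total = 0;
    for (int i = 0; i != count; ++i)
    {
        // sprintf() returns the number of characters written
        total += static_cast<std::size_t>(std::sprintf(buffer, "%d", i));
    }
    return total;
}
```

Summing (and presumably using) the lengths is the standard trick to stop an aggressive optimizer from eliding the entire loop as dead code.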

3. StringTok. This generates a large set of strings to tokenize, using ";" as the delimiter. It tokenizes the string, then iterates over the sequence totaling the token lengths. (It avoids over-optimization by the compiler, but maintains consistency of test data between compilers by pseudorandomizing based on the Win32 GetVersion() function, which returns the same value for all programs because they're run on one test system.) I used the boost::tokenizer<> (http://boost.org/) and stlsoft::string_tokeniser<> (http://stlsoft.org/) tokenizer libraries.
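The measured step of the StringTok scenario can be illustrated library-neutrally; this sketch splits on the delimiter by hand rather than using boost::tokenizer<> or stlsoft::string_tokeniser<>, whose APIs do the equivalent work:

```cpp
#include <cstddef>
#include <string>

// Library-neutral sketch of the StringTok measurement: split the
// input on a delimiter and total the token lengths (the summing,
// as in Int2string, prevents over-optimization).
std::size_t sum_token_lengths(std::string const& s, char delim)
{
    std::size_t             total = 0;
    std::string::size_type  begin = 0;
    while (begin <= s.length())
    {
        std::string::size_type end = s.find(delim, begin);
        if (end == std::string::npos)
        {
            end = s.length();
        }
        total += end - begin; // length of this token (may be 0)
        begin  = end + 1;
    }
    return total;
}
```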

4. RectArr. To really hammer the ability of compilers to generate efficient code in complex template scenarios, I used STLSoft's fixed_array_3d<> 3D rectangular array template. I parameterized a value type of stlsoft::basic_simple_string<char> instead of std::basic_string<> to promote the effects of compiler efficiency and reduce the differences in their respective standard library implementations. The scenario creates a variable-sized 3D array (100×100×100) and iterates through all three parameter ranges, assigning a deterministic pseudorandom value to each element. Two approaches are taken.

The first approach conducts this enumeration once.

The second approach does it 10 times. Thus, the cost of allocating and initializing the 1 million members is amortized (and thus diluted) in the second variant, focusing instead on the costs involved with the array (template) element access methods.
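The kind of index arithmetic that fixed_array_3d<>'s element-access methods must compile down to can be sketched with a hand-rolled stand-in (the class name and dimensions here are illustrative, not STLSoft's actual implementation):

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Simplified stand-in for the RectArr scenario: a flat std::vector
// indexed as a 3D array. A 3D array template such as fixed_array_3d<>
// wraps this sort of row-major index arithmetic behind its element
// accessors; how well the compiler inlines it drives the results.
class flat_array_3d
{
public:
    flat_array_3d(std::size_t d0, std::size_t d1, std::size_t d2)
        : m_d1(d1)
        , m_d2(d2)
        , m_data(d0 * d1 * d2)
    {}

    std::string& at(std::size_t i, std::size_t j, std::size_t k)
    {
        // Row-major mapping of (i, j, k) onto the flat storage
        return m_data[(i * m_d1 + j) * m_d2 + k];
    }

private:
    std::size_t              m_d1, m_d2;
    std::vector<std::string> m_data;
};
```

Repeating the enumeration ten times, as the second approach does, shifts the measured cost from allocation/initialization to exactly this access path.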

5. zlib. This is a library featured in many applications (http://zlib.org/). It seemed a valuable, and uncontrived, performance test. The test program memory maps a given file, memory maps a corresponding output file, and then, within a timed loop, compresses the entire contents of the source file. I compiled both zlib 1.1.4 source and the test program with the nine compilers, and executed it on both large (65 MB) and small (149 KB) files.

Other than the Dhrystone scenario (the implementation I used has its own internal measurement mechanism), all scenarios derive their timing behavior from WinSTL's performance_counter class (see http://winstl.org/ and my article "Win32 Performance Measurement Options," Windows Developer Network, May 2003; http://www.windevnet.com/documents/win0305a/), which times the appropriate internal loop. Each has a warmup loop so that the results reflect pure code performance, rather than being influenced by operating system or other effects. All scenarios were optimized for speed (-O2, -opt speed, -o+speed, -O3, -O2, -O2, -O2, -ot). Table 2 presents the results.

Except for the Dhrystone scenario, I executed within a custom test harness that ran them nine times, discarded the highest and lowest times, and averaged the remainder. The source code for all scenarios is available electronically.

The "DNC" for Digital Mars is because Digital Mars is not supported in Boost 1.30, which I was using. Boost/Digital Mars compatibility is underway, and may be complete as you read this article. The multiple DNC entries for Watcom reflect its general lack of template support.

Intel is streets ahead of the rest, being fastest in two scenarios and second in five. (Indeed, its only poor performance is in the Int2String(sprintf) scenario, in which its performance is heavily dependent on the VC++ 7.0 run-time library's sprintf().) Second come Digital Mars, VC++ 7.0, and VC++ 7.1, all about even. Considering that Digital Mars has the Boost no-show, it's a creditable overall performance.

By virtue of its no-show in five scenarios, and very poor performance in two others, Watcom takes the wooden spoon. However, it wins the Int2String(sprintf()) scenario, so things aren't all bad. Borland and CodeWarrior do well in a few scenarios (Borland is quickest in zlib large) but are let down in other areas. GCC performs badly all around, except for the two STLSoft variants.

It's worth noting the differences between the variants of the Int2String and StringTok scenarios. Using STLSoft's integer_to_string<> template provides significant performance advantages, with execution times being between 15 and 55 percent of those of sprintf(). The string tokenizers exhibit considerable differences: The execution time of STLSoft's tokenizer is between 6 and 26 percent of Boost's.

Size of Generated Code

Execution speed is not always more important than size, nor do speed optimizations always produce faster-executing processes, since larger code is more likely to incur cache misses and require consequent virtual-memory activity by the operating system. (I always optimize for size, and only for speed based on the results of testing. I'm in good company: In Debugging Applications for .NET and Windows, John Robbins reports that Microsoft optimizes for size on all operating-system components.)

In any event, you always prefer smaller code. In Table 3, which focuses on module size, VC++ wins hands down. VC++ 7.0 produces the smallest code, followed by VC++ 7.1, and then VC++ 6.0. Intel, Digital Mars, and Watcom acquit themselves reasonably well, taking one scenario each. Borland and CodeWarrior don't do too badly, except where it really matters in the one sizable, real-world project. The jaw-dropping miscreant is GCC, with modules up to 10 times the size of the leader in some scenarios.

Language Support

Compiler support for language features is also important. Since there are a huge number of features that are (not) supported by modern C++ compilers, I focus on those I know of and am interested in; see Table 4.

Having wchar_t as a built-in keyword is not that important, since it can be easily, portably, and robustly synthesized via the preprocessor, usually with a typedef from unsigned short. However, this does reduce overloadability. The __func__ predefined identifier is nice for debugging infrastructure, but again, there are workarounds.
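The synthesis, and its cost, can be sketched as follows; HAS_NATIVE_WCHAR_T is a hypothetical configuration macro, not one defined by any particular compiler:

```cpp
// Sketch of synthesizing a wide-character type on a compiler that
// lacks wchar_t as a built-in keyword. HAS_NATIVE_WCHAR_T is a
// hypothetical config macro. The overloadability cost noted in the
// text: with the typedef, functions overloaded on wide_char_t and
// on unsigned short are the same function.
#if defined(HAS_NATIVE_WCHAR_T)
typedef wchar_t        wide_char_t;
#else
typedef unsigned short wide_char_t;
#endif
```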

The importance of floating-point precision is not as easily dismissed (see "How Java's Floating-Point Hurts Everybody Everywhere," by William Kahan and Joseph Darcy, http://http.cs.berkeley.edu/~wkahan/JAVAhurt.pdf). Only Borland, Digital Mars, GCC, and Intel (with the option -Qlong_double) provide long doubles that match the Intel architecture's 80-bit capabilities. For serious numerists, this will be important.

Static assertions are also important, since they facilitate checking of invariants at compile time, rather than run time. They are based on the illegality of zero or negative array dimensions (int ar[0];, for example) and are usually wrapped up in a macro such as:

#define stlsoft_static_assert(_x)  \
        do { typedef int ai[(_x) ? 1 : 0]; } while(0)

Digital Mars is the only compiler that does not support them, although it will do so from Version 8.35 onwards. Note that neither Borland 5.5(1) nor 5.6 is able to optimize them out of the code, leading to performance costs.
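A minimal usage sketch of the array-dimension trick follows; the function and the conditions checked are illustrative:

```cpp
// Usage sketch of the static-assertion macro: a true condition
// yields int ai[1] (legal); a false one yields int ai[0]
// (ill-formed), so violations are reported at compile time.
#define stlsoft_static_assert(_x)  do { typedef int ai[(_x) ? 1 : 0]; } while(0)

int check_invariants()
{
    stlsoft_static_assert(sizeof(long) >= sizeof(int));  // compiles
    /* stlsoft_static_assert(sizeof(char) > 1); */       // would not compile
    return 1;
}
```

Note that some compilers (GCC, for instance) accept zero-length arrays as an extension, which is why such macros are sometimes written with a negative dimension for the failing case instead.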

Variable-length arrays (VLAs) and dynamic application of the sizeof operator are C99 features. Only Digital Mars and GCC support them. Except for VC++ 6, all compilers support covariant return types.
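Covariant return types, mentioned above, look like this in practice; the Shape/Circle names are illustrative:

```cpp
// Covariant return types: an override may narrow the return type to
// a class derived from the base version's return type, so callers
// with the derived type in hand need no downcast.
struct Shape
{
    virtual ~Shape() {}
    virtual Shape* clone() const { return new Shape(*this); }
};

struct Circle : Shape
{
    virtual Circle* clone() const { return new Circle(*this); } // covariant
};
```

The canonical beneficiary is exactly this clone() idiom: Circle code can call clone() and receive a Circle* directly.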

Koenig lookup is a useful mechanism (see my article "Generalized String Manipulation: Access Shims and Type Tunneling," C/C++ Users Journal, August 2003), whereby operations associated with an element from one namespace may be automatically accessed from another without namespace qualification. VC++ (6 and 7), Watcom, and Intel (except with a nondefault option) do not support this.
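A minimal illustration of the mechanism (the geometry namespace and function names are invented for the example):

```cpp
namespace geometry
{
    struct point
    {
        int x;
        int y;
    };

    // Found via Koenig (argument-dependent) lookup: because the
    // argument is a geometry::point, the geometry namespace is
    // searched for abscissa() even without qualification.
    int abscissa(point const& p) { return p.x; }
}

int adl_demo()
{
    geometry::point p = { 3, 4 };
    return abscissa(p); // no geometry:: qualification required
}
```

This is what lets, say, a free operator<< defined alongside a class be found from any namespace; on a compiler without Koenig lookup, such calls require explicit qualification.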

The for-scoping rules were changed in C++98 (ISO/IEC C++ Standard, 1998), and all compilers except VC++ 6 and Watcom support the new syntax. Interestingly, Intel gives a warning when the scoping is used correctly but in a way that would fail to compile under the old rule. (In my opinion, you should never write code that relies on either the old or the new rules, so this should not occur in production code. This warning is useful for avoiding doing so.)

All the remaining issues involve templates. Though it does have some template support, Watcom fails on all remaining tests. VC++ 6 and 7 fail on the important facility of partial specialization, although Version 7.1 provides full support.

One VC++ 6 weirdness is that it won't accept the typename qualifier within the default parameters of a template, something other compilers (GCC and CodeWarrior, for example) mandate. You must resort to the preprocessor to support them all. (Conformance masochists can check out the definition of ss_typename_type_def_k in the STLSoft headers.)

Except for Digital Mars, VC++ 6.0, and Watcom, all compilers support template templates. This technique isn't currently widely used, but it is useful and will become more so; future compilers need to support it.
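A brief sketch of what a template template parameter buys you; the Stack class here is invented for illustration:

```cpp
#include <cstddef>
#include <deque>
#include <memory>
#include <vector>

// Template template parameter: Stack is parameterized on the
// container *template* itself (std::vector, std::deque, ...),
// rather than on an already-instantiated container type.
template <typename T, template <typename, typename> class Container>
class Stack
{
public:
    void        push(T const& t) { m_items.push_back(t); }
    std::size_t size() const     { return m_items.size(); }

private:
    Container<T, std::allocator<T> > m_items;
};
```

The payoff is that a user writes Stack<int, std::vector> and the class instantiates the container with a consistent element type, something an ordinary type parameter cannot guarantee.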

Overall, GCC is the clear winner. Since I think static assertions, 80-bit floating-point, and Koenig lookup are more important than VLAs and __func__, I would put Borland second, CodeWarrior and VC++ 7.1 third, Digital Mars fourth, and Intel fifth. I would rate all of these as good compilers. Next come the other VC++s and Watcom. Once the next version of Digital Mars is released, it will likely have a perfect score, too. However, you can expect likewise from other vendors soon, since language conformance has become a marketable feature once more.

One feature not included in Table 4 is typedef templates (see "The New C++: Typedef Templates," by Herb Sutter, C/C++ Users Journal, December 2002), which none of the compilers supports. VC++ 7.1 does report "error C2823: a typedef template is illegal," which suggests Microsoft intends to support it soon, possibly in Version 7.2.

Features

The Standard Library. Except for Digital Mars and Watcom, all compilers support the C++98 Standard Library without major problems. Digital Mars C++ comes with both SGI's STL and the latest STLport (http://www.stlport.org/), but has yet to update to the new header names (<iostream> rather than <iostream.h>) and to declare things within the std namespace. Both compilers are working toward full conformance.

ATL. As far as I know, all compilers other than GCC and Watcom support ATL, although I suspect that some only support Version 3, not 7. I'm not aware of any language-support issues preventing GCC from supporting ATL, but I haven't tested this. I am certain that Watcom could not support ATL, because of its current template deficiencies.

Boost. Boost is a library suite supported by all the compilers except Digital Mars (for which support is coming), Visual C++ 6 (which has limitations), and Watcom (which does not support it at all).

Managed C++. Only VC++ 7.x supports Managed C++. In the context of this article, however, Managed C++ isn't a bona fide feature because Managed C++ is not C++, any more than C is C++.

MFC. Despite showing its age, MFC is a widely used (and occasionally useful) framework. It is fully supported by Visual C++, Intel C++, CodeWarrior, and Digital Mars. I also understand it is available with Borland C++ Builder, though I have not used it with that compiler. To my knowledge, neither Watcom nor GCC support MFC.

STLSoft. Except for Watcom, all compilers support most of the STLSoft libraries, and even Watcom supports a sizable part (where the templates are within its capabilities). STLSoft is bundled with Digital Mars 8.34 upwards.

Win32/Platform SDK. All the Win32 compilers support the Win32 API (including many Microsoft language extensions, such as __declspec()), although some do not support the version that comes with the Platform SDK (various constructs, including inline assembler, are not recognized). Specifically, GCC and Watcom do not support the February 2003 version of the Platform SDK.

16-bit. Both Digital Mars and Watcom support 16-bit targets. Demand for this is low; but if you need it, it's good to be able to get it somewhere.

WTL. I recently did some work to get various compilers (including VC++ 5!) to work with WTL, but have not reached a definitive conclusion as to support. What I can say is that VC++ and Intel work with it out-of-the-box, CodeWarrior with a little work, and Borland and Digital Mars with a fair bit of effort.

I'd like to mention one thing I'm fond of: the Digital Mars -wc flag, which warns about all C-style casts within C++ compilation units. This is great when sifting through code to find areas that need "modernizing." (Also, the author of the Digital Mars compiler added this feature at my request, and did so within an amazingly short turnaround.) It would be nice to see this in other compilers.

Tools

Most of the compilers come with Integrated Development and Debugging Environments (IDDEs). Since editor/IDDE preferences are nearly as religious as those for bracing style, I couldn't hope to do a balanced job, even if I knew all the salient features of each environment. While I have some experience with all the IDDEs, and can say that they all provide the minimum functionality required to create/edit projects and source, and to debug executables, some are (to be candid) pretty basic. The Digital Mars and Watcom IDDEs aren't going to pry many programmers from their favorite environments.

Perhaps anachronistically, my editor of choice is the Visual Studio 97 IDDE, which I use because I know the keystrokes, can write useful wizards/plug-ins/macros for it, and because it's quick, doesn't require the mouse, can debug, and doesn't crash. I have experience with the IDDEs of C++ Builder, Digital Mars, CodeWarrior, Visual Studio 98, and Visual Studio .NET, and they either have too many or too few features, crash, or make me take my hands off the keyboard. Nonetheless, I know people who swear by them, so it's a case of to each his own.

Conclusion

Clearly, you cannot simply say compiler X is superior to all others. Most compilers are superior to others in one or more respects.

Borland is the fastest compiler, has good language support, and doesn't shame itself in any other regard. Though it's not one of my compilers of choice, its good warnings are a valuable contribution. However, it seems to share VC++'s predilection for internal compiler errors (ICEs) when things get too hard for it, which I find annoying.

For me, CodeWarrior is the last word in language rules, conformance, and error-message readability, and it is invoked whenever another compiler balks at something I've written that I think should be okay. Moreover, it has a good (though not great) IDDE, produces reasonably efficient code, and supports all the popular libraries. It would be the compiler I'd choose if I could have only one.

Digital Mars comes off very well, featuring in the top three for language support, compile time, and execution speed. (It's about fourth on execution size.) For now, it is let down by nonstandard Standard Library support, the occasional ICE, and an outdated IDDE, but it's free and you get such good service from its developer that these things are forgivable.

GCC has the best language support, which is commensurate with its having the widest collaboration of any open-source (and probably commercial as well) compiler in the business. However, this may also account for its poor efficiency characteristics. On my tested scenarios, it proved to be the slowest compiler, and produced the slowest and fattest code. (This can be contrasted with Digital Mars, which is written and maintained by one person.)

For speed of generated code, which is often the most important factor, Intel reigns supreme. It does well on the size of generated code, although it is let down somewhat by compilation speed. It scores well on language issues, although getting it to listen to warning-suppression commands is sometimes a trial. It does not have any kind of IDDE, but plugs into Visual Studio (98 and .NET) without problems, other than that precompiled headers and browse information go to pot. I'm always surprised when I speak with clients who are writing performance-sensitive software in a Microsoft environment (Win32 systems, Visual Studio) and yet have not considered using the Intel compiler. It's definitely an essential part of a programmer's armory.

Visual C++ does better than I expected. It produces the smallest code, is quick to compile, and does well in the performance of generated code (though it's still way off Intel's performance). It also, finally, has good language support; VC++ 7.1 even discovered some typename qualification errors in the STLSoft libraries that CodeWarrior and GCC missed. And, in my opinion, it has the best IDDE by a country mile. It is/has been badly let down by poor language support (now impressively, though belatedly, addressed in Version 7.1), long update cycles (it was five years between Versions 6 and 7), ridiculous gigabyte install sizes, and by assumptions in the compiler and the IDDE that you will be using Microsoft extensions, MFC, and so on. Furthermore, there are too many ICEs for my liking (even one should be an embarrassment), which makes you question how much testing goes into it. Finally, WTL should be a fully supported part of Visual Studio.NET, wizards and all.

Watcom is poor on template support, which made it hard to give it a fair comparison with all the other compilers. It did well in the areas in which it featured, although Dhrystone performance was extremely poor. I hope it will continue to develop in its Open Watcom guise, and return to the glories that were Watcom C/C++ in the mid '90s.

All C++ professionals and organizations should use more than one C++ compiler at all times. No compiler provides all the possible useful warnings, so compiling with multiple compilers affords much more comprehensive coverage. Furthermore, there's nothing better for making your code port-prepared than making it work with more than one compiler. Particularly bad in this regard have been the Borland and Visual C++ compilers. Many times, I've built substantial projects that worked well with one of these compilers, only to find in porting that I'd done nonstandard things, and even had bugs. So, I recommend that all C++ developers modify their work environments to incorporate three or four compilers; you'll find your code quality rises sharply. Given that many of these compilers are free (Borland 5.5(1), Digital Mars, GCC, Intel for Linux, Watcom), it seems remiss to do otherwise.
