Introduction

Updated 2007-03-25: See end of article for details.

Ever designed a simple utility program, such as a hex-dump program, only to find that it weighs in at a full 64K, even when optimized for size, when all it does is read a file and print to stdout? Ever wonder what happened to those good ol' DOS days when programs had to be small? When a COM file was limited to 64K? Or when you could write a bare-bones DOS-style protected mode operating system kernel in about 64K?

Well look no further. Here I will examine what causes this code bloat, and what can be done to fix it.

Intended Audience

This article is aimed at programmers who like to have control over every little detail. It is also geared towards small portable utility-like programs, where a DLL CRT is undesirable because of the need for a second file and installation program, and where the overhead of a statically linked CRT is much greater than the core program code.

Of course, once the CRT is replaced, programs that rely on specifics of the Microsoft CRT will fail. For instance, if you go digging into the FILE structure, or expect a certain header on your memory allocations, or rely on the buffering features of stdio, or use locales, runtime checks, or C++ exception handling, you can't use this library. This library is intended for small, simple programs, such as a hex-dump command line program or the many UNIX-style tools like cat or grep.

Many C/C++ purists will take offence at my suggestions, because the C runtime is, to them, something that shouldn't be tampered with. But bear with me, because although you might never use any of this article's information, it should at least give you an insight into how your program works.

Where's Bloat-o?

(really bad pun, I know...)

The source of this 'code bloat' is very easy to find by looking at a linker-generated map file. Here is a snippet from the demo program's map file:

As you can see, it includes two functions from my program, and over two hundred functions in the C Runtime (CRT).

Notice that one of the functions is even ___crtCorExitProcess, a function that is used by a C++/CLI program! Other gems include multithreading support, locales, and exception handling - none of which are used by my program!

And this is with Eliminate Unreferenced Data and COMDAT Folding on!

Where do I begin?

I will first highlight the various tasks performed by the C Runtime to give the reader a better understanding of the 'magic' that happens in C and C++.

Let's start by configuring the linker to Ignore Default Libraries. Compile. I was greeted with this:

mainCRTStartup

Where does your console program start? Did I hear you say main? If you did, you said what I would have said before journeying into the inner workings of the C Runtime.

Windows isn't nice enough to provide your app with a ready-made argc and argv. All it does is call a void function() specified in the EXE header. And by default, that function is called mainCRTStartup. Here is a simple example:

We start by creating argc and argv, which we later pass to main. But before we do that we have to take care of some things, like calling the constructors for static C++ objects.

The same thing happens in GUI programs, except the function is called WinMainCRTStartup. And for DLLs, the true entry point is _DllMainCRTStartup. Unicode programs look for wmainCRTStartup and wWinMainCRTStartup respectively. DllMain appears to stay the same.

C++ Magic

The constructors of static objects don't just call themselves. And Windows is certainly not going to call them for us. So we have to do it ourselves. What do I mean?

C++ programmers should automatically expect the output of this program to be:

StaticClass constructor
main
StaticClass destructor

Matt Pietrek has a great explanation in his article, mentioned earlier, under the heading "The Dark Underbelly of Constructors", so I will not bother going into that level of detail here. Suffice it to say that the compiler emits pointers to the constructor functions (actually thunks to constructor functions) in a special ".CRT" section in the object file, which is later merged with the ".data" section. By declaring a pointer to the start and the end of this section, the _initterm function is able to iterate over these pointers, calling each constructor in turn.
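The iteration that _initterm performs can be sketched like this. It is an illustrative approximation, not the CRT's source: in a real CRT replacement the two pointers would mark the start and end of the ".CRT" section, whereas here they are ordinary array bounds so the sketch runs anywhere. The names InitFunc and run_initializers are hypothetical.

```cpp
// Type of the pointers the compiler emits: functions that take no arguments
// and return nothing.
typedef void (*InitFunc)();

// Walks the half-open range [first, last) and calls every non-null entry in
// order. Null entries are skipped, so markers or gaps in the table are safe.
void run_initializers(InitFunc* first, InitFunc* last)
{
    for (InitFunc* it = first; it < last; ++it) {
        if (*it != nullptr)
            (*it)();
    }
}
```

The startup function would call this once, before main, with the addresses of the section's start and end markers.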

The constructor thunk function also registers an atexit callback to call the destructor of the object. Thus the mainCRTStartup function above goes to the trouble of creating an atexit table. The _doexit function is responsible for calling these functions.
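An atexit table of the kind described above might look roughly like the following. This is a sketch under assumptions, not the library's code: the table has a fixed, hypothetical capacity, and the functions are named tiny_atexit and tiny_doexit so they do not clash with the real atexit and _doexit.

```cpp
typedef void (*ExitFunc)();

const int kMaxExitFuncs = 32;                  // hypothetical fixed capacity
static ExitFunc g_exit_table[kMaxExitFuncs];
static int g_exit_count = 0;

// Registers f to be called at exit. Returns 0 on success and non-zero when
// the table is full, following the return convention of the standard atexit.
int tiny_atexit(ExitFunc f)
{
    if (g_exit_count >= kMaxExitFuncs) return -1;
    g_exit_table[g_exit_count++] = f;
    return 0;
}

// Calls the registered functions in reverse order of registration, so that
// objects are destroyed in the opposite order to their construction.
void tiny_doexit()
{
    while (g_exit_count > 0)
        g_exit_table[--g_exit_count]();
}
```

Running the table last-in, first-out is what gives static objects the destruction order that C++ programmers expect.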

Standard Functions

So now we have taken care of the program's entry point. What about the other functions?

printf and Family

One of the more complex tasks performed by the C Runtime is parsing the printf format string. (I'll admit it's not terribly complex; it's just non-trivial compared to strcmp.) To save space, we can offload this processing to the Windows function wvsprintf. No, that's not a wide-character version; the w probably stands for Windows.
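A printf built on wvsprintf could be sketched along these lines. This is an illustration rather than the library's implementation: tiny_printf is a hypothetical name, output goes straight to the standard output handle with WriteFile, and the non-Windows branch exists only so the sketch also compiles on other platforms.

```cpp
#include <cstdarg>
#ifdef _WIN32
#include <windows.h>
#else
#include <cstdio>    // fallback only, so the sketch builds off Windows too
#endif

// Hypothetical printf replacement: formats into a stack buffer, writes the
// bytes to standard output, and returns the number of characters formatted.
int tiny_printf(const char* format, ...)
{
    char buffer[1025];   // wvsprintf produces at most 1024 characters
    va_list args;
    va_start(args, format);
#ifdef _WIN32
    int length = wvsprintfA(buffer, format, args);
    DWORD written = 0;
    WriteFile(GetStdHandle(STD_OUTPUT_HANDLE), buffer,
              static_cast<DWORD>(length), &written, NULL);
#else
    int length = std::vsnprintf(buffer, sizeof buffer, format, args);
    if (length < 0) length = 0;          // formatting error: write nothing
    if (length > 1024) length = 1024;    // output was truncated to the buffer
    std::fwrite(buffer, 1, static_cast<std::size_t>(length), stdout);
#endif
    va_end(args);
    return length;
}
```

Two trade-offs come with this shortcut: wvsprintf limits the formatted output to 1024 characters, and it does not support floating-point conversions such as %f.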

File I/O

Originally I had planned to eschew the FILE structure altogether - and instead just use a HANDLE cast to a FILE*. But this would have only given me two bits of information. As I added functionality to the library this ideal solution became less ideal when I needed to store an end-of-file flag, text-mode flag, and possibly other data. And besides, not using the FILE structure means that the stdin, stdout, and stderr identifiers don't work! So now I (ab)use the FILE structure.

Because I cannot change the FILE structure itself (it is defined in stdio.h) I have to use its fields to work with my data. A very ugly solution. But this library isn't intended to be pretty. NOTE however, that this means code that relies on internal fields in the FILE structure will crash. But then again, you shouldn't be messing with internal data structures anyways, right?

fread and fwrite are substantially more complicated than this, because in text mode they must translate between the '\r\n' pairs stored in the file and the single '\n' that the program sees. For brevity, I will not discuss the algorithm - see the source code if you are interested.

String functions

Replacing the CRT means no more strlen, strcmp, or even memset. These must be implemented from scratch. Thankfully, they are not difficult to implement - just tedious. Care should be taken to handle NULL pointers and other special cases described in the MSDN documentation.
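As an indication of how little code is involved, here are straightforward versions of three of these functions. They are sketches, not the library's source, and carry a hypothetical tiny_ prefix so they can sit alongside the compiler's own versions. None of them checks for NULL, so callers must pass valid pointers.

```cpp
#include <cstddef>

// Returns the number of characters before the terminating NUL.
std::size_t tiny_strlen(const char* s)
{
    const char* end = s;
    while (*end != '\0') ++end;
    return static_cast<std::size_t>(end - s);
}

// Returns <0, 0, or >0; characters are compared as unsigned char, as the
// C standard requires for strcmp.
int tiny_strcmp(const char* a, const char* b)
{
    while (*a != '\0' && *a == *b) { ++a; ++b; }
    return static_cast<unsigned char>(*a) - static_cast<unsigned char>(*b);
}

// Fills count bytes at dest with the low byte of value and returns dest.
void* tiny_memset(void* dest, int value, std::size_t count)
{
    unsigned char* p = static_cast<unsigned char*>(dest);
    while (count-- > 0) *p++ = static_cast<unsigned char>(value);
    return dest;
}
```

Byte-at-a-time loops like these are small rather than fast, which matches the goals of a size-oriented library.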

Wide Character (Unicode) Support

This is the major new feature in this library. It is still under development and hasn't undergone extensive testing yet.

As suggested by Hans Dietrich I have started to add wide-character support to the library. Basically that means implementing wide-character versions of various functions.

Uppercase and lowercase

When dealing with ASCII, functions like isalpha, toupper, and strlwr are trivial to implement. But as soon as Unicode enters the picture, they become much more complicated. There are different rules for uppercase versus lowercase and alphabetic versus numeric, so some operating system help is in order. To fix this problem, the function GetStringTypeW is used to implement the isXYZ family of functions, and the functions CharUpper and CharLower are used to implement toupper and tolower, respectively.

File encoding

Up until VS2005 even the Unicode file library functions could only write ASCII characters. Output from wprintf, fwprintf, and fwputs in text mode is translated from Unicode before it is written to the file.

Because adding support for UTF-8, UTF-16, and other forms of file encoding would just add bloat to this library, I have made the decision to not include it. The behavior will remain compatible with the pre-VS2005 CRT. If you need to deal with file encodings, you probably need the full CRT anyway.

Why are you adding all this stuff? Why not keep it simple?

Simple: Only the stuff that you call is included in a release build!

But then why is Microsoft's CRT so bloated if you don't call much stuff? Again - because you do, but don't know it. The CRT startup code itself calls lots of functions that in turn call other functions - and a lot of it is garbage that isn't needed 90% of the time. Locales, exception handling, etc. have their place, but not in all programs. If your program doesn't use it, why should it have to pay the price of Microsoft's startup code using it?

The startup code and various functions in this CRT library are designed to rely on as little functionality as possible. Thus only the essentials are included.

Using the code

Add the tlibc (Tiny Libc) project to your project's solution, and add it as a referenced project. Alternatively, compile the library and add it to your project's linker options.

Because we are replacing the default CRT, C++ exception handling and SEH will not be handled properly. So don't use them! You will also need to turn off Buffer Security Check, set Runtime Checks to default, and disable Runtime Type Information.

Results

After recompiling the program with libctiny and the method above, the EXE jumped from a giant 64K to a much more reasonable 4K! (4096 bytes to be exact). For comparison, the entire code section of the linker map file is reproduced below:

2006-08-12

Submission to CodeProject

Comments, complaints, questions, etc. are welcome. Please let me know if you actually use this for something. If you need a function that is not included in this library, let me know and I will update the code. Comments on my 'comments' are also welcome.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.

About the Author

After a few years on the Dark Side, he reformed and now chants "Death to VB." His computer-related interests include C++, C#, and ASP.NET (in C#, of course). He writes operating systems in C++ and assembler as a hobby.