Running Code Before and After Main

There are many reasons why sometimes you'd like to run some code before, or after, the main function of a program or DLL (main, WinMain, and DllMain are equivalent for this article, so I will generalize them as "the main function" from now on). This is not as simple in C as it is in C++. In C++, you can just declare a global or static instance of an object anywhere, and its constructor and destructor get called before and after the main function, respectively.

This article describes how to use a partially documented feature of the Microsoft C runtime library's (CRT) initialization and termination code to make your code execute before and after the main function. But let's start with a simple example to show how it works. In Visual Studio, create a project for an empty Windows program, add a "prepostdemo.c" file, and paste the following code into it. There's no need to change settings, link extra libraries, or do anything special; this is an ordinary self-contained C program.
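The listing itself is not reproduced in this copy of the article, so the following is only a minimal sketch reconstructed from the description, not the author's original code. The function names PreMain1, PreMain2, PostMain1, and PostMain2 and the ".CRT$XIU" and ".CRT$XTU" segment names come from the text; the Trace helper, the table names, and the use of OutputDebugStringA are assumptions made for illustration. The Microsoft-specific parts are guarded so that the file still compiles elsewhere, where the tables are just ordinary arrays and nothing calls them automatically.

```c
/* prepostdemo.c: reconstruction sketch, not the original listing. */
#include <stddef.h>
#include <string.h>
#ifdef _MSC_VER
#include <windows.h>
#endif

static char   trace[128];      /* names of the functions that ran, in order */
static size_t trace_len;

/* Hypothetical helper: append a name to the trace buffer. */
static void Trace(const char *name)
{
    size_t n = strlen(name);
    if (trace_len + n + 2 <= sizeof trace) {
        memcpy(trace + trace_len, name, n);
        trace_len += n;
        trace[trace_len++] = ' ';
        trace[trace_len] = '\0';
    }
#ifdef _MSC_VER
    OutputDebugStringA(name);  /* shows up in the debugger's Output window */
    OutputDebugStringA("\n");
#endif
}

/* All functions return int; 0 means "no error" (see the addendum). */
int PreMain1(void)  { Trace("PreMain1");  return 0; }
int PreMain2(void)  { Trace("PreMain2");  return 0; }
int PostMain1(void) { Trace("PostMain1"); return 0; }
int PostMain2(void) { Trace("PostMain2"); return 0; }

typedef int (*PFI)(void);

#ifdef _MSC_VER
#pragma data_seg(".CRT$XIU")   /* user slot among the C initializers */
#endif
PFI pre_table[] = { PreMain1, PreMain2 };

#ifdef _MSC_VER
#pragma data_seg(".CRT$XTU")   /* user slot among the terminators */
#endif
PFI post_table[] = { PostMain1, PostMain2 };

#ifdef _MSC_VER
#pragma data_seg()             /* back to the default data segment */

/* Entry point, built only with the Microsoft toolchain. */
int WINAPI WinMain(HINSTANCE hInstance, HINSTANCE hPrevInstance,
                   LPSTR lpCmdLine, int nCmdShow)
{
    Trace("WinMain");
    MessageBoxA(NULL, trace, "prepostdemo", MB_OK);
    return 0;
}
#endif
```

Because each table is defined in one module, the order of its entries is fixed. The message box can only show what ran up to WinMain; the post-functions run after WinMain returns, so a breakpoint or the debugger's Output window is the easiest way to observe them.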

When you execute this program, you will see that PreMain1 and PreMain2 get called before WinMain, and PostMain1 and PostMain2 get called after WinMain (I used two pre- and post-functions here, but you can specify any number). The next section explains how it works.

How the CRT Runs "Optional" Code

Static and global C++ objects are constructed and destroyed by some obscure undocumented assembler functions (I won't go into that here). Parts of the C library, such as the stdio library, have to be initialized and terminated too. So, how does the runtime library detect whether you have any C++ modules, or whether you are using e.g. stdio? The answer is: It doesn't!

Moreover, it doesn't want to know. If the runtime library contained any references to, let's say, the stdio library's initialization and termination functions, the linker would link the stdio library (or at least part of it) into the executable file (EXE or DLL), even if your program never uses stdio. Still, if the program does use CRT code that needs initializing, that initialization needs to occur automatically, without the need to recompile the CRT. Here's how.

The CRT uses a function called _initterm that looks similar to the following function. The actual function is in crt0dat.c. All CRT code in this article was rewritten and paraphrased to prevent problems with copyright and licensing.
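Since the listing is missing here, the following is a minimal sketch of such a table walker, written from the description in the next paragraph rather than copied from crt0dat.c. The names walk_table, PFV, step, and calls are hypothetical; the real function is called _initterm.

```c
#include <stddef.h>

typedef void (*PFV)(void);   /* function taking nothing, returning nothing */

/* Call every non-NULL entry in [begin, end).
   The entry that 'end' points at is never called. */
void walk_table(PFV *begin, PFV *end)
{
    for (; begin < end; ++begin) {
        if (*begin != NULL) {
            (*begin)();
        }
    }
}

/* Tiny demo function so the walker has something to call. */
int calls;
void step(void) { ++calls; }
```

Given a table such as { NULL, step, NULL, step, step } and an end pointer at its fifth element, the walker calls step twice: the NULL entries are skipped and the entry at the end pointer is left alone.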

The function walks through a table of function pointers, skipping all the NULL pointers and calling all the non-NULL functions in succession. (Note that in the sample program at the start of this article, I used functions that return int, whereas _initterm calls the functions as if they return void. I will explain the discrepancy at the end of the article.) The parameters are the start of the table and the end of the table; the function pointer at the end pointer is not executed. Inside the CRT, _initterm is called at least four times: twice before, and twice after the call to the main function, similar to this:
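The original listing is missing here as well. In the real CRT, each start and end marker is a separate one-element table containing NULL, placed in its own segment (".CRT$XIA", ".CRT$XIZ", and so on) with #pragma data_seg, and only the linker makes them adjacent. That cannot be run portably, so the sketch below models each group as a single array whose first and last elements play the role of the placeholders. All names in it (crt_entry, xi, xc, xp, xt, note) are hypothetical.

```c
#include <stddef.h>

typedef void (*PFV)(void);

static char   order[16];   /* one letter per function, in call order */
static size_t order_len;

static void note(char c)
{
    if (order_len + 1 < sizeof order) {
        order[order_len++] = c;
        order[order_len] = '\0';
    }
}

static void c_init(void)   { note('I'); }  /* stands in for C library init   */
static void cpp_init(void) { note('C'); }  /* stands in for C++ constructors */
static void pre_term(void) { note('P'); }  /* stands in for pre-terminators  */
static void term(void)     { note('T'); }  /* stands in for terminators      */

/* Each array models one group as it looks after linking: element 0 is the
   start placeholder, the last element is the end placeholder. */
static PFV xi[] = { NULL, c_init,   NULL };
static PFV xc[] = { NULL, cpp_init, NULL };
static PFV xp[] = { NULL, pre_term, NULL };
static PFV xt[] = { NULL, term,     NULL };

static void walk_table(PFV *begin, PFV *end)
{
    for (; begin < end; ++begin)
        if (*begin != NULL)
            (*begin)();
}

/* Hypothetical stand-in for the CRT entry point. */
int crt_entry(int (*program_main)(void))
{
    int result;
    walk_table(&xi[0], &xi[2]);   /* C initializers   */
    walk_table(&xc[0], &xc[2]);   /* C++ initializers */
    result = program_main();
    walk_table(&xp[0], &xp[2]);   /* pre-terminators  */
    walk_table(&xt[0], &xt[2]);   /* terminators      */
    return result;
}

int demo_main(void) { note('M'); return 0; }
```

Calling crt_entry(demo_main) records the letters I, C, M, P, T in that order: two table walks before the main function and two after it.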

It looks as though the function gets called with pointers that don't seem to have anything to do with each other, and that point to tables that only contain NULL pointers, so they would be useless anyway (after all, the function skips NULL pointers). Code analysis tools such as Lint would probably have a field day with this code, but that's because they would probably skip the mysterious but essential #pragma lines, and because they are unaware of how the linking process works.

Link-Time Tables

Via the #pragma data_seg(...) command, you can tell the compiler to instruct the linker to store data in a different segment of the .exe or .dll file than the standard data segments. (The term "segment" reminds one of the old days of segmented memory, but has little to do with it anymore; it's just a term to define different sections of an executable file.) The linker gathers all the same-named segments from all object modules and stores them together in the .exe or .dll file as one block of data. There is no guarantee in which order the data ends up within the segments, but the linker always makes sure that the segments are stored in alphabetical order.

The CRT relies on the linker to build a table in the executable file that can be used by functions such as _initterm. The list below describes what is needed to make that work. You can use the same principles to build your own tables at link-time. If you're not going to use the _initterm trick, make sure that your segment names start with a period ('.') and are 8 characters or less, and of course that they don't clash with any reserved names (see the linker documentation) or with the segments that _initterm uses. You're pretty safe if you use names that start with something like ".USER" or ".USR", e.g. ".USER001".

Create at least two segments with names that are in ascending alphabetical order. The segment names should be similar enough that no other segments would accidentally end up in between them, but far enough apart that you can intentionally "plug in" other segments. The CRT uses pairs of 8-character names that end in 'A' and 'Z'; this leaves room for 24 segments of actual data in between.

Place some data in the first segment, as a placeholder for the start of the link-time table, and place some data in the last segment, as a placeholder for the end of the link-time table. The start and end segments need to have at least some data in them; the linker will discard empty segments. The data has to be initialized at compile time, even if it's NULL. Later on, I will explain why.

Only one module should put any data in the start and end segments because it's not predictable in what sequence that data is stored inside each segment, i.e. whether additional data in the start segment and end segment will end up between the placeholders or not.

Any module that wants to add data to the table can "plug in" that data by putting it into one or more segments that have names that end up (alphabetically) between the start segment and end segment.

There is no way to predict the order in which data from separate modules will end up in the segment. However, it's possible to define tables in a single C module to guarantee the order of their entries (as demonstrated in the sample program at the start of this article), or to "plug in" more than one segment between the start and end segments (this is what the CRT does to make sure the C++ library gets initialized before the C++ constructors are called).

The code should be aware that the linker may pad the data to align it to, for example, 4-byte or 16-byte boundaries. As long as the linker aligns the data on a multiple of what your code expects, and as long as your code knows how to skip NULLs or zeroes that are inserted by the linker, it will execute successfully.

Obviously, the programmer is responsible for making sure that all the data in the link-time table is of the same type. Nothing will stop you from putting a char into the table from one module and a 64-bit integer from another module.
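The rules above can be modeled in plain C. The sketch below is not linker code; it is only an illustration, under the article's description, of what the link step is said to do: order the contributions by segment name and lay their data out as one contiguous table with placeholders at both ends. The Contribution type, link_table, sum_table, and the ".USER00x" names are all hypothetical.

```c
#include <stdlib.h>
#include <string.h>

typedef struct {
    const char *segment;   /* segment name chosen by a module             */
    int         value;     /* one table entry; 0 is a placeholder/padding */
} Contribution;

static int by_segment_name(const void *a, const void *b)
{
    return strcmp(((const Contribution *)a)->segment,
                  ((const Contribution *)b)->segment);
}

/* Model of the link step: sort the contributions by segment name, then
   copy their data into one contiguous table. Returns the entry count. */
size_t link_table(Contribution *parts, size_t count, int *table)
{
    size_t i;
    qsort(parts, count, sizeof parts[0], by_segment_name);
    for (i = 0; i < count; ++i)
        table[i] = parts[i].value;
    return count;
}

/* Consumer of the table: walks [begin, end) and skips zero entries,
   whether they are placeholders or padding. */
int sum_table(const int *begin, const int *end)
{
    int sum = 0;
    for (; begin < end; ++begin)
        if (*begin != 0)
            sum += *begin;
    return sum;
}
```

With contributions named ".USER00Z" (0), ".USER00M" (20), ".USER00A" (0), and ".USER00C" (10), the modeled table comes out as 0, 10, 20, 0 regardless of the order in which the modules were listed. Note that qsort makes no promise about the relative order of equal names, which mirrors the warning that the order of data within one segment is unpredictable.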

Plugging into the CRT Link-Time Tables

As you can see from some of the code in this article, and from any map file that you can generate from the link options of Visual Studio, Microsoft named the segments that are used for initialization and termination ".CRT$Xpq" where p is the category or group ('I'=C init, 'C'=C++ init, 'P'=Pre-terminators and 'T'=Terminators), and q is the segment within that group: 'A' is the first and 'Z' is the last. Obviously, the order in which groups are stored in the executable file doesn't matter (the 'I' group is stored behind the 'C' group even though the 'I' group is used first), as long as the segments within each group are stored in sequence, and we can rely on the linker that they are.

The way that Microsoft uses the segments within each group is documented in several files of the CRT source. The most interesting CRT source file in that respect is defsects.inc, an assembler include file. This file declares a simple assembler macro to set the current segment to one of the ".CRT$X.." data segments, and follows it with a block of comments that show how the CRT engineers intended the segment groups to be used. All the initializer function pointers are stored in segments whose names end in 'C' or 'L', and all terminator function pointers are stored in segments whose names end in 'X'. That's convenient because the 'U' for "User" (which is also mentioned in defsects.inc) fits right in between. If you put function pointers in the 'U' segments (i.e. ".CRT$XIU" and ".CRT$XTU" as in the sample program at the start of this article, or ".CRT$XCU" and ".CRT$XPU" if you want your functions to be called after all C++ constructors have run and before the destructors run), the CRT will be in an initialized state when your functions run. This is important: even though your functions run outside main, they can still depend on the runtime library functions, both during initialization and termination!

Important Considerations

The defsects.inc file doesn't explicitly reserve space for user function pointers; it's just a comment, not actual code. Also, none of the MS CRT code uses any .CRT$X?U segments (where ? is any character). That's why I call this trick "partially documented": the CRT doesn't explicitly support the 'U' segments in its code. It just happens to work because the MS engineers were friendly enough to:

Name the start and end segments far enough apart to be able to plug in your own segments,

Name their own "plugin" segments in such a way that the 'U' at the end is always "safe" with respect to CRT initialization, and

Leave a hint in the defsects.inc comments about the usage of those 'U' segments.

You can use any number of functions in your program, and you can even let a function decide that a later function shouldn't run, by NULLing out its pointer in the link-time table. And the declaration doesn't even have to be a table: a static or global pointer will do fine too. Any number of modules can use the 'U' segments and add pointers to them, but it's not possible to predict in which order each module's 'U' segment data ends up in the .exe file, i.e. in which order your functions will get called.

If this is a problem, i.e. your initialization function does depend on another module to be initialized, you can let initializers call each other, and erase their own pointers so they only get called once, as demonstrated in the following sample code:
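The sample code is missing from this copy, so here is a minimal sketch of the idea under some assumptions: InitA and InitB are hypothetical initializers that would normally live in two separate modules, each next to its own pointer and data, and they are shown in one file only to keep the example self-contained. The sketch also assumes that the segment holding the pointers is writable at run time.

```c
#include <stddef.h>

typedef int (*PFI)(void);

int InitA(void);
int InitB(void);

#ifdef _MSC_VER
#pragma data_seg(".CRT$XIU")
#endif
PFI init_a_ptr = InitA;   /* a single pointer works as well as a table */
PFI init_b_ptr = InitB;
#ifdef _MSC_VER
#pragma data_seg()
#endif

int a_ready, b_ready;     /* the data that the initializers set up */
int a_runs,  b_runs;      /* how often each initializer did its work */

int InitA(void)
{
    if (init_a_ptr != NULL) {   /* still pending: do the work once */
        init_a_ptr = NULL;      /* erase the entry so the table walk skips it */
        a_ready = 1;
        ++a_runs;
    }
    return 0;
}

int InitB(void)
{
    if (init_b_ptr != NULL) {
        init_b_ptr = NULL;
        InitA();                /* B needs A, whatever the table order is */
        b_ready = a_ready;
        ++b_runs;
    }
    return 0;
}
```

Whichever function is reached first, each body runs exactly once. The same pointers can serve as a check during main: if init_a_ptr or init_b_ptr is still non-NULL at that point, the corresponding initializer never ran.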

Note that there is no race condition around resetting the pointers, i.e. there is no chance that another thread may call the same initialization/termination function between the point that it checks the pointer and resets it: All programs initialize as a process with exactly one thread, and the CRT guards the termination code against multi-threaded execution too. Of course if, in addition to using the 'U' segments, you also call your initialization code directly, when main is already running, you're on your own. But that shouldn't be an issue.

It is important that 'U' segment initialization and termination functions (and the pointers to them) are defined in the same module as the data that they're supposed to initialize; remember that the linker has to determine what functions and data refer to each other and therefore can't be discarded. You don't want to interfere with that. Also, it's probably not a good idea to enable function-level linking in the compiler options; the linker may incorrectly conclude that a function is never called and that the only pointer (or table) that refers to it is never used, so it might discard both the pointer (or table) and the function itself. You will never notice, because the program will start up just fine. If you want to guard against that, you can use the mechanism of resetting the pointers to assert (during main) that your initialization functions have run, but of course that won't work for any termination code.

As I mentioned above, code analyzers such as Lint may have trouble detecting that the function and the pointer or table are actually used. You may have to trick them in some way into not generating a warning, especially if you have colleagues who might read the warning, don't know what you did, and just delete the code. Once again, the program will compile and link just fine, with or without the magic code, so you won't detect that it's gone until it's runtime and you start wondering why your data isn't initialized or the program doesn't clean up after itself.

Finally, here are a couple of warnings about #pragma data_seg:

Don't forget to reset your data segment to the default (as demonstrated in all the code in this article). If other data would end up in the link-time function table, it would make for a messy crash.

Data in non-standard segments always has to be initialized in your source file, even when it's zero or NULL (e.g. void (*ptr)(void) = NULL;). The standard rule of "no explicit initialization means initialize to zero" only works in the standard uninitialized data segment (called ".bss"). The data in the .bss segment doesn't exist until runtime (i.e. it doesn't take up any space in the .exe or DLL file), and gets initialized at runtime; the data in all other segments has to exist at link time.
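Both warnings fit in a few lines. This is only a sketch: the ".USER001" name is the example suggested earlier in the article, and reserved_slot and ordinary_counter are hypothetical variables. The pragmas are guarded so that other compilers simply see two ordinary globals.

```c
#include <stddef.h>

typedef void (*PFV)(void);

#ifdef _MSC_VER
#pragma data_seg(".USER001")   /* switch to a custom segment */
#endif
PFV reserved_slot = NULL;      /* explicit initializer, even though it is NULL */
#ifdef _MSC_VER
#pragma data_seg()             /* restore the default before anything else */
#endif

int ordinary_counter;          /* no initializer: belongs in the default
                                  uninitialized data, not in the table */
```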

Conclusion

It's very easy to introduce problems that are hard to detect and debug using this trick (although you can put breakpoints into your 'U' segment functions), and that's probably one of the reasons why Microsoft didn't document it. But, it's useful in many situations where you just don't have access to the main function. Use with care!

Microsoft reserves the right to change the internal workings of the CRT and any other undocumented code at any time and without notice. Don't complain to them, or to me, if at some time in the future the _initterm trick stops working! However, the general idea of building tables at link-time via segment reordering will probably remain usable for many years to come.

Addendum: int Instead of void

During the editing of this article, I found out that newer versions of crt0dat.c (including the one found in the latest Platform SDK update of February, 2003) have an _initterm_e as well as an _initterm. This version of the CRT declares the functions in the ".CRT$XI?" segments as returning int instead of void (all the other functions are still expected to return void). The _initterm_e function checks if the return value from all the functions is zero. Later on, a message is displayed on stderr or in a MessageBox if there was a CRT component that failed to initialize. Interestingly, the stdio initialization function can fail too; I wonder if that error would ever show up on stderr.

To make sure that your code can be used by as many versions of the CRT as possible, you should declare your functions as returning int, and you should return 0, indicating "no error." If you want to, you can dig into the CRT sources to find which error codes you can use, but it's probably safer to just assume that all nonzero return values are reserved for CRT usage and should not be used. Depending on which CRT version your program will be linked with, the 0 return value will either be ignored, or checked and approved, so this will work on any CRT version known to me. Just to stay safe, I declared all 'U' segment functions as returning int, not just the ones in ".CRT$XIU", so as to minimize the chance that future CRT versions will create havoc by expecting a meaningful return value.
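To make the difference concrete, here is a minimal sketch of a checking table walker in the spirit of _initterm_e. It is written from the description above, not copied from the CRT, and it assumes that the walk stops at the first nonzero return value. The names walk_table_checked, init_ok, init_fail, and init_late are hypothetical, and the value 7 is an arbitrary nonzero value for the demonstration, not a real CRT error code.

```c
#include <stddef.h>

typedef int (*PFI)(void);

/* Call each non-NULL entry in [begin, end) until one returns nonzero.
   Returns 0 if every function succeeded, else the first nonzero result. */
int walk_table_checked(PFI *begin, PFI *end)
{
    int result = 0;
    for (; begin < end && result == 0; ++begin)
        if (*begin != NULL)
            result = (*begin)();
    return result;
}

int late_runs;                               /* counts calls to init_late */

int init_ok(void)   { return 0; }            /* what your functions should do */
int init_fail(void) { return 7; }            /* arbitrary nonzero: "failed"   */
int init_late(void) { ++late_runs; return 0; }
```

A function that returns 0 behaves the same under both walkers, which is why returning 0 from int functions is the safe choice.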
