The Process of Compilation in C++, Java and .Net

The process of compilation in an unmanaged environment (C, C++)

Pure C and C++ programs usually run in an unmanaged environment. Classes (if present at all) are typically separated into headers (files with the .h extension, containing the declaration of the class) and implementations (files with the .cpp extension, containing the definition of the class). Every source (.cpp) file forms a separate compilation unit. When you build your project, a few basic components translate your source into binary code. Let's see the following diagram (taken from ntu.edu.sg)

First comes the preprocessor, which is responsible for the inclusion of the header files and the expansion of any macros, if present.

Next is the compiler, which does not produce binary code directly. It produces assembly code (.s files), a human-readable, low-level representation of the machine instructions for the target architecture.

After that the assembler translates the assembly code into object code (.o and .obj files, machine instructions).

Finally, the linker takes all object files, combines them with any libraries they depend on, and produces an executable or a library.

The bottom line is that in the general case your code is first translated to assembly code, and then translated to machine instructions – binary code.

What are macros?

Macros in C and C++ are preprocessor directives that enable a number of techniques, including the definition of constants, the creation of new keywords, conditional compilation, et cetera. In general, their use in C++ is considered bad practice, mainly because of the possible abuse of the syntax changes they offer – in a sense, you can create your own dialect of the language, outside any standard. Macros also don't obey the normal rules of source compilation, simply because they are processed by the preprocessor, not the compiler.

The process of compilation in a managed environment (.Net, Java)

The process of compilation in managed environments is slightly different. Let's take .Net and Java as examples. The code we write in our favourite IDE is first checked by the IDE itself. Then it is checked again when it is compiled into libraries or executables, and finally it is checked at runtime. One common characteristic of managed environments is that the compiler does not produce binary code, but rather intermediate code, called MSIL (Microsoft Intermediate Language) in .Net and bytecode in Java.

After that, the MSIL is translated to binary code at runtime by the JIT (Just In Time) compiler, meaning that the code you write is only compiled to native instructions when it is actually used. This allows the CLR (Common Language Runtime) to optimize your code for the machine it actually runs on, achieving improved performance at the cost of increased startup time. You can also precompile your application using Ngen (Native Image Generator) to speed up startup, though you then lose the benefits of runtime optimization.

The .Net framework is just an abstraction on top of the Win32 core, which provides certain benefits like multi-language support, just-in-time optimizations, automatic memory management and, up to a point, improved security; an alternative to it can now be seen in WinRT, a completely custom solution. But that's another topic.

Pros and Cons of the Just In Time compiler

JIT compilation comes with a number of benefits, the biggest one, in my opinion, being performance. Just-in-time compilation allows the CLR (Common Language Runtime, playing a role similar to the assembler component in the native pipeline) to execute only the code that is necessary. For example, if you have a really big WPF application, it will not all be compiled at once. Instead, the CLR will translate different portions of your code to native instructions as they are needed, and it can do so very efficiently, because it is able to inspect the system "just in time" and produce optimized code, rather than following a predefined pattern. Unfortunately, one drawback is that the startup process is slower, which makes JIT a poor fit for applications that already take a long time to load.

Using NGen as an alternative to the Just In Time compiler in .Net

If Visual Studio were compiled with JIT alone, you would have to wait a few minutes for it to start. Instead, it is precompiled using NGen (Native Image Generator), which produces a plain old-fashioned binary executable. Of course, you don't get any of the benefits of JIT, but this is definitely the right choice when startup speed matters.

Hi there! My name is Kosta Hristov and I currently live in London, England. I've been working as a software engineer for the past 6 years on different mobile, desktop and web IT projects. I started this blog almost one year ago with the idea of helping developers from all around the world in their day-to-day programming tasks, sharing knowledge on various topics. If you find my articles interesting and you want to know more about me, feel free to contact me via the social links below. ;)

“The JIT compilation comes with a number of benefits, the biggest one in my opinion is the performance benefits you get.” — Surely you can’t be saying that you get performance benefits over C/C++?

Use of macros in C++ is not considered bad practice; in fact, it is common practice. For example, the #ifndef/#define/#endif pattern is seen in almost every header file to protect against multiple inclusion.

The MSIL is compiled and assembled before it’s executed; after that it is loaded and cached. Therefore it shouldn’t be slower. It’s probably not faster than C++, of course, but you get the same speed at runtime, which is a performance benefit.

The use of macros in C++ is considered a bad practice. The use of #ifdef etc is an exception, but that’s just a simple directive.