Difference between Generics in C#, Java and C++

Why were generics introduced?

Generics are yet another mechanism for reducing code duplication. They let us write a class once and reuse it with different type parameters, so there is only one codebase to maintain. The prime example is the data structures in C# and Java. Nowadays most data structure implementations are generic, meaning that you no longer have to box or cast values by hand. They just work, and they are strongly typed, with all the syntactic sugar you would expect.

But have you ever wondered why your favourite Java IDE just can’t give you enough debug information when working with generic lists? This article is about the differences between the implementations of generics in C#, Java and C++ (where they are actually called templates). Keep reading and you’ll find out why.

How do generics work in C#?

On the surface, the implementation of generics in C#, Java and C++ looks much the same. Under the hood, however, there are a few substantial differences that deserve attention.

When you declare a generic class, the MSIL code the compiler produces is not much different from what you wrote: the type parameters are preserved in the intermediate code. At runtime, however, the JIT compiler generates separate native code for each value type the class is used with. For example, if you use List<int> and List<float>, a different native version is generated for each of them, because Int32 and Single are structures, that is, value types. With reference types, by contrast, a large part of the native code is shared between the different uses, because every reference has the same size. Either way, there is no boxing/unboxing and there are no casts.

Another fact is that inside a generic method you are not allowed to call any operations on an unconstrained type parameter other than the ones defined in Object, because they are the only ones the compiler can guarantee to exist. If you call a method or operator that is not guaranteed to exist, the compiler reports an error, even if the type you eventually pass happens to implement it. If you want to call specific operations on a parameter, you need to add a constraint that names the base class or interface declaring them. Generics in C# are strongly typed and safe.

And all of that is perfectly normal, you say. It all makes sense, nothing new. Yes, but have you ever worked with Java generics or C++ templates? Let’s see how these mechanisms are implemented there.

Generics implementation in Java

As I said, on the surface there is almost no difference between generics in C# and generics in Java. Under the hood, however, things work very differently.

So, what do you think happens when you write ArrayList<Integer> in Java? (ArrayList<int> is not even allowed, because type arguments must be reference types.) Is a new class generated for that parameter? Nope. The compiler takes away the type parameter and substitutes it with Object everywhere (or with the bound, if the parameter has one), erasing the type information. And that is why, if you have ever wondered, you don’t have full debug information for generic lists in your favourite IDE. The compiler inserts all the casts for you, and primitive values are boxed and unboxed every time you access your list.

So you get the syntactic sugar, but you don’t get any performance out of it.

What about templates in C++?

The good news is that in C++ you also get separate classes for every parameterized use of a template class. And that’s cool from a performance point of view.

The number one difference between generics in C# and templates in C++ is that in C# the native code for each class is generated at runtime, while in C++ it is generated at compile time. And that’s perfectly normal: C++ has no intermediate language and no reflection.
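As a rough illustration of the "separate class per parameter, generated at compile time" point, here is a minimal C++ sketch. The Counter template is a hypothetical name invented for this example, not something from a real library. Each instantiation gets its own static member, and the compiler itself can confirm that the two instantiations are unrelated types.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical template, used only for illustration.
template <class T>
struct Counter {
    static int instances;  // one copy per instantiation, not one shared copy
    T value;
    explicit Counter(T v) : value(v) { ++instances; }
};

// Definition of the static member; the compiler emits one per instantiated type.
template <class T>
int Counter<T>::instances = 0;

// Checked entirely at compile time: the two instantiations are distinct classes.
static_assert(!std::is_same<Counter<int>, Counter<float>>::value,
              "each instantiation is a distinct class");
```

Creating two Counter<int> objects and one Counter<float> object bumps Counter<int>::instances by two and Counter<float>::instances by one, since the two classes share nothing.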

The second big difference is with constraints. When working with templates, you generally get much more freedom compared to C# and Java. Consider the following example:

    template <class T>
    T Addition(T operand1, T operand2) {
        T result;
        result = operand1 + operand2;
        return result;
    }

This will compile successfully, because at the point of declaration the compiler has no additional information about the intent; only the sketch is there. The actual check is performed later, when the template is instantiated with a concrete type (still at compile time). And even then, the only thing that matters is whether the + operator is defined for that type. In C# and Java, on the other hand, you can name the actual superclass or interface that declares the operations (explicit constraints), so the compiler can check the class itself instead of simply matching function names:

    class EmployeeList<T> where T : Employee, IEmployee
    {
        // ...
    }

And this will report an error in the declaration itself if you attempt to use any operation (method) that is not guaranteed by the constraints on T. Another drawback of templates is that the error messages are usually quite verbose and hard to read, and they expose the internals of the class instead of simply showing a constraint message.
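To make the instantiation-time check on the Addition template more concrete, here is a small sketch. Money and Employee are hypothetical types made up for this example. The template accepts any type that happens to provide a + operator (plus a default constructor, because of the "T result;" line), and a type without one is rejected only at the point where the template is actually used with it.

```cpp
#include <cassert>
#include <string>

// The template from the article, unchanged in behaviour.
template <class T>
T Addition(T operand1, T operand2) {
    T result;
    result = operand1 + operand2;
    return result;
}

// Hypothetical type, unrelated to int or std::string, that happens to define +.
struct Money {
    long cents = 0;
    Money() = default;
    explicit Money(long c) : cents(c) {}
    Money operator+(const Money& other) const {
        return Money(cents + other.cents);
    }
};

// Hypothetical type with no + operator. Declaring it next to the template
// causes no error, because nothing instantiates Addition<Employee>.
struct Employee {
    std::string name;
};

// Uncommenting the next line would fail to compile, and the error would
// point inside Addition, at the expression "operand1 + operand2":
// Employee sum = Addition(Employee{"Ann"}, Employee{"Bob"});
```

With these definitions, Addition(2, 3), Addition(std::string("ab"), std::string("cd")) and Addition(Money(150), Money(250)) all compile, even though the three types share no base class or interface.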

Summary

I believe the differences between the generic implementations in C#, Java and C++ are quite interesting, so I hope you learned something new today. To summarize the main points of the article:

Generics in C#

are generated at runtime

create separate native code per value-type parameter, and share most of it between reference types

support explicit constraints

Generics in Java

are handled at compile time through type erasure

use Object substitution and casting instead of multiple native versions of the class

support explicit constraints

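Templates in C++

are generated at compile time

create a separate class per template parameter

have no explicit constraints; operations are checked when the template is instantiated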

Hi there! My name is Kosta Hristov and I currently live in London, England. I've been working as a software engineer for the past 6 years on different mobile, desktop and web IT projects. I started this blog almost one year ago with the idea of helping developers from all around the world in their day-to-day programming tasks by sharing knowledge on various topics. If you find my articles interesting and you want to know more about me, feel free to contact me via the social links below. ;)

"For example, you can add or subtract two parameters that don’t have their + operators predefined." Wouldn't that generate a compiler error in C++? And by using a variable of a templated type in a context where a specific base type is expected (assignment, function parameter, …) you would enforce a compile-time constraint on the type.
Other fun stuff in C++: template parameters can also be integers and function pointers.

1. "At runtime, however, the JIT compiler actually compiles every use of the generic class into a separate native code." – ??
As far as I know, this is not true, or maybe only partially true – for basic primitive types like int, float, …
If I remember correctly from the articles on MSDN, only 1 version of native code is held & executed in memory. To be honest, I'm not sure. ;)

2. "Generics in Java: are generated at runtime" – ??!?
I don't know about new versions, but before Java 7 they were generated at compile time. That is one of the reasons why you can't make a true generic library in Java (without boxing), but you can make one in C#. AFAIK Java uses "type erasure", which means rewriting generic code into raw types during compilation.

It depends on whether the template is actually used in the compilation unit. There is a check that is performed at compile time, but only at the exact point of instantiation, not at declaration time (since there is no way to infer this information at that point). Also, the compiler only checks whether operations/methods with the same name are present in the class specified as the template parameter, while in Java and C# the actual class is checked for compliance (through the use of explicit constraints).

I've edited the article a little bit so it can be more accurate and descriptive.

It actually depends on the generic parameter. As far as I know, for reference types quite a large portion of the native codebase is reused, whilst for value types like int and float (in C#) a whole different version of the class is created. Reference types are actually simpler to handle than value types, since the reference itself always has the same size, usually that of a machine word (let's say, 32 bits).

Regarding your second remark, the generics are actually generated at runtime. :) That's because the JIT compiler produces the binary code once an assembly is started (unless NGen is used). But I know what you mean, and you are right.

Thanks for the comments ! :) If you have more remarks, don’t hesitate to write them down. ;)