Anders Hejlsberg, a distinguished engineer at Microsoft, led the team that
designed the C# (pronounced C Sharp) programming language. Hejlsberg first vaulted onto the
software world stage in the early eighties by creating a Pascal compiler for MS-DOS and CP/M.
A very young company called Borland soon hired Hejlsberg and bought his compiler, which was
thereafter marketed as Turbo Pascal.
At Borland, Hejlsberg continued to develop Turbo Pascal and eventually led the team
that designed Turbo Pascal's replacement: Delphi. In 1996, after 13 years with Borland, Hejlsberg
joined Microsoft, where he initially worked as an architect of Visual J++ and the Windows Foundation
Classes (WFC). Subsequently, Hejlsberg was chief designer of C# and a key participant in the
creation of the .NET framework. Currently, Anders Hejlsberg leads the continued development
of the C# programming language.

On July 30, 2003, Bruce Eckel, author of Thinking in C++ and Thinking in Java,
and Bill Venners, editor-in-chief of Artima.com, met with Anders Hejlsberg in his office at Microsoft
in Redmond, Washington. In this interview, which is being published in multiple installments on Artima.com
and on an audio CD-ROM to be released by Bruce Eckel,
Anders Hejlsberg discusses many design choices of the C# language and the .NET framework.

In Part I: The C# Design Process, Hejlsberg discusses
the process used by the team that designed C#, and the relative merits of usability
studies and good taste in language design.

In Part VI: Inappropriate Abstractions, Hejlsberg and other members of the C# team discuss
the trouble with distributed systems infrastructures that attempt to make the
network transparent, and object-relational mappings that attempt to make the
database invisible.

In Part VII: Generics in C#, Java, and C++, Hejlsberg compares C#'s generics implementation to
Java generics and C++ templates, describes constraints in C# generics, and describes
typing as a dial.

Interpreting and Adaptive Optimizations

Bill Venners: One difference between Java bytecodes and IL [Intermediate Language] is that Java bytecodes
have type information embedded in the instructions, and IL does not. For example, Java has
several add instructions: iadd adds two ints, ladd
adds two longs, fadd adds two floats, and
dadd adds two doubles. IL has add to add two
numbers, add.ovf to add two numbers and trap signed overflow, and
add.ovf.un to add two numbers and trap unsigned overflow. All of these
instructions pop two values off the top of the stack, add them, and push the result back. But
in the case of Java, the instruction indicates the type of the operands. A fadd
means two floats are sitting on the top of the stack. A ladd
means two longs are sitting on the top of the stack. By contrast, the
CLR's [Common Language Runtime] add instructions are polymorphic: they add the two values on the top of
the stack, whatever their type, although the trap overflow versions differentiate between
signed and unsigned. Basically, the engine running IL code must keep track of the types of
the values on the stack, so when it encounters an add, it knows which kind of
addition to perform.

I read that Microsoft decided that IL will always be compiled, never interpreted.
How does encoding type information in instructions help interpreters run more
efficiently?

Anders Hejlsberg: If an interpreter can just blindly do what the instructions say
without needing to track what's at the top of the stack, it can go faster. When it sees an
iadd, for example, the interpreter doesn't first have to figure out which kind of
add it is; it knows it's an integer add. Assuming someone has already verified that the stack
looks correct, it's safe to cut some time there, and you care about that for an interpreter. In
our case, though, we never intended to target an interpreted scenario with the CLR. We
intended to always JIT [Just-in-time compile], and for the purposes of the JIT, we needed to track the type
information anyway. Since we already have the type information, it doesn't actually buy us
anything to put it in the instructions.
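
[Editorial illustration, not part of the interview.] The difference between a typed instruction such as iadd and a polymorphic add can be sketched with a toy stack machine in Java. The class and method names below are hypothetical, and the code only assumes what the discussion above states: a typed opcode lets an interpreter act without inspecting its operands, while a polymorphic opcode needs the operand types from somewhere. A JIT would work those types out once at compile time; this toy version checks them at run time purely to make the cost visible.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical toy interpreter fragments; not real JVM or CLR code.
final class AddDispatchSketch {

    // Typed style (like Java's iadd): the opcode itself says "two ints",
    // so the interpreter can use a raw int stack and skip all type checks.
    static int runTypedIadd(int a, int b) {
        int[] stack = new int[2];
        int sp = 0;
        stack[sp++] = a;
        stack[sp++] = b;
        int right = stack[--sp];      // pop
        int left = stack[--sp];       // pop
        stack[sp++] = left + right;   // push the sum
        return stack[sp - 1];
    }

    // Polymorphic style (like IL's add): the opcode says only "add",
    // so the engine must know the operand types before choosing an addition.
    static Object runPolymorphicAdd(Object a, Object b) {
        Deque<Object> stack = new ArrayDeque<>();
        stack.push(a);
        stack.push(b);
        Object right = stack.pop();
        Object left = stack.pop();
        Object result;
        if (left instanceof Integer && right instanceof Integer) {
            int l = (Integer) left;
            int r = (Integer) right;
            result = l + r;           // integer addition
        } else if (left instanceof Long && right instanceof Long) {
            long l = (Long) left;
            long r = (Long) right;
            result = l + r;           // long addition
        } else if (left instanceof Double && right instanceof Double) {
            double l = (Double) left;
            double r = (Double) right;
            result = l + r;           // floating-point addition
        } else {
            throw new IllegalStateException("operand types do not match");
        }
        stack.push(result);
        return stack.peek();
    }

    public static void main(String[] args) {
        System.out.println(runTypedIadd(2, 3));
        System.out.println(runPolymorphicAdd(2L, 3L));
    }
}
```

The typed path does no inspection at all, which is the saving Hejlsberg describes for interpreters. The polymorphic path pays for a type test on every add unless, as in a JIT, the types are resolved once ahead of execution.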

Bill Venners: Many modern JVMs [Java virtual machines] do adaptive optimization, where they start by
interpreting bytecodes. They profile the app as it runs to find the 10% to 20% of the code
that is executed 80% to 90% of the time, then they compile that to native. They don't necessarily
just-in-time compile those bytecodes, though. A method's bytecodes can still be executed by
the interpreter as they are being compiled to native and optimized in the background. When
native code is ready, it can replace the bytecodes. By not targeting an
interpreted scenario, have you completely ruled out that approach to execution in a CLR?

Anders Hejlsberg: No, we haven't completely ruled that out. We can still interpret.
We're just not optimized for interpreting. We're not optimized for writing the
highest-performance interpreter that will only ever interpret. I don't think anyone does that
anymore. For a set-top box 10 years ago, that might have been interesting. But it's no longer
interesting. JIT technologies have gotten to the point where you can have multiple
possible JIT strategies. You can even imagine using a fast JIT that just rips quickly, and
then when we discover that we're executing a particular method all the time, using another
JIT that spends a little more time and does a better job of optimizing.
There's so much more you can do JIT-wise.
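
[Editorial illustration, not part of the interview.] The idea of starting with a cheap execution strategy and switching to a better-optimized one for hot methods can be sketched in Java as follows. Everything here is a simplifying assumption: the class name, the call-count threshold, and the use of two interchangeable functions to stand in for "quickly produced code" and "carefully optimized code". Real runtimes profile in far more detail and compile in the background.

```java
import java.util.function.IntUnaryOperator;

// Hypothetical model of tiered execution; not how any real runtime is implemented.
final class TieredMethodSketch {
    // Assumed threshold: calls served by the baseline tier before promotion.
    static final int HOT_THRESHOLD = 3;

    private final IntUnaryOperator baseline;   // stands in for fast-to-produce code
    private final IntUnaryOperator optimized;  // stands in for slower-to-produce, faster code
    private int calls = 0;
    private boolean promoted = false;

    TieredMethodSketch(IntUnaryOperator baseline, IntUnaryOperator optimized) {
        this.baseline = baseline;
        this.optimized = optimized;
    }

    int invoke(int arg) {
        if (promoted) {
            return optimized.applyAsInt(arg);
        }
        int result = baseline.applyAsInt(arg);
        // Simple profiling: once the method has proven hot, switch tiers.
        if (++calls >= HOT_THRESHOLD) {
            promoted = true;
        }
        return result;
    }

    boolean isPromoted() {
        return promoted;
    }

    // Baseline version: straightforward loop summing 1..n.
    static int sumLoop(int n) {
        int total = 0;
        for (int i = 1; i <= n; i++) {
            total += i;
        }
        return total;
    }

    // "Optimized" version: closed form with the same result for non-negative n.
    static int sumFormula(int n) {
        return n * (n + 1) / 2;
    }

    public static void main(String[] args) {
        TieredMethodSketch m =
            new TieredMethodSketch(TieredMethodSketch::sumLoop, TieredMethodSketch::sumFormula);
        for (int i = 0; i < 5; i++) {
            System.out.println(m.invoke(10) + " promoted=" + m.isPromoted());
        }
    }
}
```

Callers see the same results before and after promotion; only the cost of each call changes, which is the property any tiered scheme has to preserve.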

Bill Venners: When I asked you earlier (in Part IV) about
why non-virtual methods are the default
in C#, one of your reasons was performance. You said:

We can observe that as people write code in Java, they forget to mark their methods final.
Therefore, those methods are virtual. Because they're virtual, they don't perform as well.
There's just performance overhead associated with being a virtual method.

Another thing that happens in the adaptive optimizing JVMs is they'll inline virtual method
invocations, because a lot of times only one or two implementations are actually being used.

Anders Hejlsberg: They can never inline a virtual method invocation.

Bill Venners: My understanding is that these JVMs first check if the type of the object
on which a virtual method call is about to be made is the same as the one or two they expect,
and if so, they can just plow on ahead through the inlined code.

Anders Hejlsberg: Oh, yes. You can optimize for the case you saw last time and check
whether it is the same as the last one, and then you just jump straight there. But there's
always some overhead, though you can bring the overhead down to a fairly minimal level.
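
[Editorial illustration, not part of the interview.] The check-then-inline technique discussed in this exchange can be written out by hand in Java. The Shape, Square, and Circle types are hypothetical, and the guardedInline method is only a source-level picture of what an optimizer might effectively generate after observing that one receiver type dominates; it is not the output of any particular JVM.

```java
// Hypothetical example of guarded inlining of a virtual call.
final class GuardedInlineSketch {

    interface Shape {
        double area();
    }

    static final class Square implements Shape {
        final double side;
        Square(double side) { this.side = side; }
        @Override public double area() { return side * side; }
    }

    static final class Circle implements Shape {
        final double radius;
        Circle(double radius) { this.radius = radius; }
        @Override public double area() { return Math.PI * radius * radius; }
    }

    // What the source code says: an ordinary virtual call.
    static double virtualCall(Shape s) {
        return s.area();
    }

    // What an optimizer might effectively produce if profiling saw only Square.
    static double guardedInline(Shape s) {
        if (s.getClass() == Square.class) {   // the guard: the remaining overhead
            Square sq = (Square) s;
            return sq.side * sq.side;         // inlined body of Square.area()
        }
        return s.area();                      // unexpected type: fall back to virtual dispatch
    }

    public static void main(String[] args) {
        System.out.println(guardedInline(new Square(3.0)));
        System.out.println(guardedInline(new Circle(1.0)));
    }
}
```

The guard is the overhead Hejlsberg refers to: the expected case skips the dispatch but still pays for one type comparison, and any other type takes the ordinary virtual path.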