An Interview with Anders Hejlsberg: C# Language and Design

When Microsoft settled a lawsuit from Sun Microsystems over changes to the Java programming language, they turned to veteran language designer Anders Hejlsberg to design a new object-oriented language backed by a powerful virtual machine. The result was C#--and a replacement for both Visual C++ and Visual Basic within the Microsoft ecosystem. Although comparisons to Java are still inevitable in syntax, implementation, and semantics, the language itself has evolved past its roots, absorbing features from functional languages such as Haskell and ML.

Language and Design

You've created and maintained several languages. You started as an implementer of Turbo Pascal; is there a natural progression from implementer to designer?

Anders Hejlsberg: I think it's a very natural progression. The first compiler I wrote was for a subset of Pascal, and then Turbo Pascal was the first almost full implementation of Pascal. But Pascal was always meant as a teaching language and lacked a bunch of pretty common features that are necessary to write real-world apps. In order to be commercially viable, we immediately had to dabble in extending it in a variety of ways.

It's surprising that a teaching language would be so successful in bridging the gap between teaching and commercial success.

Anders: There are many different teaching languages. If you look at Niklaus Wirth's history--Niklaus Wirth designed Pascal, later Modula and Oberon--he always valued simplicity. Teaching languages can be teaching languages because they're good at teaching a particular concept, but they're not really real other than that; or they can be full-fledged languages that truly teach you the basics of programming. That was always Pascal's intent.

There seem to be two schools of thought. Some schools--MIT, for example--start with Scheme. Other schools seem to take a "practical" focus. For a while, they taught C++. Now it's Java, and some use C#. What would you do?

Anders: I've certainly always been in the more practical camp. I'm an engineer more than I'm a scientist, if you will. It's my belief that if you teach people something, teach them something they can use later for something practical.

Like always, the answer is not at the extreme. It's somewhere in between. Continually in the programming language practice, in the implementation of programming languages for the industry, we borrow from academia. Right now, we're seeing a great harvesting of ideas from functional programming which has been going on in academia for God knows how long. I think the magic here is you've got to do both.

Is your language-design philosophy to take ideas from where you can and make them practical?

Anders: Well, in a sense. I think you probably have to start with some guiding principles. Simplicity is always a good guiding principle. Also, I'm a great fan of evolving as opposed to just starting out new.

You might fall in love with one particular idea, and then in order to implement it, you go create a brand-new language that's great at this new thing. Then, by the way, the 90% that every language must have, it kind of sucks at. There's just so much of that, whereas if you can evolve an existing language--for example, with C# most recently we've really evolved it a lot toward functional programming--it's all gravy at that point, I feel. You have a huge user base that gets to just pick up on this stuff. There's a bit of a complexity tax, but it is certainly much less than having to learn a whole new language and a whole new execution environment in order to pick up a particular style of programming.

It's hard to draw the line between a language per se and its ecosystem.

Anders: Well, yeah, and certainly these days more and more. The language used to dominate your learning curve, if you go back say 20, 30 years. Learning a programming environment was all about learning the language. Then the language had a little runtime library. The OS had maybe a few things, if you could even get to the OS. Now you look at these gigantic frameworks that we have like .NET or Java, and these programming environments are so dominated by the sheer size of the framework APIs that the language itself is almost an afterthought. It's not entirely true, but it's certainly much more about the environment than it is about the language and its syntax.

Does that make the job of the library designer more important?

Anders: The platform designer's job becomes very important because where you really get maximum leverage here is if you can ensure longevity of the platform and the ability to implement multiple different languages on top of the platform, which is something that we've always put a lot of value in. .NET is engineered from the beginning as a multilanguage platform, and you see it now hosting all sorts of different languages on it--static languages, dynamic languages, functional languages, declarative languages like XAML, and what have you. Yet, underneath it all is the same framework, the same APIs, and the leverage there is just tremendous. If these were all autonomous silos, you'd just die a slow death in interop and resource consumption.

Do you favor a polyglot virtual machine in general?

Anders: I think it has to be that way. The way I look at it is, you go back to the good old 8-bit days, where you had 64K of memory. It was all about filling those 64K, and that happened pretty quickly. It's not like you were going to build systems for years there.

You could implement for a month or two and then that was that; 640K, maybe six months and you'd filled it up. Now it's basically a bottomless pit. Users demand more and more, and there's no way we can rewrite it all. It's about leveraging and making things that exist interoperate. Otherwise, you're just forever in this treadmill, just trying to do the basics.

If you can put a common substrate below it all and get a much higher degree of interoperability and efficiencies out of shared system services, then it's the way to go. Take interoperability between managed code and unmanaged code, for example. There are all sorts of challenges there. But better we solve it than every distinct programming environment trying to solve it. The most challenging kinds of apps to build are these hybrid apps where half of the app is managed and the other half is unmanaged, and you have garbage collection on one side of the fence and none on the other.

There seems to be a design goal in the JVM never to break backward compatibility with earlier versions of the bytecode. That limits certain design decisions they can make. They can make a design decision at the language level, but in the actual implementation of generics, for example, they have to do type erasure.

Anders: You know what? I think their design goal wasn't just to be backward compatible. You could add new bytecodes and still be backward compatible. Their design goal was to not do anything to the bytecode, to the VM at all. That is very different. Effectively, the design goal was no evolution. That totally limits you. In .NET, we had the backward compatibility design goal, so we added new capabilities, new metadata information. A few new instructions, new libraries, and so forth, but every .NET 1.0 API continued to run on .NET 2.0.

It's always puzzled me that they chose that path. I can understand how that gets you there right now on what's there, but if you look at the history of this industry, it's all about evolving. The minute you stop evolving, you've signed your own death sentence. It's just a matter of time.

Our choice to do reified generics versus erasure is one that I am supremely comfortable with, and it is paying off in spades. All of the work we did with LINQ would simply not be possible, I would argue, without reified generics. All of the dynamic stuff that we do in ASP.NET, all of the dynamic code generation we do in practically every product that we ship so deeply benefits from the fact that generics are truly represented at runtime and that there is symmetry between the compile time and runtime environment. That is just so important.
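For readers less familiar with the erasure side of this contrast, here is a minimal Java sketch of what type erasure means in practice. The class and method names (`ErasureDemo`, `sameRuntimeClass`) are hypothetical and chosen only for illustration. It shows the JVM behavior the question refers to, not the .NET implementation: two lists with different type arguments share a single runtime class, so the element type cannot be recovered from that class.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: names are hypothetical, not from the interview.
class ErasureDemo {
    // Returns true when both lists report the same runtime class,
    // which is what erasure implies: the type argument is not kept
    // as part of the object's class at runtime.
    static boolean sameRuntimeClass() {
        List<String> names = new ArrayList<>();
        List<Integer> counts = new ArrayList<>();
        return names.getClass() == counts.getClass();
    }

    public static void main(String[] args) {
        System.out.println(sameRuntimeClass());                            // prints "true"
        System.out.println(new ArrayList<String>().getClass().getName()); // prints "java.util.ArrayList"
    }
}
```

With reified generics, as Anders describes for .NET, the two instantiations remain distinct types at runtime, which is what lets runtime code generation and reflection see the type arguments.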

One of the criticisms of Delphi was that there was a strong reluctance to break code, which informed some language decisions.

Anders: Let's step back then. When you say break code, that must first of all mean that you're talking about an evolution of something. You're talking about a version N + 1 of something. You could argue that sometimes it's good to break code, but by and large, when you sum it up, I've never been able to justify breakage. The only arguments I hear for breakage, and they're not really good arguments, are "It's cleaner that way" or "It's architecturally more sound" or "It'll prepare us better for the future" or whatever. I go, "Well, you know, platforms live maybe 10, 15 years and then they cave in under their own weight, one way or the other."

They become more or less legacy, maybe 20 years. At that point, there's enough new around them and enough new without any overhead. If you're going to break it, then break it good. Break everything. Get to the very front of the line. Don't, like, move up a couple of slots. That's pointless.

That sounds like a game of leapfrog where the turns take 5 or 10 years.

Anders: You either play leapfrog or you are super cognizant of backward compatibility, and you bring your entire community with you every time.

Managed code does that to some degree. You can use your existing components in process.

Anders: Certainly from the inception of .NET we have remained backward compatible at every release. We fix some bugs that caused some code to break, but I mean there has to be some definition by which it is okay to break people's code.

In the name of security or in the name of correct program behavior or whatever, yes, we will sometimes break, but it is rare, and generally it reveals a design error in the user's program or something that they're actually glad to have fixed because they weren't aware that that was a problem. It's good at that point, but gratuitous breakage in the name of more beautiful code or whatever, I think it is a mistake. I've done that enough in my early years to know that that just gets you nowhere with your customers.

It's hard to make the argument from just good taste.

Anders: Yeah. Well, sorry. My good taste is not your good taste.

If you look back on the languages you were involved in, from Turbo Pascal through Delphi, J++, Cool, and C#, are there themes in your work? I can listen to early Mozart and then to his Requiem, and say, "Those are both distinctly Mozart."

Anders: Everything is a picture of the time that you're in. I've grown up with object orientation and whatever. Certainly ever since the middle of Turbo Pascal up until now everything I've worked on has at the core been an object-oriented language. A lot of evolution happened there that has carried forward. In Delphi, we did a bunch of work on a more component-oriented programming model, with properties and events and so forth.

That carried forward into the work that I've done with C#, and certainly that's recognizable. I try to always keep a finger on the pulse of the community and try to be there with the relevant new. Well, Turbo Pascal was the innovative development environment, and Delphi was the visual programming--RAD. C# and .NET have all been about managed execution environments, type safety, and so forth. You learn from all of the stuff that's around you, be it in your ecosystem or competitive ecosystems. You really try to distill what is good about those, and what didn't work for them. In this business, we all stand on the shoulders of giants. It's fascinating actually how slowly programming languages evolve when you compare it to the evolution that we've seen in hardware. It is astounding.

Since Smalltalk-80, we've had between 15 and 20 generations of hardware!

Anders: One every 18 months practically, and yet, there's not really a massive difference between the programming languages we use today and those that were conceived, say, 30 years ago.

They're still arguing over old concepts such as higher-order functions in Java. That's probably going to be a 10-year debate.

Anders: Which is unfortunate, because I think they could move a bit faster on that one. I don't think there's really a question of whether it's valuable. It's more a question of whether there's too much process and overhead in the Java community to get it done.

If going to a continuation passing style and exposing "call with current continuation" at the language level gives you a huge advantage, would you do that, even if only 10% of programmers might ever understand it?

Anders: If, yes--but that's a big if. I don't think that that's the case, but look at what we did with LINQ. I truly believe that that will benefit the vast majority of our C# programmers. The ability to write more declarative styles of queries and have a uniformly applicable query language across different domains of data, it's super valuable. It's like the Holy Grail of language and database integration in some ways. We may not have solved the entire problem there, but I think we made sufficient progress that it justifies the extra learning, and there are ways you can expose that to people without having them figure out the lambda calculus from first principles.

I think it's a great example of a practical application of functional programming. You can happily use it and never even know that you're doing functional programming, or that there are functional programming principles powering it underneath. I'm very happy with where we ended up on that one.

You used the word "practical." How do you decide which features to add and which features to exclude? What are your criteria for deciding what to add and what to keep out?

Anders: I don't know. Over time, you get a knack for telling whether this is going to benefit enough of your users to merit the conceptual baggage that it creates, right? Trust me, we see lots of interesting proposals from our user base of, "Oh, if we could only do this," or "I'd love to do that," but often it's too narrowly focused on solving one particular problem and adds little value as an abstract concept.

Certainly the best languages are designed by small groups of people, or single individuals.

Is there a difference between language design and library design?

Anders: Very much so. The APIs are obviously much more domain-specific than languages, and languages really are a level of abstraction above APIs if you will. Languages put in place the framework, the quarks and the atoms and the molecules, if you will, of API design. They dictate how you put together the APIs but not what the APIs do.

In that sense, I think there's a big difference. This actually gets me back to what I wanted to talk about before. Whenever we look at adding a new feature to the language, I always try to make it applicable in more than one domain. The hallmark of a good language feature is that you can use it in more than just one way.

Again, I'll use LINQ as an example here. If you break down the work we did with LINQ, it's actually about six or seven language features like extension methods and lambdas and type inference and so forth. You can then put them together and create a new kind of API. In particular, you can create these query engines implemented as APIs if you will, but the language features themselves are quite useful for all sorts of other things. People are using extension methods for all sorts of other interesting stuff. Local variable type inference is a very nice feature to have, and so forth.
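As a loose illustration of how small general-purpose features compose into a query-style API, here is a minimal sketch in Java rather than C#. It uses Java's Stream API as an analog and is not LINQ itself. Java has no extension methods, so only lambdas, method references, and local variable type inference (`var`) are shown. The class name `QuerySketch`, the method name, and the sample data are hypothetical.

```java
import java.util.List;
import java.util.stream.Collectors;

// Illustrative only: a Java analog of a declarative query, not LINQ.
class QuerySketch {
    // Hypothetical sample data.
    static final List<String> LANGUAGES =
        List.of("Pascal", "Modula", "Oberon", "Delphi", "C#");

    // Keeps names longer than two characters, upper-cases them,
    // and returns them in natural sorted order.
    static List<String> longNamesUpperSorted(List<String> names) {
        // 'var' lets the compiler infer List<String> for the local variable.
        var result = names.stream()
            .filter(name -> name.length() > 2)   // lambda as a predicate
            .map(String::toUpperCase)            // method reference as a projection
            .sorted()
            .collect(Collectors.toList());
        return result;
    }

    public static void main(String[] args) {
        System.out.println(longNamesUpperSorted(LANGUAGES)); // prints "[DELPHI, MODULA, OBERON, PASCAL]"
    }
}
```

The point of the sketch is the one Anders makes: none of the individual features is query-specific, yet together they support a declarative pipeline.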

We could've probably shipped something like LINQ much quicker if we said, "Let's just jam SQL in there or something that is totally SQL Server-specific, and we'll just talk to a database and then we'll have it," but it's not general enough to merit existence in a general-purpose programming language. You very quickly then become a domain-specific programming language, and you live and die by that domain.

You turn your nice 3GL into a 4GL, which is a general-purpose death.

Anders: Yeah. I'm very cognizant of that. Now one of the big things we're looking at is concurrency. Everybody's looking at concurrency because they have to. It's not a question of want to; it's a question of have to. Again, in the concurrency domain we could have the language dictate a particular model for concurrency--but it would be the wrong thing to do. We have to step above it and find what are the capabilities that are lacking in the language that would enable people to implement great libraries for concurrency and great programming models for concurrency. We somehow need treatment in the language to give us better state isolation. We need function purity. We need immutability as core concepts. If you can add those as core concepts, then we can leave it to the OS and framework designers to experiment with different models of concurrency because lo and behold, they all need these things. Then we don't have to guess at who will be the winner. Rather we can coast by when one blows up and it turns out that the other one was more successful.

We're still relevant.
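To make the connection between immutability, function purity, and concurrency concrete, here is a minimal Java sketch. It is an illustration of the general idea under stated assumptions, not a description of any C# feature or of a model Anders endorses. The names `PuritySketch`, `Point`, `manhattan`, and `totalDistance` are hypothetical. Because the data cannot change and the function touches no shared state, the same computation can run sequentially or in parallel with the same result, and the choice of execution model is left to the library.

```java
import java.util.List;
import java.util.stream.Stream;

// Illustrative only: names are hypothetical.
class PuritySketch {
    // Immutable value type: fields are final and there are no setters.
    static final class Point {
        final int x;
        final int y;
        Point(int x, int y) { this.x = x; this.y = y; }
    }

    // Pure function: the result depends only on the argument,
    // and no shared state is read or written.
    static int manhattan(Point p) {
        return Math.abs(p.x) + Math.abs(p.y);
    }

    // Sums the distances either sequentially or in parallel.
    // The caller picks the execution strategy; the logic is unchanged.
    static int totalDistance(List<Point> points, boolean parallel) {
        Stream<Point> stream = parallel ? points.parallelStream() : points.stream();
        return stream.mapToInt(PuritySketch::manhattan).sum();
    }

    public static void main(String[] args) {
        List<Point> points = List.of(new Point(1, -2), new Point(-3, 4), new Point(0, 5));
        System.out.println(totalDistance(points, false)); // prints "15"
        System.out.println(totalDistance(points, true));  // prints "15"
    }
}
```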

It sounds like you want to give people tools to build great things, rather than dictating the kinds of things they're going to build.

Anders: I want to. You get much better leverage of community innovation that way.

Where do you see that in the C# community? Do people bring code to you? Do you go visit customers? Do you have your MVPs trolling newsgroups and user groups?

Anders: It's a mixture of all of the above plus some more. We have code-sharing things like Codeplex. There are all sorts of communities. There are commercial communities. There's open source. There's lots of open source .NET code. It's from all over. I don't think there is a single point of influx, so to speak. It's a varied and complex ecosystem out there.

You always run across stuff where you go, "Wow, how did they come up with this?" or "That's amazing." You can appreciate how much work this was for someone to do. It might not be commercially viable, but boy, it's a beautiful piece of work.

I certainly try to follow lots of blogs that are relevant to C# and LINQ.

Those are some of my favorite keywords when I go blog trolling, just to see what's happening out there. It gives you good insight into whether people are picking up on the work that you've done in the right way or not. It teaches you something for the future.