40 years of C

Via ronebofh I was reminded of the 40th birthday of Unix. But that means it's also the 40th birthday, more or less, of the C programming language.

C's neither the oldest programming language still in use, nor necessarily even the one used most often. But it does occupy a dominant spot as the language of choice for systems programming. "Systems" is admittedly not a bright-line distinction, but if you're writing an operating system, embedded firmware, a networking device, or a database, most likely C is going to be used.

There are some exceptions. Forth still has its adherents in the embedded space, and there may still be people churning out assembly for microcontrollers (although stripped-down C goes a long way now). A lot of modern NoSQL systems are written in Java or Erlang. Microsoft used C++ for a lot of Windows (and then C# once they'd built that), and Objective-C has had a resurgence thanks to Apple, although the kernels of both are still C. So what makes C so ubiquitous? Why does the industry still depend on a 40-year-old set of programming conventions? Some theories:

C got it fundamentally right and no improvement is possible. A choir of angels sang, divine inspiration struck, and K&R wrote down the received gospel. They don't call programming language wars "religious" for nothing. Implicit here is the idea that we really haven't learned anything in 40 years that would make us reconsider the basic assumptions of C.

C operates at the "thin layer over assembly" necessary for systems programming. If you're writing low-level crud like device drivers or interrupt handlers, you don't want a lot of runtime infrastructure getting in the way. You also want to be able to understand and control how the higher-level constructs get expressed as machine language. There is not a fundamental advancement of any sort to be had, so any benefit from a new language would be marginal. C is a power tool, and how much power tool evolution have you seen recently?
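To make the "thin layer" point concrete, here's a sketch of the kind of register-poking a device driver does. The register name, bit assignments, and the stand-in variable are all made up; in a real driver `CTRL_REG` would alias a memory-mapped address fixed by the hardware. The point is that each `|=` and `&=` maps predictably onto a load, a bitwise op, and a store, with `volatile` telling the compiler not to optimize the accesses away.

```c
#include <stdint.h>

/* A file-scope variable stands in for a memory-mapped device register
 * so this sketch runs anywhere; the address and bit layout are
 * hypothetical. */
static uint32_t fake_register;

#define CTRL_REG    (*(volatile uint32_t *)&fake_register)
#define CTRL_ENABLE (1u << 0)   /* hypothetical bit assignments */
#define CTRL_RESET  (1u << 3)

uint32_t device_enable(void)
{
    CTRL_REG = 0;                         /* one store */
    CTRL_REG |= CTRL_ENABLE | CTRL_RESET; /* load, or, store */
    CTRL_REG &= ~CTRL_RESET;              /* load, and-not, store */
    return CTRL_REG;                      /* only ENABLE remains set */
}
```

The `volatile` qualifier is doing real work here: without it, a compiler could legally fold the three writes into one, which is fatal when each write has a hardware side effect.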

C permits attention to data placement and movement in a way that successor languages do not. Modern CPUs try very hard to present the illusion that all accesses are equally fast, all code paths execute equally well, and data placement doesn't really matter. This is bullshit, because in order to get the most efficiency out of a CPU you need to be cache-aware, and even think a bit about instruction-level parallelism (ILP).
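The classic cache-awareness example: summing the same 2D array in row-major versus column-major order. Both functions compute the same value (the sizes here are arbitrary), but the row-major walk touches memory contiguously and uses every byte of each fetched cache line, while the column-major walk strides through memory and can miss on nearly every access for a large enough matrix. C makes this distinction visible; many higher-level languages hide the layout entirely.

```c
#include <stddef.h>

#define ROWS 256
#define COLS 256

static int grid[ROWS][COLS];   /* zero-initialized at file scope */

/* Walk memory in the order it is laid out: cache-friendly. */
long sum_row_major(void)
{
    long s = 0;
    for (size_t i = 0; i < ROWS; i++)
        for (size_t j = 0; j < COLS; j++)
            s += grid[i][j];
    return s;
}

/* Same sum, but each access jumps COLS * sizeof(int) bytes ahead:
 * each cache line fetched contributes only one element before the
 * next fetch. */
long sum_col_major(void)
{
    long s = 0;
    for (size_t j = 0; j < COLS; j++)
        for (size_t i = 0; i < ROWS; i++)
            s += grid[i][j];
    return s;
}
```

Timing the two on a matrix that overflows the last-level cache makes the illusion of uniform access cost easy to puncture.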

C is like QWERTY. Everybody acknowledges that it might be possible to do better, but the cost of changing is too large. It's good enough to work with and every generation's fingers get trained in the same way.

C is like English. Everywhere you go, you're likely to find somebody who speaks it. It readily adopts words from other languages. There's no need to learn Esperanto (Java/C#/Python) to communicate, because there's already a language out there that works. You'll probably have to talk to C at some point anyway. Plus, who wants to translate the large body of existing literature?

Using C is like buying EMC (or, in a previous decade, IBM). Nobody ever got fired for sticking with the standard. You know that the C infrastructure will still be there in a decade, and you can continue to hire people who know it.

Language evolution has switched to application programming. Systems programming is getting less important. People can build large-scale, reliable systems in other programming languages, so there's no need to innovate in the "systems programming" area. Besides, nobody is really writing a new operating system any more. Guido, Larry, and Rasmus had a lot more influence by helping people write web applications instead.

C is symbiotic with Unix, and now we run Unix everywhere.

Modern processors are optimized to run C. When benchmarking CPU designs, the programs that get used are mainly C programs. Thus any competing language starts out at a disadvantage if its instruction mix turns out to be dissimilar.

Programming language researchers aren't interested in real-world problems. Jonathan Shapiro says: "By the time I left the PL community in 1990, respect for engineering and pragmatics was fast fading, and today it is all but gone. The concrete syntax of Standard ML and Haskell are every bit as bad as C++. It is a curious measure of the programming language community that nobody cares. In our pursuit of type theory and semantics, we seem to have forgotten pragmatics and usability." There might be something to this: how many academic languages have taken off, vs. "scripting" languages invented to get a particular job done?

C has evolved and will continue to do so. Modern C is to K&R as Wodehouse is to Shakespeare. Yes, English-speakers can read both, but pretending that the language is the same misses the point. Because C does adapt, no competitor can move in to fill its niche.
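As a small illustration of that evolution, here's a function using a few things C89 and C99 added that K&R C lacked: prototyped definitions, designated initializers, and compound literals. The struct and values are invented for the example; the point is only that idiomatic C today reads quite differently from 1978's.

```c
/* A prototype-style definition; K&R C declared parameter types on
 * separate lines after the parameter list. */
struct point { int x, y; };

int manhattan(struct point p)
{
    int ax = p.x < 0 ? -p.x : p.x;
    int ay = p.y < 0 ? -p.y : p.y;
    return ax + ay;
}

int demo(void)
{
    struct point p = { .x = -3, .y = 4 };   /* designated initializer: C99 */
    /* Compound literal, also C99: an unnamed struct value inline. */
    return manhattan(p) + manhattan((struct point){ 1, 1 });
}
```

None of this compiles as K&R C, yet it's all unremarkable modern C, which is roughly the Wodehouse-vs-Shakespeare gap in miniature.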

Nobody will be abandoning C any time soon, whatever the reason. But I think that there is now much more acceptance that a programmer will learn, and work with, multiple languages than ever before. When I spoke at Gustavus recently, one student asked about the languages Tintri was using: I quickly rattled off C++, Java, Python, C, Bash, and even a few bits of Tcl. A web developer will constantly be switching between JavaScript or Flex and some selection of back-end languages (Java, Ruby, Python, PHP, Perl...).

There is also a broader community that has experience implementing and participating in the design of practical programming languages and language features. But I think attempts to improve system programming can't harness these trends without first understanding the forces that led to C's 40-year run.