The Semicolon Wars

I mock the pettiness of these squabbles—and I believe some of them deserve mocking—and yet I don't want to give the impression that only cosmetic issues are in dispute, or that programming languages are really all alike under the skin. On the contrary, what's most fascinating about programming languages is how dramatically they differ. I would argue that the distance between C and Lisp, for example, is greater than that between any pair of human languages.

Noam Chomsky asserts that all human languages have the same "deep structure," which may even be hard-wired into the brain. In computer languages, too, certain features seem to be universal. Almost all programming languages are built on the same kind of grammatical scaffold, called a context-free grammar. At the semantic level, almost all programming languages have the same computational power: If you can compute something in one language, you can get the same answer in any other, given enough effort. But this formal equivalence is misleading. Raw computational power is not what people care about in a programming language; the real criterion is how readily you can express your ideas.

In the 1930s the linguists Edward Sapir and Benjamin Lee Whorf argued that what you can think is conditioned by what language you think in. For natural languages, the Sapir-Whorf hypothesis has met with much skepticism, but for computer languages the idea seems more plausible. Different categories of programming languages elicit quite different modes of thinking and problem solving.

Programming languages are usually classified in four families. Imperative languages are built on commands: do this, do that, do the next thing. The commands act on stored data, modifying the overall state of the system. The imperative approach was the default in most early programming languages, including Fortran, COBOL and Algol.
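As a rough illustration of the imperative style, here is a minimal sketch in Python, chosen only for readability. The function name sum_of_squares_imperative is invented for this example. Each line is a command, and each pass through the loop modifies a stored value.

```python
def sum_of_squares_imperative(numbers):
    """Add up the squares of the inputs by repeatedly updating a variable."""
    total = 0                  # stored data: the state the commands act on
    for n in numbers:          # do this, then do the next thing
        total = total + n * n  # each command modifies the state
    return total
```

The answer emerges only as the end result of a sequence of state changes; to follow the program, the reader has to track the value of total step by step.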

A functional language is modeled on the idea of a mathematical function, such as f(x) = x². The function is a black box that accepts arguments as input and returns values as output. A key point is that the calculation depends only on the arguments and affects only the value; there are no extraneous side effects. This property makes it easier to reason about functional programs, since there's no need to keep track of the state of the entire machine. Functional programming began with Lisp, although most versions of Lisp allow other styles of programming as well. John Backus, the lead developer of Fortran and a contributor to Algol, later became an advocate of functional languages. Several "pure" functional languages have emerged since then, including ML, Miranda and Haskell.
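The same idea can be sketched in a few lines of Python, used here only as a convenient notation; Python is not a pure functional language, and the name sum_of_squares is an invention for this example. The function f is the one from the text, and nothing in either definition reads or modifies anything outside its arguments.

```python
def f(x):
    """The function from the text: the result depends only on the argument x."""
    return x * x


def sum_of_squares(numbers):
    """Build the answer by composing functions; no variable is ever updated."""
    return sum(map(f, numbers))
```

Because there are no side effects, calling f(3) always yields 9, no matter what else the program has done before.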

In object-oriented programming languages the root idea is to bind together imperative commands and the data they act on, forming encapsulated objects. Instead of defining a procedure to manipulate a data structure, one "teaches" the data structure how to carry out operations on itself. Most object-oriented languages also have some notion of inheritance, whereby an object is born already knowing default behaviors. The object-oriented languages trace their heritage back to SIMULA 67, but they began to attract attention only in the 1980s with Smalltalk. In a curious turn of events, object-oriented principles became wildly popular, but the result was not the widespread adoption of Smalltalk; instead, object-oriented features were bolted onto other languages. From C, for example, came C++ and Objective-C and eventually C#; Java is also in this family. Object-oriented notions are now so deeply ingrained that they influence almost every new language.
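A small sketch may make encapsulation and inheritance concrete. It is written in Python for illustration, and the class names Counter and StepCounter are hypothetical. The first class binds a piece of data to the commands that act on it; the second is "born" knowing how to describe itself and changes only the behavior it needs to.

```python
class Counter:
    """A piece of data (count) bundled with the commands that act on it."""

    def __init__(self):
        self.count = 0

    def increment(self):
        self.count += 1
        return self.count

    def describe(self):
        return f"count is {self.count}"


class StepCounter(Counter):
    """Inherits describe() unchanged; overrides only increment()."""

    def __init__(self, step):
        super().__init__()
        self.step = step

    def increment(self):
        self.count += self.step
        return self.count
```

A StepCounter created with a step of 5 answers describe() without ever having defined it, because it inherits that default behavior from Counter.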

The languages of the fourth category are variously known as logic, relational or declarative languages. What they have in common is the idea of programming not by spelling out step-by-step algorithms but by stating facts or relations. The best-known exemplar of this technique is Prolog, which relies on a method called unification to make deductions from stated facts. Related concepts also turn up in less-exotic areas such as database-query languages and spreadsheets.
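The flavor of stating facts and relations can be suggested, loosely, in Python. This sketch is closer to a database query than to Prolog: it does not implement unification or backtracking, and the facts and names in it are invented for the example. The program lists who is a parent of whom and then states what a grandparent is, without spelling out how to search for one.

```python
# Facts: (parent, child) pairs. The people are invented for illustration.
PARENT = {("alice", "bob"), ("bob", "carol"), ("bob", "dave")}


def grandparents():
    """Relation: X is a grandparent of Z if X is a parent of some Y
    and that same Y is a parent of Z."""
    return {
        (x, z)
        for (x, y1) in PARENT
        for (y2, z) in PARENT
        if y1 == y2
    }
```

In a real logic language the rule would be written once and could be queried in either direction (who are Carol's grandparents? who are Alice's grandchildren?); here the set comprehension merely stands in for that deduction.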

These four categories suggest the breadth of the programming-language spectrum, but there are further variations across many other dimensions. At the most superficial level, the various languages simply look different. C is terse, COBOL quite verbose. Lisp is full of parentheses. Perl, said some wag, looks like Snoopy swearing: @&$^^#@!.

Languages can also be distinguished as "low-level" or "high-level." The low-level ones allow more-direct access to aspects of the underlying hardware, such as addresses in memory or input and output devices. High-level languages provide an insulating layer of abstraction.

A generation of languages created in the 1970s emphasized "structured programming"—otherwise known as bondage and discipline. Pascal is in this group: It enforces strict rules about types of data and the flow of control through a program. The reaction against such constraints produced "hacker-friendly" languages, including C.

Languages also differ in their intended audience or area of application. Fortran began as a language for scientific computing, COBOL for business. Quite a few interesting languages were designed for teaching or for children. BASIC, Pascal and Smalltalk are all in this class, and so is Logo. (All of them have had to struggle to be taken seriously as languages for grownups.)