15-312 Principles of Programming Languages

Why Study Programming Languages?

This is a course on the principles of programming languages. Why
study these principles? Because they are fundamental to the
design, implementation, and application of
programming languages.

Programming language design is often regarded as largely, or even
entirely, a matter of opinion, with few, if any, organizing principles, and no
generally accepted facts. Dozens of languages are in everyday use in research
laboratories and in industry, each with its adherents and detractors. The
relative merits of languages are debated endlessly, but always, it seems, with
an inconclusive outcome. Some would even suggest that all languages are
equivalent, the only difference being a matter of personal taste. Yet it is
obvious that programming languages do matter!

Yet can we really say that Java is ``better'' (or ``worse'') than C++? Is
Scheme ``better'' than Lisp? Is ML ``better'' than either of them? Can we
hope to give substance to any of these questions? Or should we simply
reserve them for late night bull sessions over a glass of beer? While
there is certainly an irreducible subjective element in programming
language design, there is also a rigorous scientific theory of
programming languages that provides a framework for posing, and sometimes
answering, such questions. To be sure there are good questions for which
current theory offers no solutions, but surprisingly many issues are
amenable to a rigorous analysis, providing definite answers to many
questions. Programming language theory liberates us from the tar pit of
personal opinion, and elevates us to the level of respectable scientific
discourse.

Programming language theory is fundamental to the implementation
of programming languages, as well as their design. While compiler writers
have long drawn on the theory of grammars for parsing and on graph theory
for register allocation, the methods used to compile well-known languages
such as C do not rely on deep results from programming language theory.
For relatively simple languages, relatively simple compilation methods
suffice. But as languages become more sophisticated, so must more
sophisticated methods be employed to compile them.

For example, some programs can be made substantially more efficient if
code generation is deferred until some run-time data is available. A
tight inner loop might be ``unrolled'' into a linear instruction sequence
once the iteration bound is determined. This is one example of partial
evaluation, a technique for program specialization that rests on results
from programming language theory. To take another example, modern
languages such as ML (and proposed extensions of Java) include what are
known as parameterized types to support flexible code
re-use. Parameterized types complicate compilers considerably because
they must account for situations in which the type of a variable or
function argument is not known at compile time. The most effective
methods for handling parameterized types rely on typed intermediate
languages with quite sophisticated type systems. Here again programming
language theory provides the foundation for building such compilers.

Programming language theory has many applications to programming
practice. For example, ``little languages'' arise frequently in software
systems --- command languages, scripting languages, configuration files,
mark-up languages, and so on. All too often the basic principles of
programming languages are neglected in their design, with all too
familiar results. After all, the argument goes, these are ``just''
scripting languages, or ``just'' mark-up languages, why bother too much
about them? One reason is that what starts out as ``just'' an ad
hoc little language often grows into much more than that, to the
point that it is, or ought to be, a fully-fledged language in its own
right. Programming language theory can serve as a guide to the design
and implementation of special purpose, as well as general purpose,
languages.

Another application of the theory of programming languages is to provide
a rigorous foundation for software engineering. Formal methods for
software engineering are grounded in the theory of specification and
verification. A specification is a logical formula describing the
intended behavior of a program. There are all kinds of specifications,
ranging from simple typing conditions (``the result is a floating point
number between 0 and 1'') to complex invariants governing shared
variables in a concurrent program. Verification is the process of
checking that the implementation indeed satisfies the specification. Much
work has gone into the development of tools for specifying and verifying
programs. Programming language theory makes precise the connection
between the code and its specification, and provides the basis for
constructing tools for program analysis.

The theory of programming languages provides a ``reality check'' on
programming methodology, that part of software engineering concerned with
the codification of successful approaches to software development. For
example, the merits of object-oriented programming for software
development are well known and widely touted. Object-oriented methodology
relies heavily the notions of subtyping and inheritance. In many accounts
these two notions are confused, or even conflated into one concept,
apparently because both are concerned with the idea of one class being an
enrichment of another. But careful analysis reveals that the two concepts
are, and must be, distinct: confusing them leads to programs that violate
abstraction boundaries or even incur run-time faults.

The purpose of the this course is to introduce the basic
principles, methods, and results of programming languages to undergraduate students who
have completed the introductory sequence in computer science at Carnegie Mellon.
I intend for students to develop an appreciation for the
benefits (and limitations) of the rigorous analysis of programming
concepts.