Programming Language Evolution

Apr 9, 2003

What if human languages didn’t evolve as they do, but instead, all
language change was centralized and controlled by expert linguists?
Would we be able to communicate with even half the level of
expressiveness and creativity that we do now? I don’t know of any
natural languages whose complete vocabularies and rule-sets have been
successfully codified and documented. Sure, we have dictionaries and
grammar books, but ultimately these serve only as a starting point from
which experience and learning have to fill in the gaps. So far, natural
language has proven to be too expressive and complex to, for example,
allow us to create computer programs capable of parsing and
understanding it.

I started thinking a lot about this after reading Steven Pinker’s The
Language Instinct, followed by Words and Rules. While in Barcelona, I
was lucky enough to discover and purchase a copy of Jean Aitchison’s
excellent exploration of language evolution, Language Change: Progress
or Decay? Her thesis is that language change represents both progress
and decay, and that it’s a natural process which should be no more
questionable than the evolution of a species.

Some of the major points from LCP:oD -

Language change is, like most natural phenomena, decentralized in
nature. Attempts, such as those made by the French government, to
“command and control” language evolution have proven to be futile.

Change usually creeps in as an exploitation of weakness in a language.
Examples are certain combinations of sounds in varieties of the English
language which have combined into one (jump-ed -> jump’d),
morphed (spoilt -> spoiled), or disappeared completely over
time (Cockney’s bottom -> bo’om).

Most language changes beget compression and increased efficiency for the
speaker.

Once a change is introduced, its spread is often influenced by social
factors. For example, a manner of speech which suggests a higher social
status will spread quickly in middle and lower class communities.

While language change is natural and inevitable, it is sometimes
socially undesirable. Excessive change may be inconvenient if it
linguistically divides two or more communities that were once able to
communicate with each other.

What if computer programming languages could evolve, and progress
in programming language technology wasn’t centralized and
controlled by expert computer scientists? How much more expressively and
efficiently could we communicate with computers—and with the programmers
who have to read and modify our code?

There is, of course, the matter of practical reality getting in the way
of something like this immediately happening. The biggest challenge is
that we have to communicate both with humans and with computers.
If computers were smart enough to handle an ever-evolving language, we
probably wouldn’t still have a field of research called Computational
Linguistics.

How could you make evolutionary programming language “design” work? The
easiest answer is to say that it already does…to some extent. Bjarne
Stroustrup once said that:
"Library design is language design"
By this, he meant that APIs are like little languages that represent a
specific domain. Each has its own “manner of speech”, so to speak. If
this is true, then you could easily see how the same compression,
efficiency and social status drivers that affect language change could
impact the way APIs are designed or even how programming styles differ
from programmer to programmer.

You can be sure that, when designing Ruby, Matz never directly intended
for his language to lead to the following manner of programming:
a='p=0;t=""*30000;';$<.each_byte{|c|a<<({?>=>"p+=1;",?<=>"p-=1;",?+=> "t[p]+=1;
",?-=>"t[p]-=1;",?[=>"while t[p]>0;",?]=>"end;",?.=>"putc t[p];
", ?,=>"t[p]=STDIN.getc||0;"}||"")};eval a
And, there’s definitely a big difference between this code snippet and
the following example of how to create a sidebar in the (also
Ruby-based) software that powers this website:
LinkHolder.new("Some Links",
"http://www.rubycentral.org" => "Ruby Central",
"http://www.chadfowler.com" => "Chad Fowler",
"http://languagehat.blogspot.com" => "Artima"
)
They’re both Ruby, but the similarities stop there. I’d go so far as to
say that the first program (which happens to be an interpreter for an
obfuscated programming language) is no more immediately
understandable to the average Ruby programmer than Spanish is to the
average English speaker. Though you might not go so far as to say these
are two different languages, it wouldn’t be too far off base to
think of them as two different dialects. It would be interesting
to know how much API design and dominant programming style affect the
designers of languages and ultimately, though indirectly, set the future
direction of programming language design. Maybe there is a touch of
evolution indirectly guiding our language designs.

As I said, the “Library design is language design” argument gives us an
easy answer. We see in its evolution the same kind of subtle innovation
that we see in spoken language. But, the world of computing is rarely
characterized as being a breeding ground for subtle innovation.

In a recent interview, Alan Kay, the inventor of Smalltalk, said that
“People today aren’t doing a lot of work to move programming to its
next phase.” He’s disappointed to see that innovations he and his team
made in the early 70s still seem “modern” today. Similarly, at MIT’s
Lightweight Languages Workshop 2002, Microsoft’s Todd Proebsting argued
that the industry needs to start looking at “Disruptive Programming
Language Technologies” (read: non-linear innovation) and stop focusing
on optimization (beating) of the same old paradigm (near-dead horse).

So, maybe there’s something to be said for changing the way we go about
designing programming languages.

Having said all this, I still don’t have much of an idea of how to
overcome the practical problems of decentralized programming language
evolution. But, here are a few thoughts for starters:

Learn from our syntax errors. What are the most common mistakes
that programmers make? Why? In some cases, it might be carelessness, but
if you could capture all of the mistakes made in the world of, for
example, Java programming, there might be some patterns that show us a
“weakness” that could be exploited. If you think about it, programming
errors are an important source of feedback to the language designer, and
that feedback is currently being lost by the ton each day. How different might Java
look if Sun could just see how often I screw up my Java programs because
of all of those damned parentheses I generate having to type-cast
everything?

Bring language design to the masses in the same way that the
Quake engine brought 3D game design to thousands of college freshmen
with too much free time. Surely there has to be a more user-friendly
tool for language implementation than Bison or Antlr. Maybe we’ll look
back 15 years from now and laugh about how hard it used to be to make a
new language in the same way that we look back now and laugh about how
hard it used to be to do what Excel makes simple today.

Actually build an adaptive interpreter or compiler. By this, I
mean an interpreter that will learn from programmers’ mistakes and adapt
to nuances in programmers’ styles. Value rigidity mixed with a healthy
respect for reality makes this one pretty tough to swallow, but as
Proebsting says in his “Disruptive
Programming Language Technologies” talk (not in so many words), the
disruptive technologies of the future are the ones that seem ridiculous
to us now.
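
To make the first of those ideas a little more concrete, here is a rough
Ruby sketch of what "learning from our syntax errors" might start with:
tallying compiler complaints to see which mistakes dominate. Everything
in it is an assumption made up for illustration. The sample messages,
the "file:line: message" format, and the names ERRORS, error_kind and
tally_errors are all hypothetical, and real compiler output would need
a more careful parser.

```ruby
# Hypothetical sample of compiler error messages, invented for illustration.
ERRORS = [
  "Foo.java:12: ')' expected",
  "Foo.java:30: ';' expected",
  "Bar.java:7: ')' expected",
  "Bar.java:19: incompatible types",
  "Baz.java:3: ')' expected"
]

# Strip the assumed "file:line: " prefix so identical mistakes group together.
def error_kind(message)
  message.sub(/\A[^:]+:\d+:\s*/, "")
end

# Count how often each kind of mistake occurs, most frequent first
# (ties are broken alphabetically by the message text).
def tally_errors(messages)
  counts = Hash.new(0)
  messages.each { |m| counts[error_kind(m)] += 1 }
  counts.sort_by { |kind, n| [-n, kind] }
end

tally_errors(ERRORS).each { |kind, n| puts "#{n}  #{kind}" }
```

With this made-up sample, the missing-parenthesis complaint comes out on
top, which is exactly the sort of "weakness" a language designer might
want to hear about.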

It would take someone a lot smarter and more patient than me to actually
do anything with these ideas. But, (as Jean Aitchison puts it) “social
undesirability” aside, the idea of programming in Pidgin Java or
SmallPerl# Creole appeals to me in an exciting and indescribable way.

(Off Topic: For those who’ve read it, am I the only one who has been
paranoid about everything I write since reading Glenn Vanderburg’s post
about abstract nouns?)