Why Programming Languages?

When I present my research work on programming languages, people often ask me "why do you need a new programming language to solve this problem? Why not just implement it as a library?" Or, I get asked "why didn't you implement it as an extension to {some existing language}?" In this essay I try to make explicit some of the goals and motivations behind language design.

Van Cutsem is the author of AmbientTalk, which was discussed obliquely on LtU a few months ago.

Great and very timely as I try to argue for my own new language (YinYang). I would encourage an editor to promote this to the front page. Basically his argument is made in four points, excerpted from the essay:

Language as syntactic abstraction mechanism: to reduce repetitive "boilerplate" code that cannot be abstracted from using another language's built-in abstraction mechanisms.

Language as thought shaper: to induce a paradigm shift in how one should structure software (changing the "path of least resistance").

Language as a simplifier: to boil down an existing paradigm to just its essential parts, often to increase understanding and insight.

Language as law enforcer: to enforce important properties or invariants, possibly to make it easier to infer more useful properties from programs.

These are all important, but to me the second, language as a thought shaper, is the most important. Pragmatic multi-paradigm languages like C++ and Scala fall down in this regard: by giving us a lot of flexibility, they do not try to shape how we think about problems.

Almost none of these points answer "why didn't you implement it as an extension to {some existing language}?" -- sufficient syntax can often be provided by a library, and simplification is more typically a tool for accelerating language research than for designing the languages people actually consume. This isn't a bad thing: the points still hold, and the framework vs. language design distinction is irrelevant from the viewpoint of providing system-wide properties.

Once we get past that, I've been thinking more extrinsically (e.g., sociologically) for the past couple of years: languages enable us to communicate with hardware and other developers to create something that works with an end user. Each of these aspects -- running on real hardware, working with other developers, and making software good for users -- still requires a lot of time and is a moving target. We have a ways to go in improving languages to better enable us to do them (and, from a socio-technical gap perspective, indefinitely).

I think 'language as a law enforcer' would help answer that one. A new language may be a restricted subset of other languages with regard to certain aspects. For AmbientTalk, this applies especially to the concurrency model.

I chose Haskell for my RDP development because I can structurally enforce new 'laws' in Haskell using monads, comonads, arrows, etc. I ended up favoring generalized arrows because I want to control even the few implicit effects Haskell usually allows (divergence, delay, synchronization) and maintain the ability to serialize or integrate the 'functions'.

Re. framework vs. language design: I agree that in general these are very similar activities, especially in the sense that I believe a good framework, like a good language, should exhibit a large degree of "Conceptual Integrity" (to use Fred Brooks's term).

I disagree that the distinction is irrelevant for providing system-wide properties. Quoting from the essay: "To put it bluntly: no amount of library code is going to turn C into a memory-safe language. Just so, no amount of library code is going to turn Java into a thread-safe language." IOW: a library/framework cannot provide a system-wide guarantee that the language itself cannot provide. It can, if the programmer sticks to the framework's rules. But the framework's rules, unlike the language's rules, are not enforced. I believe that's a qualitative difference.

Agreed. A framework is a language, right? The nice thing about PL design is that you can control/direct thought through syntax and semantics.

You can enforce rules in a framework, dynamically at least. It depends on the extensibility of your host language, with meta-programming you can do more. Still, you have to deal with the stigma of your host language: if you build a framework for a language like Scala, programmers will make the unjustified assumption that your framework is compatible with someone else's framework, which is rarely true.
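To make "enforce rules in a framework, dynamically at least" concrete, here is a minimal Python sketch. It is not taken from any framework mentioned in this thread: the `Confined` wrapper, `ConfinementError`, and `try_from_other_thread` are hypothetical names invented for illustration. The rule being enforced is thread confinement: only the thread that created a value may read it.

```python
import threading


class ConfinementError(RuntimeError):
    """Raised when a confined value is touched from the wrong thread."""


class Confined:
    """Hypothetical framework wrapper: only the creating thread may use the value."""

    def __init__(self, value):
        self._value = value
        self._owner = threading.get_ident()  # remember the creating thread

    def get(self):
        # The rule is checked at run time, on every access.
        if threading.get_ident() != self._owner:
            raise ConfinementError("accessed from a non-owning thread")
        return self._value


def try_from_other_thread(confined):
    """Attempt an access from a fresh thread and report what happened."""
    outcome = []

    def worker():
        try:
            confined.get()
            outcome.append("allowed")
        except ConfinementError:
            outcome.append("rejected")

    t = threading.Thread(target=worker)
    t.start()
    t.join()
    return outcome[0]


box = Confined([1, 2, 3])
same_thread_result = box.get()                    # owner thread: permitted
other_thread_result = try_from_other_thread(box)  # other thread: rejected
```

The check only fires when the offending access actually runs, and nothing stops a caller from reaching past the wrapper to `box._value`. That is roughly the gap between a rule the framework checks and a rule the language guarantees.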

You can enforce rules in a framework, dynamically at least. It depends on the extensibility of your host language, with meta-programming you can do more.

Totally; modern languages get us pretty far! Both popular dynamic (e.g., Python) and static (e.g., C#/LINQ) systems let you "import antigravity." Even when their other mechanisms are insufficient, you can still work at the AST level to add a new sublanguage. I don't think this is a short-term phenomenon, though hopefully the mechanisms are :)

no amount of library code is going to turn Java into a thread-safe language

I don't actually believe that (e.g., you can create new threading abstractions and make usage of anything else in that context incorrect, which would not be hard to detect). Most notions of language-level thread safety will prevent high-performance code so we could even accept the performance cost of a library level implementation without scaring away the demographic :)

More importantly, for an optimizing implementation where we need compiler support, I don't see why this can't be achieved through language extension rather than starting from scratch. In the case of somewhat performance oriented code, starting with a new compiler and VM will likely lose you much of the performance benefit despite better threading abstractions.

In some cases, we really do need to start over. However, that seems pretty rare. Perhaps there's an argument that non-embedded little languages have a role that has been overlooked, e.g., data languages, but I'm not sure what piece of the pie that really is. Even in the data language case, while the data format may be independent, the PADS project showed how much of the manipulation can (and probably should) be embedded (as in PADS/ML).

Still, you have to deal with the stigma of your host language: if you build a framework for a language like Scala, programmers will make the unjustified assumption that your framework is compatible with someone else's framework, which is rarely true.

Yep, that's one reason I don't actually like the above AST approaches to DSLs: achieving interop is hard, so the DSLs are often islands unto themselves (e.g., see SEJITS for HPC in Python). However, the alternative of starting completely over will guarantee no interop at all; with language extensions and libraries, you still have a chance. Erik Meijer's Used Language Salesman essay notes that interop is one of those PL research problems that prevents adoption and is inherent to the little language design model. We brush away the cruft to focus on the core problem... but, typically, when someone else brings back the cruft for a 'real' language, we find that real problems were brushed away as well.

Most notions of language-level thread safety will prevent high-performance code

I'm not sure how we count 'most notions', but to the extent that performance is utilization * efficiency * scalability, there are many safe language-level concurrency models that aren't at all incompatible with high performance code, and may even support performance.

The point you're making has been discussed extensively on LtU before. It's a question that is raised very often. Should we design minimalistic languages that allow us to build the necessary abstractions on top of a small kernel, or a more "complete" language which promotes a more unified way of doing things? Should we aim for a "complete" standard library also? Which part of the semantics should be handled by the language definition, and which part by the libraries that surround it? Is it good or bad to have multiple competing third-party "frameworks" that extend the standard library in specific, possibly specialized ways?

(I'm sorry, but I didn't find more-than-mildly relevant references to point to when searching the LtU archives.)

What I would like to point out is that it is very dangerous to think that, by being at the "language" level, voilà, you're safe, you have "system-wide" something that everyone can rely on, perfect uniformity (at the cost of loss of flexibility): no matter what you do, "system-wide" will not be wide enough, and there *will* be points of friction between different concepts that do not communicate easily.

That's true when going "higher" or "lower" level. When considering programming languages in the large, there will always be specialized domains where your language definition neither imposes nor proposes a specialized way of doing things. People will use whatever abstraction flexibility you have to design specialized frameworks, "domain-specific languages", whatever, on top of your language, and at this level there will be rule incompatibility issues. Going down, your language will have to communicate with other languages (running on the same runtime or not), the operating system, and generally other environments that will have different ways of doing things, and users will want to communicate with those "outside the system" systems (e.g., Scala designers jumping through hoops to accommodate Java interoperability).

So in both cases, even when designing a language, you can't hide and think that you are in an isolated world where only your semantics matter (vs. outside systems, or specialized, refined semantic domains inside your language). When designing a language, you have to care about the cases where "system-wide guarantees" break or are absent, just as you do when you design a framework.

One reason I favor 'language as a law enforcer': languages can be designed to support certain optimizations, but achieving this requires both that certain invariants hold and that the optimizer be designed to recognize and leverage them. The latter is not feasible for a library or framework in most mainstream languages, though others (e.g. Haskell with GADTs) do support it.

Interop - integrating effectively with sensors, actuators, and software systems - is a critical feature for useful languages. Rather than language extension (in order to leverage FFI or whatever), I would favor capabilities. We can robustly interop and adapt many languages and systems via capabilities, without tying ourselves to any particular 'host' language. It will be easier to protect invariants on both sides by explicitly modeling the membrane between languages.

Could one have a compiler-defined standard format for known desired invariants, so that libraries could put them in the 'javadoc' as appropriate, and the compiler would take them as given and apply the relevant optimizations? (For example, nothing in Haskell proves your monad abides by the required laws.)
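A rough Python sketch of what that question seems to be asking for. Everything here is hypothetical and invented for illustration: the `declares` annotation, the `DECLARED_LAWS` registry, and the toy `optimize` pass are not part of any real compiler or library. A library author declares a law (here, idempotence) without proving it, and a separate optimizer takes the declaration on trust.

```python
# Registry of laws declared by library authors; nothing here proves them.
DECLARED_LAWS = {}


def declares(*laws):
    """Hypothetical annotation: record unproven laws for a function."""
    def mark(fn):
        DECLARED_LAWS[fn] = frozenset(laws)
        return fn
    return mark


@declares("idempotent")
def normalize(text):
    # Collapse runs of whitespace and lowercase; applying it twice changes nothing.
    return " ".join(text.split()).lower()


def shout(text):
    # No declared laws, so the optimizer must leave repeated calls alone.
    return text + "!"


def optimize(pipeline):
    """Drop an immediately repeated call when the function is declared idempotent."""
    result = []
    for fn in pipeline:
        repeated = bool(result) and result[-1] is fn
        if repeated and "idempotent" in DECLARED_LAWS.get(fn, frozenset()):
            continue  # f(f(x)) == f(x) is taken on trust, not checked
        result.append(fn)
    return result


def run(pipeline, value):
    """Apply each function in the pipeline in order."""
    for fn in pipeline:
        value = fn(value)
    return value


original = [normalize, normalize, shout, shout]
optimized = optimize(original)  # one normalize dropped, both shouts kept
```

If `normalize` were wrongly declared idempotent, `optimize` would silently change program behaviour, which is the risk of taking invariants as given.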

Using 'hot comments' is a particularly brutish approach to annotating code for optimizations.

In Haskell, you would achieve a similar feature using static rewrite rules, possibly combined with identity functions for discretionary optimization. Static rewrite rules are applied only if types are maintained, but the compiler does not bother proving equivalence.

Unfortunately, annotations are inflexible. They tend to couple optimization with the code rather than its context. Optimizations utilizing idempotence or commutativity, or that trade space for speed, depend very heavily on context. If you do annotate optimizations, you'll eventually want to abstract the optimization strategies from the main body of code.

Developing a monad in Haskell is not analogous to developing a framework in Java. The designer of a Haskell monad can constrain the available 'monadic' vocabulary, and thus force users to work with a subset of the Haskell language. The designer of a Java framework doesn't have that freedom.

Suppose the goal is to make the code more literate, self documenting, or reusable by various automated transformational tools. I see those purposes as distinct from the main thrust of the other categories.

Perhaps we can say there is a difference but a strong relationship between language design and the design of tooling that supports the language. For example, consider literate programming: it's Pascal (the language) plus a special format to accommodate a tool that can extract documentation. Language design can support tools that operate on code directly, such as static analysis.

If the tool comes after the language is designed, as is often the case, then obviously the tool didn't inform the language design. Co-design of tooling and language is something that I think is becoming more popular; e.g., IDE design is becoming more intertwined with language design.

Good point. It's also tightly related to the other categories: the Smalltalk and Self IDEs were definitely instrumental in keeping those languages simple. And in terms of "language as syntactic abstraction mechanism": you can think of interactive IDEs/tools as extending a language's syntax beyond the mere textual domain.

Aspect-oriented programming is an example of IDE-like functionality that is considered part of some languages. So I think rather than saying this is just about tool support, it's more powerful to think of this notion as being about the existence and implementability of convenient algorithms for iterating over and transforming "interesting chunks" of source code in interesting ways, explicitly or implicitly. Syntactic, semantic, and pragmatic features of a language are all potentially relevant to determining the character of the possible transformations and how they can be controlled and specified.