This page is intended to be a sort of clearing house, to list categories of tradeoffs, give summaries thereof, and provide pointers to further information and discussion. ThreadMode should be aggressively reduced to DocumentMode.

Also this is not an advocacy page; there are lots of those.

Analyzability Versus Expressiveness

Is it more important to be able to formally analyze language expressions, or for them to be as expressive as possible?

Purely declarative languages make it easy to formally manipulate and analyze constructs, but are either not TuringEquivalent, allowing only a choice from a predefined set of possibilities (e.g. declarative user interfaces), or when TuringEquivalent, require adapting algorithms to a particular paradigm (e.g. Prolog).

"This trade-off is not a temporary state of affairs, to be solved by some ingenious new approach. It is a deep property of computation models. As a language becomes more expressive, its programs become less amenable to formal manipulation. This is illustrated by [the unsolvability of] the halting problem." (ConceptsTechniquesAndModelsOfComputerProgramming, p681)

HindleyMilnerTypeInference is used in many functional languages such as Haskell precisely because it is sufficiently powerful to be interesting, yet limited enough to be decidable. It is well known that more expressive TypeInference schemes are, in general, not decidable, preventing guaranteed type safety at compile time. Which is more important in your language?

Aficionados of very strongly typed languages like them because there is no possible way to subvert the type system within the language, allowing certain formal guarantees to be made. Proponents of languages that allow an escape to be made from the type system (as in C/C++) or which defer the whole question to runtime (DynamicTyping) prefer the greater degree of expressive power.

Contrary to the claims of fans of particular approaches, there is no absolute right or wrong to these decisions. There are just consequences of the trade-offs, and a judgement of the suitability of those consequences for a particular purpose.

Is the language primarily going to protect programmers from themselves by making certain categories of mistakes impossible (Java, Ada, Pascal, C++, BondageAndDisciplineLanguages), or is it primarily going to be as expressive as possible to maximize the power of the programmer (Lisp, Ocaml, RubyLanguage, PythonLanguage)?

There is a possibility that this trade-off is not necessarily inherent, and that a way to have safety without losing any expressiveness (even in terms of systems programming) may be found. The experimental CycloneLanguage is such an attempt.

Part of the reason that this is the traditional tradeoff is that expressiveness is often valued by self-styled pragmatists, while safety is most often stressed by self-styled purists/ideologues; the latter have frequently said quite explicitly that loss of expressiveness is acceptable, or even desirable, if it leads to greater safety, while the former camp have frequently explicitly expressed the opposite.

This means that attempts to have maximum safety combined with maximum expressiveness have been very much in the minority compared with efforts to simply trade one for the other.

It's also true that it is relatively rare for maximum-safety advocates to address e.g. SystemProgramming needs.

'why is Ocaml so much more fun than Java? Why are "languages designed for smart people" (LFSPs) so much more fun to program in than "languages designed for the masses" (LFMs)?' Mike Vanier on paulgraham.com; http://www.paulgraham.com/vanlfsp.html

'Your development cycle is much faster because Java is interpreted. The compile-link-load-test-crash-debug cycle is obsolete.' -- James Gosling (i.e. Java leads to fewer bugs)

'Java was, as Gosling says in the first Java white paper, designed for average programmers. It's a perfectly legitimate goal to design a language for average programmers. (Or for that matter for small children, like Logo.) But it is also a legitimate, and very different, goal to design a language for good programmers. Languages designed for average programmers have to put safety first. Expert programmers, on the other hand, care only about power, and are going to be annoyed with any language that gets in their way in the name of safety.' -- Paul Graham

Currently these quotes are skewed towards Paul Graham's obvious preference. Highly quotable pro-Java/safety quotes/sound bites would be useful here. (Not an argument, note, just a nice terse quote supporting the safety point of view; there's no absolute right answer here, it's just a tradeoff)

Safety versus Expressiveness is a false dichotomy -- you can have both. Compare ObjectiveCaml with CeePlusPlus: OCaml obtains expressiveness without compromising safety, while C++ obtains it by throwing away safety. The latter is just bad design. -- DavidSarahHopwood

It is not a false dichotomy, you are just blinded by the shining light that is the glittering pure perfection of OCaml, the single exception to this, and all other tradeoffs. Faster than assembler, more OO than Smalltalk, more expressive than English, safer than a mother's kiss -- what's not to like?

A scripting language will primarily run directly from textual source code, so that it can be directly embedded in a web page (Javascript) or in script files (shell, Perl) or "here documents", with no preliminary translation step required.

Non-scripting languages may be either compiled or interpreted (e.g. Lisp), but as an implementation issue rather than a language design issue.

I would state that when selecting (one or more) languages for a new project, language "speed" ranks very low in importance. For most operations, the differences in languages are minor and often masked by processor speed and other design criteria.

Does this apply when designing a language too, i.e. is speed something that a language designer/compiler writer should worry about? Basically, I want to know if I have to worry about really fucking up my language speed-wise through language design decisions?

The research I've done on ContinuationImplementation, KeywordParameterPassing, ImplementingMultipleDispatch seems to indicate that most features, no matter how advanced, have fairly small performance penalties. Continuations can be had for a 5-10% slowdown across the board, keywords are just an array lookup, MultiMethods can be done in quasi-constant time, though the constant is fairly large (an array lookup for each active selector in the method). I'm wondering if this squares with your experience, and if there're any particular "gotchas" when implementing a compiler, things that could make it run multiple factors slower.

But it seems like speed considerations do come into play with a lot of language implementors, mostly concerning the cache. Dylan introduced sealing so that the appropriate method could be selected at compile time, eliminating the run-time dispatch entirely. Dispatch-DAGs for multimethods are preferred because dispatch tables require a lot of non-local memory accesses, potentially blowing the cache. One of the main reasons ParrotCode uses a register architecture is that stack machines have a higher memory load, and hence cause more cache misses.
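To make the dispatch-table idea concrete, here is a minimal sketch of a table-driven multimethod. The MultiMethod class, its cache, and the Animal/Dog/Cat hierarchy are all hypothetical names invented for illustration; the lookup order (leftmost argument takes priority) is an assumption, and real implementations choose their ordering much more carefully.

```python
from itertools import product


class MultiMethod:
    """Hypothetical multimethod: dispatches on the classes of all arguments."""

    def __init__(self, name):
        self.name = name
        self.table = {}  # (class, class, ...) -> implementation
        self.cache = {}  # memoised results of the slow lookup

    def register(self, *types):
        def decorator(func):
            self.table[types] = func
            self.cache.clear()  # a new method may change earlier choices
            return func
        return decorator

    def _resolve(self, arg_types):
        # Slow path: try combinations of each argument's ancestors, with
        # the leftmost argument taking priority (an assumed ordering).
        for candidate in product(*(t.__mro__ for t in arg_types)):
            if candidate in self.table:
                return self.table[candidate]
        raise TypeError(f"no applicable method for {self.name}{arg_types}")

    def __call__(self, *args):
        arg_types = tuple(type(a) for a in args)
        impl = self.cache.get(arg_types)
        if impl is None:
            # Fast path next time: one dictionary lookup per call.
            impl = self.cache[arg_types] = self._resolve(arg_types)
        return impl(*args)


class Animal: pass
class Dog(Animal): pass
class Cat(Animal): pass

meet = MultiMethod("meet")

@meet.register(Animal, Animal)
def meet_generic(a, b):
    return "sniff"

@meet.register(Dog, Cat)
def meet_dog_cat(a, b):
    return "chase"

print(meet(Dog(), Cat()))  # uses the (Dog, Cat) method
print(meet(Cat(), Dog()))  # falls back to (Animal, Animal)
```

After the first call for a given combination of classes, dispatch costs one hash lookup, which is the "quasi-constant time" case. The cache concern raised above does not show up in a sketch like this, since it depends on how the tables are laid out in memory.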

Purely my opinion, but focus on the big picture of programmer productivity. Do not ignore compiler speed, but don't give it undue priority either. Correctness is highly important; a single miscompiled statement will probably waste far more of my time than even the slowest compiler. Make sure compiler error messages are clear; I've wasted countless hours tracking down silly errors that were not clearly reported. Make sure the compiler has a way to escape endless loops; I have a preprocessor from a major vendor that never returns if it is given certain typos. Lastly, define a good benchmark. Don't load the benchmark up with worst cases; make sure it reflects typical use, otherwise you will spend time tweaking unimportant cases.

I remember the days of "start the build and go have lunch". For the size of projects I am currently involved in, compiler speed is not a significant differentiator; it is usually the speed and ease of use of other parts of an IDE that cause me irritation in development. --WayneMack

If one doesn't pay attention to both the algorithmic complexity and the number of machine cycles that a language consumes in its operations, then it will be too slow for some purposes, and a different language will need to be used for certain kinds of things. Efficiency requires close attention. To what extent that matters a lot or not at all depends on the goals. -- DougMerritt

I'm confused. What do you mean by "language speed"? A fast compiler? Or fast applications? I've been told that many "features" of Pascal were designed to make the compiler faster. The obvious issue with slow applications is when an algorithm uses inappropriate data types, because the language blocks us from using better data types (or makes them too much hassle). But a possibly more important issue is helping the programmer trying to find a better implementation -- for example, support for profiling so a programmer can ProfileBeforeOptimizing. Hardware support for single-stepping machine language programs was very important when assembly language was the LanguageOfChoice. -- DavidCary
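As an illustration of the profiling point, here is a small sketch using Python's standard cProfile and pstats modules. The two membership functions and the profile helper are hypothetical names made up for this example; the first deliberately uses an inappropriate data type (a list where a set would do).

```python
import cProfile
import io
import pstats


def slow_membership(n):
    # Deliberately poor data type choice: list membership is linear time.
    seen = []
    for i in range(n):
        if i not in seen:
            seen.append(i)
    return len(seen)


def fast_membership(n):
    # Same logic with a set, where membership tests are hash lookups.
    seen = set()
    for i in range(n):
        if i not in seen:
            seen.add(i)
    return len(seen)


def profile(func, *args):
    """Run func under the profiler; return its result and a text report."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    stats.sort_stats("cumulative").print_stats(5)
    return result, out.getvalue()


if __name__ == "__main__":
    for func in (slow_membership, fast_membership):
        _, report = profile(func, 2000)
        print(report)
```

Comparing the two reports shows where the time goes before anyone starts hand-tuning, which is the point of ProfileBeforeOptimizing.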

Introductory Languages

(copied from ComputerProgrammingForEveryone?, which is otherwise a thread mess)

See also DamianConway's paper Seven Deadly Sins of Introductory Programming Language Design (available from his home page). But note that this paper "focuses exclusively on the issues which arise in the context of teaching introductory programming."

Summary of the seven major "sins":

1. LessIsMore: paradigm purity can get in the way of practical problem solving...

2. More is more: ...but you can also have too much semantics to learn

3. Grammatical traps:

Syntactic synonym - multiple ways to do the same thing (e.g. array[i] == *(array+i) in C)

Syntactic homonym - same syntax has different semantics in different contexts ("static" keyword in C)

5. Backwards compatibility: language elements included for familiarity with prior languages have no benefit for a first-time student (e.g. Java's and C++'s C-like syntax and Scheme's retention of assembly mnemonics "car" and "cdr")

6. Excessive cleverness: obvious to an expert but not to a novice (e.g. C/C++ declaration syntax)

LanguageDesigners? must consider mechanisms and methods which implement the easy re-use of previously written language components by extension or whatever appropriate means. A lot of lip-service has been paid to this issue, but little has been done.

This should also be built into the creation process via a declaration that the module, class, or component is to be designed using the language re-use methodology.

[What exactly are you suggesting? For decades many have said that software re-use is the holy grail. For an equal length of time people have pointed out that various purported re-use technologies/methodologies don't go very far; the most successful such technology is still the ancient and venerable subroutine library, but of course we want more. -- dm]

I am suggesting that, when a component or module is declared "Reusable", the language would enforce constraints and requirements on the nature and composition of that component or module to allow for its reuse, regardless of language, platform or development environment. It is a question of defining a new concept of scope in which the programmed module, class, component or routine is recognizable as "universal".

There are 3 steps to providing a reusable component:

Specifying what it does.

Doing it in a way that other programs can make use of.

Making potential clients aware of it.

All of these have problems in current languages. The problems are not intractable, and could probably be improved upon, but they are non-trivial.

The first step requires a very precise interface specification language. The language must allow specification of type, preconditions, postconditions, and semantics. A potential reuser must be able to easily decide if the component meets his specifications.

These don't necessarily all have to be compiler-checked: in fact, compiler checking of semantics implies that the compiler knows the outcome of every possible set of inputs, which, for a TuringComplete language, would run into the HaltingProblem. A sufficiently expressive type system may unreasonably impede the expressiveness of the language; it's the whole StaticTyping vs. DynamicTyping debate over again. For some languages, a clear documentation standard (perhaps similar to JavaDoc or DocTest) would suffice.
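As a sketch of what such a documentation standard can look like, here is a Python function whose docstring doubles as a checkable specification via the standard doctest module. The clamp function is a hypothetical example, not taken from any particular library.

```python
import doctest


def clamp(value, low, high):
    """Return value limited to the closed interval [low, high].

    The examples below are both documentation and executable checks:

    >>> clamp(5, 0, 10)
    5
    >>> clamp(-3, 0, 10)
    0
    >>> clamp(42, 0, 10)
    10
    """
    return max(low, min(value, high))


if __name__ == "__main__":
    # Re-runs every example in this module's docstrings and reports
    # any whose actual output differs from the documented output.
    doctest.testmod()
```

Nothing here is verified at compile time; the specification is only as good as the examples chosen, which is the trade-off the paragraph above describes.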

Examples of current steps forward might include JavaInterfaces and Eiffel-style DesignByContract. UnitTests are also helpful; they function not only as a verification tool, but as a specification. A potential client can copy the usage of the unit tests.
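For languages without built-in DesignByContract, preconditions and postconditions can be approximated with run-time checks. The following is a rough sketch only: the contract decorator and integer_sqrt are hypothetical names, and this is nowhere near the full Eiffel mechanism (no class invariants, no inheritance of contracts).

```python
import functools
import math


def contract(pre=None, post=None):
    """Attach an optional precondition and postcondition to a function."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # The precondition sees the arguments; the caller is at fault.
            if pre is not None and not pre(*args, **kwargs):
                raise ValueError(f"precondition of {func.__name__} violated")
            result = func(*args, **kwargs)
            # The postcondition sees the result first, then the arguments;
            # a failure here means the implementation is at fault.
            if post is not None and not post(result, *args, **kwargs):
                raise AssertionError(
                    f"postcondition of {func.__name__} violated")
            return result
        return wrapper
    return decorator


@contract(pre=lambda x: x >= 0,
          post=lambda r, x: r * r <= x < (r + 1) * (r + 1))
def integer_sqrt(x):
    return math.isqrt(x)


print(integer_sqrt(17))
```

A potential client can read the two lambdas as the specification: what must hold on entry, and what is promised on exit.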

The second step is largely concerned with ForeignFunctionInterfaces, though there are also issues with SeparateCompilation? and the FragileBinaryInterfaceProblem. Chances are, not all clients will be in the same language; it must be possible to call into the library's language. ApiVsProtocol also comes into play here; protocols tend to be much more flexible than APIs, but most library designers write to an API for performance reasons.

A good FFI example might be Melange from GwydionDylan (if they got the bugs worked out...) It parses the C headers to generate the necessary calling conventions. ParrotCode has a decent FFI, though it requires that you specify function signatures manually. And JavaNativeInterface isn't bad, though it requires wrappers on the C side.
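For comparison, here is what an FFI with manually specified signatures looks like in Python's standard ctypes module. This sketch assumes a POSIX system where the C library can be loaded this way (it will not work as written on Windows); c_strlen is a hypothetical wrapper name.

```python
import ctypes
import ctypes.util

# Assumption: POSIX. If find_library returns None, CDLL(None) falls back
# to the symbols already loaded into the current process.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Nothing parses the C headers for us, so the signature of strlen is
# declared by hand.
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t


def c_strlen(text):
    """Length in bytes of text as seen by C's strlen."""
    return libc.strlen(text.encode("utf-8"))


print(c_strlen("hello"))
```

Getting argtypes or restype wrong is not caught by anything and can crash the process, which is the cost of the manual approach compared with a header-parsing tool.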

Most language designers overlook the social issues of building a language. It's often said that LarryWall created the PerlCommunity?, and the Perl community created the PerlLanguage. A lot of the success of a new language comes from building a community of developers that's willing to invest in writing software for it. There's a SnowballEffect coming from more MindShare? that can often catapult a technically mediocre language into a market leader. Re-use is a large function of that; build a large library and the language is likely to overtake a far superior rival that doesn't have that MindShare?. -- JonathanTang

Thank you Jonathan for your intervention. I was in the process of clarifying the issue and was about to post it when I found your reply. You have done a much better job than I had, so I did not post it. I would just say that the approach I was making was that the issue was primarily that of language design rather than that of Integrated Development Environments. I see reuse as an extra-program activity. Reusable components would be constructed for inclusion in programs, not defined within the program. Once and Only Once is more than a buzzword applicable to data, it should also apply to program components which could be universally specified and designed. Perhaps some of the approaches made with regard to hardware components might be applicable in reusable software components. -- AnonymousDonor

In other words, AllProgrammingIsLibraryDesign?. It makes sense to me, and I think a lot of the things we normally think of as programs would do better as reusable libraries or protocol servers. The PeerToPeer community has started moving that way, with BitTorrent and GiFt?, and Gaim is trying to separate out the IM client from the GUI right now. Things like syntax parsers should also be libraries (they are in CommonLisp), which would make development tools much easier to program.

The problem is much harder than it sounds, however, because in general it takes an order of magnitude more work to develop a reusable component than a one-off program. This is detailed in TheMythicalManMonth, and oft-quoted around here. And it really is hard to design something that will fill multiple needs well. Oftentimes, one program's need is not quite the same as another's, so the component needs to be tweaked a bit, which breaks it for the first client, which causes the whole reuse idea to fail miserably.

I should probably also note that Doug is far from inexperienced in this area, and the difficulties he mentions are very real. The software industry has spent the better part of 40 years trying to achieve "re-use", without all that much success. And I suspect that that's inherent in the problem domain: what seems like similar problems to us are "similar, but not enough", and so a solution for one won't really serve as a solution for the other. -- JonathanTang
ReUseTechnologies? ReUseMethodologies?

Although many of us have worked to create good object-oriented programming languages, it would be hard to say (with a straight face) that any of our creations have totally succeeded. Why not? I believe that this endeavor is essentially paradoxical. Thus, whenever a language designer pursues a particular goal and loses sight of the lurking paradox, the outcome is an all too often fatally flawed result. One way to think about this is to explore the following seven paradoxes:

Because programming languages, development environments, and execution engines are intended for both people and computers, they must both humanize and dehumanize us.

Adding a richer set of concepts to a programming language impoverishes its universe of discourse.

Putting a language's cognitive center in a more dynamic place reduces the verbiage needed to accomplish a task, even though less information can be mechanically deduced about the program.

The most concrete notions are the most abstract, and pursuing comfort or correctness with precision leads to fuzziness.

Although a language, environment, and execution engine are designed for the users' minds, the experience of use will alter the users' minds.

Object-oriented programming has its roots in modeling and reuse, yet these notions do not coincide and even conflict with each other.

A language designed to give programmers what they want may initially succeed but create pernicious problems as it catches on. However, a language designed to give programmers what they really need may never catch fire at all.

Many of these assertions seem nonsensical, misguided, or just plain wrong. Yet, a deeper understanding of these paradoxes can point the way to better designs for object-oriented programming languages.

And for completeness' sake, all LanguagesAreOperatingSystems, so that brings up a boatload of issues which language designers traditionally scorn.