This Exegesis explores the new subroutine semantics described in Apocalypse 6. Those new semantics greatly increase the power and flexibility of subroutine definitions, providing required and optional formal parameters, named and positional arguments, a new and extended operator overloading syntax, a far more sophisticated type system, multiple dispatch, compile-time macros, currying, and subroutine wrappers.

Now while this sort of article isn't about the science of programming languages, it sure is about the craft of programming language design (see our recent discussions).

Since we are a fairly young science, I think there's still good reason to look hard at what people are actually doing, even if large parts of what you find turn out to be less than useful. There are plenty of good ideas that originated with practitioners who had valuable intuition.

Of course, the same argument works in the other direction: language designers are mostly amateurs, so they should try to follow academic research in order to ground their work in solid theory. It's nice to see Perl discussions about named parameters, constant views of variables, currying and so on. Let's not forget that, differences aside, we are all betting for the same team!

It's gratifying to see that the very first example in this Exegesis is of a higher-order function...
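For readers to whom the terms are new, a quick sketch of a higher-order function and of currying (both mentioned in the Apocalypse), in Python rather than the Exegesis's own Perl code:

```python
# Higher-order functions: functions that take or return other functions.
def compose(f, g):
    """Return a new function that applies g first, then f."""
    return lambda x: f(g(x))

# Currying: turning a two-argument function into nested one-argument
# functions, each remembering its argument via a closure.
def curry2(f):
    return lambda a: lambda b: f(a, b)

def add(a, b):
    return a + b

add_five = curry2(add)(5)   # a partially applied adder
double = lambda x: x * 2

print(compose(add_five, double)(3))  # double 3, then add 5 -> 11
print(add_five(10))                  # 15
```

This is only an analogy for the concepts; Perl 6's actual currying and subroutine-wrapper syntax is described in Apocalypse 6 itself.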

But don't be put off by these myriad new features and
alternatives. The vast majority of them are special-purpose,
power-user techniques that you may well never need to
use or even know about.

The PL/I culture made this same claim. "What you don't know
won't hurt you." The Perl5 culture made that claim, too.
In both cases, it seems to me the problem is that (practically)
when you provide a plethora of ways to do some task, not all
the ways wind up being *equivalent*, in terms of side effects.
So, yes, when you don't understand *all* K ways of doing
task X, the language rises up and smites thee,
because you chose Way W3 that has different (subtle, undocumented)
side-effects than Way W1, which looked superficially equivalent.

Ehud closes with:

Let's not forget that, differences aside, we are all betting for
the same team!

Is that true? Even vacuously? Suppose our "same team objective" is
to make a better programming language. Ichbiah's Ada Rationale didn't
sound like Wall's Apocalypses. (Or maybe it's been so long since
I've done Ada that I've come to think of it as immensely saner than
Perl v[456], just out of nostalgia.)

This comes back to the "power vs utility" argument of a few days ago,
regarding Java macros. Consider:

1. You, Programmer, can't handle a lot of power, so I, Language
Designer, will not provide it for you.

2. You, Programmer, will have difficulty with very powerful
constructs, so I will make them available but harder to shoot
yourself in the foot with.

3. I, Designer, really don't care what You can or cannot
practically handle, so I will fill the language with all
manner of Powerful things...

No one seemed overly enthused about #1. Somewhere between #1
and #2 is probably where Ada is, IMHO. #3 is what a Paul
Graham would advocate.

Given that most programmers are not "well above average" for any
reasonable definition of average, I have to wonder who the
intended clientele for Perl6 is...?

There's more to choose from than #1, #2 or #3. What about a small language that has a small core of very powerful constructs? Then there are things you have to learn and understand, but there are only a few, and once you master them, you master the whole language. I think that's what Paul Graham would advocate. "All manner of powerful things" doesn't sound like his style. It's more like "just the right set of powerful things".

Many of the things that you can do in Perl6 you can do in Perl5. Usually in two or three different ways via various CPAN modules. By moving some of this into the core you get a standard mechanism which actually makes the language more consistent as a whole.

You're also getting a more logical and orthogonal language design, which (IMHO) is going to make it a lot easier to learn.

From what Larry and Damian have revealed about Perl6 so far, the nastier bits of my Perl5 codebase are going to be a lot simpler once Perl6 comes along.

Then there's the distribution-of-functionality question.
Where do you put the functionality: in the language core,
or into the standard library (of functions, classes, types,
procedures, reusable bits and pieces)? From an implementor's
perspective, a small core is less work to implement. E.g.
there are many standard-compliant Scheme implementations, but
only a few standard-compliant Common Lisp implementations.
From a programmer's perspective? ---hmmm. There are a lot
of relatively featureful Scheme implementations (with richly
useful function libraries), but little in the way of interoperability
*between* implementations. OTOH, if you write standard CL,
any standard-complying CL will grok your code.

Once a language escapes the hands of its creator(s),
and gets adopted by lots of Real(tm) Programmers, either [1]
the language itself gets larger, or [2] the standard library
gets larger, to make it easier for people to use the language
to solve their problems. Common Lisp went down the road of
adding the functionality you'd want, into the language itself.
Scheme went down the minimalistic-core road. There's a long
history of programming design philosophy (starting with the
Algol 60 Report, and continuing at least through R5RS Scheme)
that says a minimal core is better, aesthetically --- but it's
not clear to me, from an everyday programming perspective,
that either approach is intrinsically superior (in terms of
helping me to create good, working code faster).

"You're also getting a more logical and orthogonal language design, which (IMHO) is going to make it a lot easier to learn."

Yes. This is one of their primary design goals, it seems. They're trying to provide a lot of power, but do it by having unified concepts, and then providing a whole bunch of different-looking ways to access those concepts. So every bit of code is in a block, but that block can be anonymous, be named as a subroutine, be a rule, etc., but they're all just blocks. I expect the whole of Perl 6 to be much more easily comprehensible than Perl 5 is. I'm also sure that Perl 6 has a much different design philosophy than Paul Graham has. :)

"language designers are mostly amateurs, they should try to follow academic research in order to ground their work in solid theory"

The design team seems to be following a much more "high brow" design for language interpreters than is typical. For instance, they've recently switched to using continuation passing style as integral to Parrot function calling.

I think this discussion relates to the recent discussion of Java macros. If the language core is small, its syntax and semantics can be completely understood by its users. Additions using libraries will still (?) obey the well-known syntax and semantics of the small core language.

C++ takes the large-language approach. Few C++ programmers understand all of C++, so they write in a subset of the large language. The problem is that they use DIFFERENT subsets of C++. So C++ and Perl programmers must understand ALL of the Many Ways To Do It before they can read other people's code. I think Perl takes the large language PLUS large library approach. ;-D

Richard Gabriel (Mr. Worse-is-better) writes a lot about large language versus large libraries in his book "Patterns of Software: Tales From The Software Community". The title is silly, but I find most of his essays about languages very interesting. For the well-read LtU reader, though, they probably do not offer much new insight.

The design team seems to be following a much more "high brow" design for language interpreters than is typical. For instance, they've recently switched to using continuation passing style as integral to Parrot function calling.

Oh, fooey. Continuations aren't highbrow--they're dead simple extensions of closures and function calls. The only reason people freak out about them is that they're normally presented in a way that ensures people are deeply confused about them.
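To back up that claim with something concrete (a Python sketch, not Parrot code): in continuation-passing style a function never returns; it hands its result to an ordinary closure, the continuation, which says "what to do next".

```python
# Continuation-passing style (CPS): a function never "returns";
# it passes its result to an ordinary closure, the continuation.
def add_cps(a, b, k):
    k(a + b)          # "return" a + b by calling the continuation

def square_cps(x, k):
    k(x * x)

def pythagoras_cps(a, b, k):
    # Each intermediate result flows into the next closure.
    square_cps(a, lambda a2:
        square_cps(b, lambda b2:
            add_cps(a2, b2, k)))

pythagoras_cps(3, 4, print)   # prints 25
```

Nothing here beyond closures and function calls, which is exactly the point being made.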

Larry sometimes gets highbrow, but I don't. I'm too often a Programmer of Very Little Brain, and getting highbrow for too long makes my head hurt. Parrot does a few things that can reasonably be considered clever, but they're either optional (the JIT, for example), or the least annoying solution available. (like the unified event/IO/signal system)

Languages generally don't get larger once released, at least not quickly. With very rare exceptions (Lisp macros spring to mind) the average programmer can't alter the language semantics if he or she wants to, as that requires altering the compiler, and that's just not a viable option. The language semantics you start with are the ones you're stuck with until some language committee (either a standards body or the main development group) decides to change things. That means that if you want additions you must go the library route, since that's all that's available. Might be functions, might be classes, but the result is the same--the semantics you start with are the ones you're stuck with.

For many features, that's not a problem--a library is the right place for the functionality. There's no real reason to make socket stuff part of the core language semantics instead of library code, and even if you do put it in as a core semantic you'll likely not notice that it's core and not library code.

Other features, primarily flow control, can't be done in libraries, or can only be done very awkwardly. Adding exceptions or closures via library functions alone to a language with no flow control more advanced than if/then, gosub, and goto (to pick a really extreme example) would be at best terribly awkward.
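A rough sketch of that awkwardness (Python, with made-up helper names, purely for illustration): emulating exception-like flow control with nothing but function calls forces every operation to accept explicit success and failure callbacks, and every caller to propagate them by hand.

```python
# Emulating exception-like flow control with plain function calls:
# every operation takes explicit success/failure callbacks, and
# every intermediate caller must thread them through manually.
# (Hypothetical helper names, purely for illustration.)
def safe_div(a, b, on_ok, on_err):
    if b == 0:
        on_err("division by zero")
    else:
        on_ok(a / b)

def average(total, count, on_ok, on_err):
    # This level adds nothing, yet still has to pass on_err along.
    safe_div(total, count, on_ok, on_err)

average(10, 2, lambda v: print("result:", v),
               lambda e: print("error:", e))
average(10, 0, lambda v: print("result:", v),
               lambda e: print("error:", e))
```

With built-in exceptions the same logic is a plain division inside a try/except, and propagation up the call chain is automatic.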

You can see this in regular expressions. While you could build up a regular expression object with a series of "matchexact/matchwild/startgroup/endgroup/whatever" function calls, it's horribly awkward, and everyone opts for the traditional regex syntax because it's so much easier and more convenient to deal with. Many languages still approach it awkwardly, as you have to make function calls to handle the regex. Having base syntax for regexes makes things flow more easily, but if it's not in from the beginning, you're stuck with function calls.
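The contrast can be sketched in Python (the combinator helpers below are invented for illustration; they are not a real library's API):

```python
import re

# Building a pattern through function calls, combinator-style.
# (Hypothetical helpers, not a real library's API.)
def match_exact(s):
    return re.escape(s)

def group(part):
    return "(" + part + ")"

def one_or_more(part):
    return part + "+"

pattern_built = match_exact("ab") + group(one_or_more(r"\d"))

# The same pattern written as ordinary regex syntax:
pattern_literal = r"ab(\d+)"

assert pattern_built == pattern_literal
print(re.match(pattern_built, "ab123").group(1))  # prints 123
```

Both produce the same pattern, but the dedicated syntax says in nine characters what the function calls take four definitions to build.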

What Larry's trying to do is take those semantics that were either damned awkward in their current form or just not available, and front-load the language with them. We know that this stuff goes in now or not at all. (Or possibly much later, but shoehorned in and inevitably done in an awkward and/or annoying way.) That's sensible--you look at what people are using now, refactor the crappy bits, integrate the stuff you couldn't reasonably do before, and try to get some coherence to it.

Now, you could certainly argue that the breadth of semantics that he's pulling in is excessive, or that he's going about it in the wrong way, but that's fine. There are plenty of languages with a more restricted set of base semantics, and the fact that Perl 6 has a very broad set doesn't magically corrupt Java or Fortran.

Is that true? Even vacuously? Suppose our "same team objective" is to make a better programming language. Ichbiah's Ada Rationale didn't sound like Wall's Apocalypses. (Or maybe it's been so long since I've done Ada that I've come to think of it as immensely saner than Perl v[456], just out of nostalgia.)

I think it's true. Programmers, especially those doing real-life work, want better tools. It may take time for people to learn new techniques and languages, but you can persuade them (if your point of view is based on something more than personal preference). So yes, we are all trying to advance the state of the art.

Obviously, people like Larry have an agenda: they have a horse in the race. But most of us, I think, are willing to change our minds. Even language designers change their minds (or come to their senses). Golly, we may even have parameterized types in Perl...

I don't like the distinction being made here between "core language" and "standard libraries". For me, standard libraries are a part of the core language. Some functionality that is implemented in some PLs as part of the core is implemented in others as part of a library, e.g. Pascal write vs. C printf.

I also don't accept the distinction based on ability to do flow control. There are numerous flow control extensions to JavaScript in BeyondJS, and it certainly is a library (not even a standard one ;-)

Obviously not all PLs are equal in what they allow a programmer to do, thus also what they make possible in libraries. But that in itself does not define a general distinction.

BeyondJS has flow control extensions, but they do feel awkward, as the other Dan said.

The distinction is important. The core is what you have to know and understand to use the language. You only have to be aware of the existence of libraries, as the exact details can easily be looked up.

Junctions for types start to look a bit like Haskell data constructors...

I originally thought that the any junction was itself a way of defining types, e.g. any(1, 2, 3) would be the type of all values that were either 1, 2 or 3. Other possibilities (most certainly not valid Perl 6)...:

intGreaterThanFive = any(int a where a > 5)
evenPosInt = any(int a * 2 where a > 0)
evenIntGreaterThanTen = any(evenPosInt a where a > 10)
evenIntGreaterThanTen = any(intGreaterThanFive a * 2)
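As a loose analogy (Python, emphatically not Perl 6 semantics): an any-junction used as a type can be modeled as a predicate over values, and the derived "types" above as predicates composed from other predicates.

```python
# Modeling an "any" junction as a predicate over values.
# (A loose Python analogy, not Perl 6 semantics.)
def any_of(*values):
    """Matches a value that equals any member of the junction."""
    return lambda x: any(x == v for v in values)

def any_where(pred):
    """A junction defined by a condition, like any(int a where a > 5)."""
    return pred

one_two_three = any_of(1, 2, 3)
int_greater_than_five = any_where(lambda a: isinstance(a, int) and a > 5)

print(one_two_three(2))           # True
print(one_two_three(7))           # False
print(int_greater_than_five(9))   # True
```

Seen this way, any(1, 2, 3) really does behave like "the type of values that are 1, 2 or 3", which is what makes the comparison to Haskell data constructors tempting.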

You need to know that language in order to understand the libraries. You don't need to know the libraries in order to understand the language definition.

Don't have much time now, but I disagree. C++ with the STL looks very different from C++ without it. Someone who knows C++ but has never seen an iterator or an algorithm will be lost (and it doesn't help that STL source is totally unreadable; you may not even have it). One might consider C++ with STL to be a different PL than C++ without it. Just compare this to this.

I reiterate my position that the standard libraries are part of the language definition, and that the distinction between standard library and core is technical at best.

BeyondJS has flow control extensions, but they do feel awkward

Oh, and what is awkward about the control flow extensions in BeyondJS? I've grown to find them more natural than those provided by standard JavaScript. The only awkward thing I find about them is the verbose lambda syntax in JavaScript.

There are two orthogonal ways of looking at this issue: standard vs. implementation specific and language vs. library.

The language should be as small as possible but no smaller, allowing all kinds of expressive features to be implemented on top of it. If it's extensible enough you can push many of the "language" features into libraries. Once you have such a language you should create standard libraries on top of these primitives, so every implementation of the language will provide the same functionality.

Using this approach we could get more language implementations with fewer problems implementing libraries: you can use the standard implementation instead of a specific one. If you want more performance you can reimplement some parts of the standard library using implementation-specific knowledge.

Why yes, yes in fact we do expect every language provider to compile to our bytecode, or at least one of our assembly languages. We also expect any language that wants to make use of our security and quota system, or interoperate with any other language that compiles to Parrot, to use our calling conventions, which mandate CPS.

Having said that, the only difference between CPS and a more traditional call/return style is that you call a sub with CALLCC rather than CALL, and return with INVOKE P1 (P1 being the register in which a sub/method/function gets its caller's return continuation) rather than RET. If the thought of continuations gives you a screaming fit, you can even pretend that we just pass the return address in rather than put it on the stack, and that INVOKE is a misspelled RETURN.

It isn't tough and if someone can't manage that, odds are they're not up to writing even a simple compiler so the issue won't arise.

This is an assumption universal runtime implementors should challenge.

Yeah, and we should floss regularly, change the oil in our cars every 3000 miles, and not threaten death to people who think multiple nested ternary operators are a good idea in production code. Lots of shoulds, but life is short and I've a limited attention span.

Show me it works, and maybe for the next big redesign we'll do it, or write a translation layer for the current architecture. I am unconvinced that this is the right way to do it, as the 'generic' intermediate form should look rather different for a stack or register target.

(Yes, this is a not-too-rude way of saying "you first, and prove it works")

(Yes, this is a not-too-rude way of saying "you first, and prove it works")

Thanks for not being too rude.

Kelsey's PhD thesis is pretty good evidence it works. When you have Parrot running as described then implementing Kelsey's thesis would be something I'd consider.

The intermediate form does not look different for stack or register targets since that form is higher level... just function application.

There would be a different set of transformations to get from function application to the lower level stack or register targets. But taking this approach for Parrot alone would be a win. And in any case, those transformations are written once, not by each language provider.

I have glanced over Kelsey's PhD and I agree with Patrick. Compiling Loell to a Scheme dialect seems a far less daunting task than compiling it to an assembly language, even if only because the compiler output is easier to read and check.