If I Would Design a Language

Warning: This post contains fictional grammar, ideas off the cuff and high levels of programmer nerdery. Proceed at your own risk.

I’ve had a look at the influx of new languages for the JVM over the last couple of months. In particular I’ve read up a bit on Kotlin, Fantom and Clojure. And naturally I started thinking about what I would do myself if I designed a new language.

Well… the world doesn’t need a new computer language. But nevertheless, here’s what I’m thinking right now.

Main Ideas

Here’s what I’d aim for primarily:

Readability: In real software projects in real life you need to understand other people’s code. This means you have to trade off writability for readability. It also limits variation: there should only ever be one way of expressing any particular statement. It also means I’d probably skip type inference, but more on that later.

Modularity: Ceylon and Fantom both have it right: modularity should be built in from the start. However, it seems to me they are missing a very obvious extension to language-level modularity, which is…

Dependency Injection: This is the kicker, as I don’t know of any language that has it built in. Think of Guice, but as a core feature of the language itself. If this could be combined with the language modularity, you could have potentially awesome productivity right from the start.

Concurrency: I’m a server kind of guy, so this is important. I think Fantom is on the right track so I’d probably end up with something similar, i.e. an actor-style pattern where the only allowed interaction between actors is through immutable objects. In fact, I think immutability is going to be a main feature as well…

Immutability: Immutability enables cool stuff, and Clojure has this down pat. Although I’m not sure at this point, I think all classes will be immutable by default, and you’d have to declare a class explicitly if you want it mutable. People would hate it though.

Hello World

Let’s get right down to it, shall we? We’ll make the standard “hello world” a bit more complicated right away to show off a bit more syntax:

    /*
     * class is 'public' and immutable by default, the constructor
     * signature is immediately visible and has a default value
     */
    class Greeter(String name = "stranger") {

        /*
         * member is 'public' and has an automatic accessor, it
         * is also initiated with the value from the constructor
         */
        String name = $name;

        /*
         * function is 'public' by default
         */
        function greet() returns Void {
            /*
             * 'echo' is a short hand for system out, and
             * strings are evaluated
             */
            echo("Hello #{name}!");
        }
    }

So what do we have here? Let’s break it down.

Constructor

The constructor signature is declared immediately after the class name. It can contain default values for arguments. The actual implementation is done either implicitly, or in a “constructor” statement:

    class Greeter(String name = "stranger") {

        String name;

        constructor {
            /*
             * 'this' denotes a raw field access and is only
             * allowed in constructors and setters
             */
            this.name = $name;
        }
    }

Note that this constructor is equivalent to the one in the previous example. Does this mean that a class can only have one constructor? Yes it does, but with default values for arguments and named argument passing, that should be enough.

And by the way, the constructor arguments are accessible throughout the entire class. So the greeter can be simplified:

    class Greeter(String name = "stranger") {

        function greet() returns Void {
            echo("Hello #{$name}!");
        }
    }

Pretty cool huh? This means there’s an implicit field for each constructor argument in each class, which is readable but not writable.
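
For comparison, here is roughly what that short class might cost you in plain Java. This is only a sketch of one possible translation, not a spec: the class name `GreeterSketch`, the `greeting()` helper and the no-argument constructor standing in for the default value are all made up for illustration.

```java
// Hypothetical Java translation of the fictional Greeter class.
final class GreeterSketch {
    // Plays the role of the implicit, read-only field for the constructor argument.
    private final String name;

    // Stands in for the default argument value: name = "stranger".
    GreeterSketch() {
        this("stranger");
    }

    GreeterSketch(String name) {
        this.name = name;
    }

    // The generated accessor; there is no setter because the class is immutable.
    String name() {
        return name;
    }

    // Assumed helper: builds the greeting so it can be inspected without printing.
    String greeting() {
        return "Hello " + name + "!";
    }

    // Counterpart of the fictional greet() function.
    void greet() {
        System.out.println(greeting());
    }
}
```

Calling `new GreeterSketch().greet()` greets the stranger, and `new GreeterSketch("Adam").greet()` greets Adam.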

Class Immutability and Field Access

I’ll leave the collection classes for now, but ordinary classes are immutable by default, and all fields have a generated “getter”, and optionally a “setter” as well. So, to start with, this would be illegal:

    Greeter greeter = new Greeter();
    greeter.name = "kalle"; // illegal!

If we truly want an object that can be mutated, we’d have to declare it as such:

    @Mutable
    class Greeter(String name = "stranger") {

        String name = $name;
    }

And now, we can change it post-construction:

    Greeter greeter = new Greeter();
    greeter.name = "Adam"; // legal!

Access to object fields is done via accessors that are generated automatically. You can change their behavior if you want, but only for the “setter”; the “getter” always reads the field as it is:

    @Mutable
    class Greeter(String name = "stranger") {

        String name = $name with setter {
            /*
             * 'this' denotes a raw field access and is only
             * allowed in constructors and setters, and there's an
             * implicit "argument" which is what the setter was
             * called with
             */
            this.name = name;
        }
    }

But all objects have another implicit function attached to their fields, namely the clone operator. With this you can get a clone of the current object with one or more fields changed. If our Greeter was immutable, we could do this:

    Greeter greeter = new Greeter();
    greeter = greeter.name -> "Adam";

Under the hood this creates a new Greeter object which is equivalent to the first, except for the new name. It would be nice if you could chain clone operations as well, but I’ll get back to you on that.
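
In today’s Java the closest thing to the clone operator is probably a hand-written “with” method. The sketch below shows the idea under that assumption; `ImmutableGreeter` and `withName` are names invented for this example, and nothing here is generated for you the way the fictional operator would be.

```java
// Hypothetical Java stand-in for the fictional clone operator.
final class ImmutableGreeter {
    private final String name;

    ImmutableGreeter(String name) {
        this.name = name;
    }

    String name() {
        return name;
    }

    // Roughly what `greeter.name -> "Adam"` is meant to do: leave this
    // object untouched and return a copy with one field replaced.
    ImmutableGreeter withName(String newName) {
        return new ImmutableGreeter(newName);
    }
}
```

With a class like this, `greeter = greeter.withName("Adam")` rebinds the variable to a new object while the old one stays as it was. With more than one field, chaining such calls is the usual Java workaround for changing several fields at once.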

Functions and Higher-Order Functions

We’ve seen a function already and there’s not much to add. The declaration is in the form:

    function <name>(<arguments>) returns <type> {}

To this we’ll add a few short-hands. OK, I know I said we’d only have one way of articulating statements. But hey, I think these may be worth it. Consider the following example:

    function greet() returns Void {
        echo("Hello #{name}!");
    }

Mr Eagle-eye will spot two possible ways of putting this more simply. Why use parentheses when there are no arguments? It’s pretty obviously a function declaration anyway. And why declare a “returns” when it’s Void anyway? So…

    function greet {
        echo("Hello #{name}!");
    }

In addition, functions should be possible to pass around like any other object. These higher-order functions come in two flavors, “strict” and “relaxed”.

“Strict” functions only accept immutable arguments and can only return immutable objects.

By default all functions are relaxed, as this is a great deal easier to work with. But by declaring strict functions we’ll enable functional programming out of the box, as such functions will be guaranteed not to have side effects.

Here’s a bit of syntax for you. First, let’s imagine a “foreach” declaration on an interface:

    function foreach(function(T item) visitor) returns Void;

And in a list of Greeter objects you’d call it like so:

    list.foreach(new function(Greeter greeter) {
        greeter.name = "Adam";
    });
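
For readers who think in Java, the same “relaxed” visitor call can be approximated with `java.util.function.Consumer`. This is just a sketch; `MutableGreeter` and `ForeachSketch` are hypothetical names, and the static `foreach` helper is an assumption standing in for the interface method declared above.

```java
import java.util.List;
import java.util.function.Consumer;

// Hypothetical mutable counterpart of the @Mutable Greeter.
final class MutableGreeter {
    private String name;

    MutableGreeter(String name) {
        this.name = name;
    }

    String getName() {
        return name;
    }

    void setName(String name) {
        this.name = name;
    }
}

final class ForeachSketch {
    // A "relaxed" higher-order function: the visitor is allowed to
    // mutate each item it is handed.
    static <T> void foreach(List<T> items, Consumer<? super T> visitor) {
        for (T item : items) {
            visitor.accept(item);
        }
    }
}
```

The call site then reads `ForeachSketch.foreach(list, greeter -> greeter.setName("Adam"));`, which renames every greeter in the list.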

“Primitive” Types and Literals

Everything is an object, no excuses. Integers and Doubles are 64 bit. Strings are UTF-16. Characters are not integers. We have Bytes. So without further ado, here’s the list of literals:

    true;                    // boolean
    123;                     // integer
    12.3;                    // double
    1b                       // byte
    §0010                    // bits literal
    0xAB                     // integer in hex
    'n'                      // character
    "n"                      // string
    `http://www.google.com`  // uri
    2s                       // duration (2 seconds)

The duration comes in the following flavors:

    1ps  // picosecond
    1ms  // millisecond
    1s   // second
    1m   // minute
    1h   // hour
    1d   // day
    1w   // week
    1M   // month
    1Y   // year

In addition to the ones above, here’s some shorthand, taken straight from Fantom:

    [1, 2]                   // list
    [1:"Adam", 2:"David"]    // map

Modularity

A “module” represents a unit of code which belongs to a particular namespace, is versioned, and is packaged much like a Java “jar” file, i.e. in a ZIP archive together with some metadata. I haven’t thought through the declaration syntax yet, but here’s what we want in a declaration:

Namespace

Module name

Version

Dependencies

DI Configuration

We’ll steal the reference syntax from Maven. So this…

    net.larsan.test.lang:Core:1.1.0

… would be the namespace “net.larsan.test.lang”, the module “Core” and the version “1.1.0”. The format of the version is probably fixed, but may include an optional “-SNAPSHOT” suffix.
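
To make the reference format concrete, here is a small Java sketch that splits such a string into its three parts. It rests on assumptions: the `ModuleRef` class and its field names are invented for illustration, the version is kept as an opaque string since its exact format is undecided, and “-SNAPSHOT” is simply treated as a flag.

```java
// Hypothetical value type for a "namespace:module:version" reference.
final class ModuleRef {
    final String namespace;
    final String module;
    final String version;
    final boolean snapshot;

    private ModuleRef(String namespace, String module, String version, boolean snapshot) {
        this.namespace = namespace;
        this.module = module;
        this.version = version;
        this.snapshot = snapshot;
    }

    static ModuleRef parse(String reference) {
        // A limit of -1 keeps trailing empty parts, so "a:b:" is rejected below.
        String[] parts = reference.split(":", -1);
        if (parts.length != 3) {
            throw new IllegalArgumentException("Expected namespace:module:version, got: " + reference);
        }
        for (String part : parts) {
            if (part.isEmpty()) {
                throw new IllegalArgumentException("Empty part in reference: " + reference);
            }
        }
        // Assumption: the optional suffix is stripped and remembered as a flag.
        String suffix = "-SNAPSHOT";
        String version = parts[2];
        boolean snapshot = version.endsWith(suffix);
        if (snapshot) {
            version = version.substring(0, version.length() - suffix.length());
        }
        return new ModuleRef(parts[0], parts[1], version, snapshot);
    }
}
```

Parsing “net.larsan.test.lang:Core:1.1.0” with this gives the namespace, module and version described above, with the snapshot flag off.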

Here’s something though: the declaration will not include build information such as “source folder” or similar. That’s mixing apples and oranges, and unfortunately both Fantom and Kotlin are doing it.

Let’s do some imaginary code:

    module {
        namespace = new Namespace("net.larsan.test.lang");
        moduleName = "Core";
        version = new Version("1.1.0");
        addDependency("com.whatever:Module:2");
    }

Now, the module syntax is probably just a shorthand for this:

    class <name> extends Module {

        constructor {
            namespace = new Namespace("net.larsan.test.lang");
            moduleName = "Core";
            version = new Version("1.1.0");
            addDependency("com.whatever:Module:2");
        }
    }

The name of the module class would be anonymous and generated at runtime.

Dependency Injection

With modularity in place, DI is just a small step away. Let’s start with a type to use:

    class Transformer {

        function transform(String str) returns String {
            return str.replace('k', 'y');
        }
    }

Not very useful, admittedly. But hang on:

    class Greeter(String name = "stranger") {

        @Inject
        Transformer transformer;

        function greet {
            echo(transformer.transform("Hello #{$name}"));
        }
    }

It gets neater if the transformer is actually an interface:

    interface Transformer {
        function transform(String str) returns String;
    }

Now we can play with polymorphism. I’ll skip ahead a bit here; suffice it to say that I’m going to steal straight from Guice, including assisted inject, but this time with closures, so if you want to imagine how it would look, go ahead. Some things to note:

By default all types are injectable, given that the module knows how to 1) identify them; and 2) construct them.

Simple identification is made with a “name” attribute on the “@Inject” annotation, and a “@Name” annotation on types.

The great thing here is that we have the perfect place to configure our type binding. Consider the following example:

    module {
        namespace = new Namespace("net.larsan.test.lang");
        moduleName = "Core";
        version = new Version("1.1.0");
        addDependency("com.whatever:Module:2");
        bind(Transformer).to(MyTransfomerImpl);
    }

Add to this higher-order provider functions, and make the module extensible for binding overrides (which would be neat for unit testing), and you have something awesome in the works.
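
To give a feel for bindings with provider functions and overrides, here is a deliberately tiny Java sketch. It is not Guice and not the fictional module syntax: `BindingModule`, `TextTransformer` and `KToYTransformer` are hypothetical names, providers are plain `Supplier`s, and an override is simply a later `bind` call replacing an earlier one.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical interface, mirroring the Transformer example above.
interface TextTransformer {
    String transform(String str);
}

// Hypothetical implementation that replaces 'k' with 'y'.
final class KToYTransformer implements TextTransformer {
    @Override
    public String transform(String str) {
        return str.replace('k', 'y');
    }
}

// Hypothetical stand-in for the binding part of a module declaration.
final class BindingModule {
    private final Map<Class<?>, Supplier<?>> bindings = new HashMap<>();

    // Registers a provider function for a type. Binding the same type
    // again replaces the earlier provider, which is the "override".
    <T> void bind(Class<T> type, Supplier<? extends T> provider) {
        bindings.put(type, provider);
    }

    // Looks up the provider and asks it for an instance.
    <T> T get(Class<T> type) {
        Supplier<?> provider = bindings.get(type);
        if (provider == null) {
            throw new IllegalStateException("No binding for " + type.getName());
        }
        return type.cast(provider.get());
    }
}
```

A production module would call `module.bind(TextTransformer.class, KToYTransformer::new)`, while a unit test could rebind the same type to a stub before asking the module for an instance.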

Concurrency

We’ll do an actor model with a twist inspired by Clojure. We’ll need two concepts:

“Executor” – Basically your Java thread pool abstraction

“Actor” – An actor is associated with a thread pool and has a state object and an inbox for functions

An actor is an immutable object with an immutable state. However, this state can be updated by visiting functions. So, let’s imagine our Greeter from before is immutable, we could have an actor with a Greeter state:
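
In plain Java, the idea of an actor whose state is only ever replaced by visiting functions might be approximated as below. This is a sketch in Java rather than in the fictional syntax of this post, and it makes simplifying assumptions: `StateActor` is an invented name, a single-threaded executor doubles as the inbox instead of a shared thread pool, and the actor holds a reference that is swapped for each new state value.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.UnaryOperator;

// Hypothetical actor: owns a state value and applies visiting
// functions to it one at a time, in the order they were sent.
final class StateActor<S> {
    // Assumption: one thread per actor. Its task queue acts as the inbox.
    private final ExecutorService inbox = Executors.newSingleThreadExecutor();

    // Only read and written on the inbox thread after construction.
    private S state;

    StateActor(S initial) {
        this.state = initial;
    }

    // Queues a function that maps the current state to the next one.
    // The returned future completes with the state after that function ran.
    CompletableFuture<S> send(UnaryOperator<S> visitor) {
        return CompletableFuture.supplyAsync(() -> {
            state = visitor.apply(state);
            return state;
        }, inbox);
    }

    // Stops accepting new functions; already queued ones still run.
    void shutdown() {
        inbox.shutdown();
    }
}
```

With a `StateActor<String>` started at "stranger", sending `name -> "Adam"` and then `name -> name + "!"` leaves the state at "Adam!". An immutable Greeter would work the same way as the state type, with each visiting function returning a modified copy.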

That’s All?

No, obviously not, but my eyes are starting to blur. Still, this should have given you some ideas, if nothing else about my personal taste. The concurrency model probably needs a bit of rethinking. Should we allow Actors on mutable objects (provided they are cloned initially)? This would make the code a hell of a lot easier to write and should speed up execution for complex applications. Also, actors need to be able to talk to other actors.

There’s a lot of other stuff left that I can’t be bothered to cover tonight: