Language Oriented Programming: The Next Programming Paradigm

Transformation Language

The Structure Language and Editor Language together already provide some power.
You could use them to communicate
ideas to other people, for example to draw UML diagrams or to write other types
of static documents. However, most of the time we want
our code to do something, so we have to find a way to make it executable.
There are two main ways to do this: interpretation and compilation.

Interpretation is supported by DSLs that
help define how the computer should interpret the program. Compilation is
supported by DSLs that help define how to generate executable code from the
program. I will discuss support for interpretation in future articles. Right now
I want to show how MPS supports compilation.

Compilation means taking source code and
generating some form of executable code from it. There are many possibilities for
the format of the resulting code. You could
generate natively executable machine code or bytecode that runs in a virtual
machine. Alternatively, you could generate source code in a different language
(e.g. Java or C++), and later use an existing compiler to turn that into
executable code. Along the same lines, you could even generate source code in
some interpreted language, and use the existing interpreter to execute the
code.

To avoid dealing with such a wide variety
of target formats, our approach is to do everything in MPS. First, you define a
target language in MPS using the Structure Language. This target language
should have a direct, one-to-one mapping to the target format. For example, if
your target format were machine code, you would define a target language in MPS
that represented machine code; if the target format were Java source code, you
would define a Java-like target language. The target language doesn't have to
support all the features of the target format, just as long as there is a
simple, one-to-one mapping for all of the language features that you need.

So now there are two phases to
compilation: a complex transformation from the initial source language to the
intermediate target language, followed by a simple translation from the target
language to the final result. The translation phase is trivial, so we can focus
on the more interesting transformation phase. Essentially, the problem is now
simplified into how to transform models from one language to another. But the
source language and target language could be radically different, making
transformations very complex, for example by mapping one source node to many
target nodes scattered throughout the target model. We want to make it as easy
as possible to define transformations, so we need a model-transformation DSL to
help us. In MPS, this DSL is called the Transformation Language.
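
To make the two phases concrete, here is a minimal sketch in Python. It is not MPS code. The Node class and the concept names (Entity, Property, ClassDecl, FieldDecl, GetterDecl) are hypothetical, chosen only for illustration. The transform function plays the role of the interesting phase, where one source node maps to several target nodes, and to_text plays the role of the trivial one-to-one translation to Java-like text.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A node in a concept model: a concept name, properties, and children."""
    concept: str
    props: dict = field(default_factory=dict)
    children: list = field(default_factory=list)


def transform(entity: Node) -> Node:
    """Phase 1: source model -> target-language model (the complex part)."""
    cls = Node("ClassDecl", {"name": entity.props["name"]})
    for prop in entity.children:
        name, typ = prop.props["name"], prop.props["type"]
        # One source node becomes several target nodes.
        cls.children.append(Node("FieldDecl", {"name": name, "type": typ}))
        cls.children.append(Node("GetterDecl", {"name": name, "type": typ}))
    return cls


def to_text(node: Node, indent: int = 0) -> str:
    """Phase 2: one-to-one mapping from the target model to Java-like text."""
    pad = "    " * indent
    p = node.props
    if node.concept == "ClassDecl":
        body = "\n".join(to_text(c, indent + 1) for c in node.children)
        return f"{pad}class {p['name']} {{\n{body}\n{pad}}}"
    if node.concept == "FieldDecl":
        return f"{pad}private {p['type']} {p['name']};"
    if node.concept == "GetterDecl":
        cap = p["name"][0].upper() + p["name"][1:]
        return f"{pad}public {p['type']} get{cap}() {{ return {p['name']}; }}"
    raise ValueError(f"unknown target concept: {node.concept}")


# A tiny source model: one entity with one property.
source = Node("Entity", {"name": "Person"},
              [Node("Property", {"name": "age", "type": "int"})])
target = transform(source)
java_text = to_text(target)
print(java_text)
```

Notice that to_text needs no knowledge of the source language at all. All the real decisions live in transform, which is why the rest of this section concentrates on that phase.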

There are three main approaches to code
generation, which we would like to use together to define model
transformations. The first is an iterative approach,
where you enumerate all the nodes in the source model, inspect each one, and
based on that information generate some resulting target nodes in the target
model. The second approach is to use templates and macros to define how to generate
code in the target language. The third approach is to use search patterns to
find where in the source model to apply transformations.

We combine these approaches by defining
DSLs to support each approach. The DSLs will all work together to help you define
transformations from one language to another. For example, the iterative
approach inspired the Model Query Language, which makes it easy to enumerate
nodes and gather information from a concept model. You can imagine this as
something like SQL for concept models. As a bonus, having a powerful query
language is useful for more than just code generation (e.g. making editors
smarter).
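
To give a feel for the "SQL for concept models" idea, here is a small Python sketch. It does not show the actual Model Query Language. The Node class and the descendants and select helpers are assumptions made up for this example. The sketch enumerates every node in a model and filters by concept and by a condition, much like a SELECT with a WHERE clause.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterator


@dataclass
class Node:
    """A node in a concept model: a concept name, properties, and children."""
    concept: str
    props: dict = field(default_factory=dict)
    children: list = field(default_factory=list)


def descendants(node: Node) -> Iterator[Node]:
    """Yield the node and everything below it, depth-first, parents first."""
    yield node
    for child in node.children:
        yield from descendants(child)


def select(root: Node, concept: str,
           where: Callable[[Node], bool] = lambda n: True) -> list:
    """Return all nodes of the given concept that satisfy the condition."""
    return [n for n in descendants(root) if n.concept == concept and where(n)]


model = Node("Model", {}, [
    Node("Entity", {"name": "Person"}, [
        Node("Property", {"name": "name", "type": "String"}),
        Node("Property", {"name": "age", "type": "int"}),
    ]),
    Node("Entity", {"name": "Order"}, [
        Node("Property", {"name": "total", "type": "int"}),
    ]),
])

# Roughly: SELECT name FROM Property WHERE type = 'int'
int_props = [n.props["name"]
             for n in select(model, "Property",
                             lambda n: n.props["type"] == "int")]
print(int_props)
```

The same select helper could serve an editor just as well as a generator, for example to list all entities for auto-completion.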

Templates

The template approach works something
like Velocity or XSLT. Templates look like the target language, but allow you
to add macros in any part of the template. Macros are essentially bits of code
that are executed when you run the transformation. The macros allow you to
inspect the source model (using the Model Query Language), and use that
information to ‘fill in the blanks’ in the template to generate the final
target code.

In Figure 5, you can see the definition of a template for generating Java code
for a "Property" concept. The template adds field declarations, getters, and
setters for the property. This template is part of the generator that translates
code from the Structure Language into Java.
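
As a rough text-based analogy to such a template, here is a sketch using Python's standard string.Template. This is a simplification: MPS templates work on models rather than on text, and the macro names (type, field, Name) and the dictionary used as a source node are hypothetical. The template looks like the target Java code, and each macro is a small function that inspects the source node to fill in one blank.

```python
from string import Template

# The template looks like the target language, with ${...} marking the blanks.
PROPERTY_TEMPLATE = Template("""\
private ${type} ${field};

public ${type} get${Name}() {
    return ${field};
}

public void set${Name}(${type} value) {
    this.${field} = value;
}
""")

# Each macro is a bit of code that runs against the source node.
MACROS = {
    "type": lambda prop: prop["type"],
    "field": lambda prop: prop["name"],
    "Name": lambda prop: prop["name"][0].upper() + prop["name"][1:],
}


def apply_template(template: Template, macros: dict, source_node: dict) -> str:
    """Evaluate every macro on the source node, then fill in the template."""
    values = {name: macro(source_node) for name, macro in macros.items()}
    return template.substitute(values)


generated = apply_template(PROPERTY_TEMPLATE, MACROS,
                           {"name": "age", "type": "int"})
print(generated)
```

Someone who knows Java can read the template almost without knowing anything about the generator, which is the main appeal of this approach.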

Since the templates look like the target
language, you can imagine that templates are written in a special language that
is based on the target language. This is in fact how it works. Instead of
manually creating a new template language for each possible target language, we
actually have a generator which generates the template language for you. It
copies the target language and adds the special template
features, such as macros. Even the template editors are generated from the
target language's editors, so you don't have to hand-code them either.

When you use a template language, you can
think of it as writing code in the target language where some parts of the code
are ‘parameterized’ or ‘calculated’ with macros. This technique helps simplify
code generation enormously. Templates can also be used for other tasks like
refactoring, code optimizers, and more.

Patterns

The model pattern-matching approach gives us a powerful way to
search models, as an alternative to the Model Query Language. You can imagine
patterns as regular expressions for concept models. Similar to the template
approach, we will generate a pattern language based on the source language. The
pattern language looks like the source language, but adds features which help
you to define flexible criteria for performing complex matching on the source
model. You can imagine this approach as a powerful search-and-replace
technique. Again, the pattern languages are useful for more than just code
generation. For example, they would be very useful for writing automatic code
inspections for the source language's editors.
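
The search-and-replace idea can be sketched in a few lines of Python. This is an illustration under assumptions, not an MPS pattern language: the Node and Var classes and the match and rewrite functions are hypothetical. The pattern is built from the same node types as the source model, plus a variable that matches any subtree. Here the pattern "x + 0" is found and replaced by "x".

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A node in a concept model: a concept name, properties, and children."""
    concept: str
    props: dict = field(default_factory=dict)
    children: list = field(default_factory=list)


@dataclass
class Var:
    """Pattern variable: matches any subtree and binds it to a name."""
    name: str


def match(pattern, node, bindings=None):
    """Return a dict of variable bindings if node matches, otherwise None."""
    bindings = {} if bindings is None else bindings
    if isinstance(pattern, Var):
        bindings[pattern.name] = node
        return bindings
    if pattern.concept != node.concept:
        return None
    if len(pattern.children) != len(node.children):
        return None
    # Only the properties named in the pattern have to agree.
    if any(node.props.get(k) != v for k, v in pattern.props.items()):
        return None
    for p_child, n_child in zip(pattern.children, node.children):
        if match(p_child, n_child, bindings) is None:
            return None
    return bindings


def rewrite(node, pattern, build):
    """Rebuild the tree bottom-up, replacing every subtree that matches."""
    node = Node(node.concept, dict(node.props),
                [rewrite(c, pattern, build) for c in node.children])
    found = match(pattern, node)
    return build(found) if found is not None else node


# Pattern for "x + 0", where x is any expression.
plus_zero = Node("Plus", {}, [Var("x"), Node("IntLiteral", {"value": 0})])

# Source model for "(a + 0) * b".
expr = Node("Times", {}, [
    Node("Plus", {}, [Node("VarRef", {"name": "a"}),
                      Node("IntLiteral", {"value": 0})]),
    Node("VarRef", {"name": "b"}),
])

simplified = rewrite(expr, plus_zero, lambda b: b["x"])
print(simplified)
```

The same match function, used without the replacement step, is all a code inspection would need: it simply reports every place where the pattern is found.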

Remember that the Model Query Language, template languages, and
pattern languages are all supported by powerful editors with auto-complete,
refactoring, reference checking, error checking, and so on. Even complex
queries, macros, and patterns will be easy to write. Code generation has never
seen this level of power.

Using Languages Together

The previous section on code generation
raises some interesting issues about how languages can work together. There are
in fact several ways to achieve it. In MPS, all the concept models know about
each other. Since languages are concept models too, this means that all the
languages know about each other, and can potentially be interlinked.

Languages can have different relationships to each other. You could
create a new language by extending an existing one, inheriting all of its
concepts, modifying some of them, and adding your own. One language could
reference concepts from another language. You could even ‘plug’ one language
into another. I will discuss this in more detail in future articles.
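
Until then, here is a very small Python sketch of the first relationship, extension. It is only an illustration. The Language class, its lookup method, and the "base" and "money" languages are hypothetical and say nothing about how MPS stores languages. The extending language inherits concepts it does not mention, modifies one, and adds its own.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Language:
    """A language as a named set of concepts, optionally extending another."""
    name: str
    concepts: dict = field(default_factory=dict)  # concept name -> property names
    extends: Optional["Language"] = None

    def lookup(self, concept: str) -> list:
        """Find a concept in this language or in the languages it extends."""
        lang = self
        while lang is not None:
            if concept in lang.concepts:
                return lang.concepts[concept]
            lang = lang.extends
        raise KeyError(concept)


base = Language("base", {
    "Expression": [],
    "Plus": ["left", "right"],
})

money = Language("money", {
    "MoneyLiteral": ["amount", "currency"],   # a new concept
    "Plus": ["left", "right", "rounding"],    # a modified concept
}, extends=base)

print(money.lookup("Expression"))    # inherited from base
print(money.lookup("Plus"))          # overridden in money
print(money.lookup("MoneyLiteral"))  # added by money
```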

Platforms, Frameworks, Libraries, and Languages

Our system for supporting Language
Oriented Programming needs more than just meta-programming capabilities to make
it useful. It should also support all the things that programmers have come to
rely upon from today’s programming languages: collections, user interfaces,
networking, database connectivity, etc. Programmers don’t choose languages
solely based on the language itself. For instance, much of the power of Java
comes not only from the language, but from the hundreds and hundreds of
frameworks and APIs available for Java programmers to choose from. It’s not the
Java language they are buying into, but the entire Java platform. MPS
will also have a supporting platform of its own.

Before I get into the specifics, let’s
talk briefly about frameworks. What is a framework? In mainstream programming,
it usually means a set of classes and methods packaged up into a class library.
Let’s look a little closer at this and see what it looks like through the lens of
LOP.

Why do we want to package up classes and
methods into libraries? Most programmers would recite what their professors
once told them and say, “Reuse.” But that just leaves another question in its
place. Why do we want to reuse some set of classes? The answer is that the
set of classes is useful for solving certain types of problems, like
making GUIs or accessing databases. You might say that a class
library corresponds to some domain. Lo and behold, we see the
connection. Class libraries are wannabe DSLs! This sad fact really frustrates me.

Domain-specific languages exist today in
the form of class libraries, except they aren’t languages, have none of the
advantages of languages, and have all the limitations of classes and methods.
Specifically, classes and methods are immediately tied to a specific runtime
behavior which can’t be modified or extended, because that behavior is defined
by the concepts of ‘class’ and ‘method’. Because they are not languages, class
libraries are rarely supported intelligently by the environment (compiler and
editor, for example).

Should we be stuck with wannabe DSLs, or
should we have the freedom to use a real DSL when a DSL is called for? Freedom,
of course. Any class library is a good candidate
for creating a full-fledged DSL for our platform. For example, all the
libraries in the JDK should be DSLs for the MPS platform. Some of these DSLs
are not so critical at the outset, but others will have a big impact on the
power and reusability of the platform right from the beginning. I want to talk
briefly about the three most important platform languages that will be provided
with MPS: the Base Language, the Collection Language, and the User Interface
Language.

Base Language

The first thing we need is a language for
the simplest programming domain, which is general-purpose imperative
programming. This simple language would support such nearly-universal language
features as arithmetic, conditionals, loops, functions, variables, and so on.
In MPS we have such a language, which is called the Base Language.

The need for such a language should be
clear. For example, if we want to add two numbers together, we should be able
to say ‘a + b’, as simple as that. We won’t need to use it everywhere, but it will
be needed in some part of nearly all programs, wherever it is the most
appropriate tool for the job.

The Base Language is so named because it
is a good foundation for many languages that need basic programming support
like variables, statements, loops, etc. It can be used in three ways: you can
extend it to create your own language based on it, you can reference its
concepts in your programs, and you can generate your code to the Base Language.
There will be various generators available to transform the Base Language into
other languages like Java, C++, etc. Not every language needs to use the Base
Language, of course, but it’s a good starting point in many cases.
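
The idea of one base model with several generators can be sketched as follows. This is a toy illustration in Python, not the Base Language itself. The VarRef, BinaryOp, and VarDecl classes and the generate function are hypothetical. The same small model is turned into Java text and into C++ text, and the only difference between the two generators is how the boolean type is spelled.

```python
from dataclasses import dataclass


@dataclass
class VarRef:
    name: str


@dataclass
class BinaryOp:
    op: str
    left: object
    right: object


@dataclass
class VarDecl:
    type: str    # "int" or "bool" in this sketch
    name: str
    init: object


# How each target language spells the types of the base model.
TYPE_NAMES = {
    "java": {"int": "int", "bool": "boolean"},
    "cpp": {"int": "int", "bool": "bool"},
}


def gen_expr(expr) -> str:
    """Generate expression text (no parentheses: enough for flat examples)."""
    if isinstance(expr, VarRef):
        return expr.name
    if isinstance(expr, BinaryOp):
        return f"{gen_expr(expr.left)} {expr.op} {gen_expr(expr.right)}"
    raise TypeError(f"unsupported expression: {expr!r}")


def generate(decl: VarDecl, target: str) -> str:
    """Generate a variable declaration for the given target language."""
    type_name = TYPE_NAMES[target][decl.type]
    return f"{type_name} {decl.name} = {gen_expr(decl.init)};"


# Base model for: smaller = a < b
decl = VarDecl("bool", "smaller", BinaryOp("<", VarRef("a"), VarRef("b")))

java_line = generate(decl, "java")
cpp_line = generate(decl, "cpp")
print(java_line)
print(cpp_line)
```

A language that generates its code to such a base model gets every one of these targets without writing a generator of its own for each.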