Domain-Specific Languages

Develop your own graphical languages with Microsoft's DSL Tools.

10/15/2007

Steve Cook, Gareth Jones, Stuart Kent and Alan Cameron Wills work on the DSL Tools team at Microsoft. This article is an excerpt from their book "Domain-Specific Development with Visual Studio DSL Tools" (June 2007), printed with permission of Addison-Wesley Professional, a technical imprint of Pearson Education.

Domain-Specific Development is based on the observation that many software development problems can more easily be solved by designing a special-purpose language. A Domain-Specific Language (DSL) is a custom language that targets a small problem domain, which it describes and validates in terms native to the domain.

As a small example, think about the problem of finding every occurrence of a particular pattern of characters in a file, and doing something with each occurrence that you find. The special-purpose textual language of "regular expressions" is specifically designed to do this job.

For example, using the .NET class System.Text.RegularExpressions.Regex, the regular expression (?<user>[^@]+)@(?<host>.+) applied to a string of characters will find e-mail addresses in it, and for each address found, assign the substring immediately before the @ sign to the user variable, and the substring immediately after the @ sign to the host variable. Without the regular expression language, a developer would have to write a special program to recognize the patterns and assign the correct values to the appropriate variables. This is a significantly more error-prone and heavyweight task.
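As a rough illustration, the same pattern can be tried out in Python rather than .NET. This is only a sketch: Python's re module spells named groups (?P<name>...) instead of the (?<name>...) form used by the .NET Regex class, and the helper name split_address is invented for this example.

```python
import re

# Same pattern as in the text, using Python's (?P<name>...) group syntax.
EMAIL_PATTERN = re.compile(r"(?P<user>[^@]+)@(?P<host>.+)")


def split_address(address):
    """Return (user, host) for one address, or None if it does not match."""
    match = EMAIL_PATTERN.fullmatch(address)
    if match is None:
        return None
    # The named groups play the role of the "user" and "host" variables.
    return match.group("user"), match.group("host")
```

Calling split_address("alice@example.com") yields the pair ("alice", "example.com"), with no hand-written scanning code.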

Domain-Specific Development applies this same approach to a wide variety of problems, especially those that involve managing the complexity of modern distributed systems such as those that can be developed on the .NET platform. Instead of just using general-purpose programming languages to solve these problems one at a time, the practitioner of Domain-Specific Development creates and implements special languages, each of which efficiently solves a whole class of similar problems.

Domain-Specific Languages can be textual or graphical. Graphical languages have significant advantages over textual languages for many problems, because they allow the solution to be visualized very directly as diagrams.

Domain-Specific Development
Domain-Specific Development is a way of solving problems that you can apply when a particular problem occurs over and over again. Each occurrence of the problem has a lot of aspects that are the same, and these parts can be solved once and for all. The aspects of the problem that are different each time can be represented by a special language. Each particular occurrence of the problem can be solved by creating a model or expression in the special language and plugging this model into the fixed part of the solution.

The fixed part of the solution is written using traditional design, coding and testing techniques. Depending on the size and shape of the problem, this fixed part of the solution might be called a framework, a platform, an interpreter, or an Application Programming Interface (API). The fixed part captures the architectural patterns that make up the domain and exposes extension points that enable it to be used in a variety of solutions. What makes the approach applicable is the fact that you create the variable part of the solution by using a special-purpose language -- a DSL.
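The split between fixed and variable parts can be sketched in a few lines of Python. Everything here is hypothetical and chosen only for illustration: the arrow notation of the little textual DSL, the issue-tracking workflow, and the names parse_model and WorkflowEngine. The engine stands in for the framework that is written once; the DSL text is the part that changes from one problem to the next.

```python
# Variable part: a model written in a tiny, made-up textual DSL.
# Each line reads "<state> --<event>--> <next state>".
ISSUE_WORKFLOW = """
open --assign--> in_progress
in_progress --resolve--> resolved
resolved --reopen--> open
"""


def parse_model(text):
    """Turn DSL text into a {(state, event): next_state} table."""
    transitions = {}
    for line in text.strip().splitlines():
        source, rest = line.split(" --", 1)
        event, target = rest.split("--> ", 1)
        transitions[(source.strip(), event.strip())] = target.strip()
    return transitions


class WorkflowEngine:
    """Fixed part: written once and reused for every model."""

    def __init__(self, transitions, start):
        self.transitions = transitions
        self.state = start

    def fire(self, event):
        """Apply an event, rejecting anything the model does not allow."""
        key = (self.state, event)
        if key not in self.transitions:
            raise ValueError(
                f"event {event!r} is not allowed in state {self.state!r}"
            )
        self.state = self.transitions[key]
        return self.state
```

A different workflow needs only different DSL text; the engine is untouched.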

The DSL might be textual or graphical. As the technology for domain-specific development matures, we expect to see tools that support the development and integration of both textual and graphical DSLs. People have a range of feelings about which kind of language they prefer. Many people, for example, prefer textual languages for input, because they can type fast, but graphical languages for output, because it's easier to see the "big picture" in a diagram. Textual expressions make it much easier to compute differences and merges, whereas graphical expressions make it much easier to see relationships.

To create a working solution to the problem being addressed, the fixed part of the solution must be integrated with the variable part expressed by the model. There are two common approaches to this integration. First, there's an interpretative approach, where the fixed part contains an interpreter for the DSL used to express the variable part. Such an approach can be flexible, but it may have disadvantages of poor performance and difficulty in debugging. Second, the particular expression or diagram may be fully converted into code that can be compiled together with the remainder of the solution -- a code-generation approach. This is a more complex conversion procedure, but it provides advantages in extensibility, performance and debugging capability.
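The two integration approaches can be contrasted with a minimal Python sketch over one small model. The traffic-light model and all function names are assumptions made for this example, and exec stands in for the build step that would normally compile generated code together with the rest of the solution.

```python
# Hypothetical model: which event moves a light from which state to which.
MODEL = {
    ("red", "timer"): "green",
    ("green", "timer"): "amber",
    ("amber", "timer"): "red",
}


def interpret(model, state, event):
    """Interpretive approach: consult the model on every call, at run time."""
    return model[(state, event)]


def generate_source(model, function_name="next_state"):
    """Code-generation approach: emit ordinary source text from the model."""
    lines = [f"def {function_name}(state, event):"]
    for (state, event), target in sorted(model.items()):
        lines.append(f"    if state == {state!r} and event == {event!r}:")
        lines.append(f"        return {target!r}")
    lines.append("    raise KeyError((state, event))")
    return "\n".join(lines) + "\n"


def compile_generated(source, function_name="next_state"):
    """Stand-in for the build step that compiles the generated code."""
    namespace = {}
    exec(compile(source, "<generated>", "exec"), namespace)
    return namespace[function_name]
```

Both routes give the same answers for this model. The generated function is plain source that can be read and stepped through in a debugger, which is one reason the generation route can be easier to debug.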

Visual Studio and Graphical DSLs
Graphical DSLs are not just diagrams. If you wanted just to create diagrams, you could happily use popular drawing programs such as Microsoft Visio to achieve a first-class result. Instead, you are actually creating models that conceptually represent the system you're building, together with diagrammatic representations of their contents. A given model can be represented simultaneously by more than one diagram, with each diagram representing a particular aspect of the model.
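The difference between a model and its diagrams can be sketched in Python, with text lines standing in for drawn shapes. The Model class, the two view functions and the sample system are all invented for this illustration and are not part of the DSL Tools.

```python
from dataclasses import dataclass, field


@dataclass
class Model:
    """Hypothetical model: named elements plus directed links between them."""

    elements: list = field(default_factory=list)
    links: list = field(default_factory=list)  # (source, target) pairs


def structure_view(model):
    """One 'diagram': every element, listed once."""
    return [f"[{name}]" for name in model.elements]


def dependency_view(model):
    """Another 'diagram' of the same model: only the links."""
    return [f"{source} -> {target}" for source, target in model.links]


shop = Model(
    elements=["Web", "Orders", "Billing"],
    links=[("Web", "Orders"), ("Orders", "Billing")],
)
```

Both views read from the same shop object, so a change to the model shows up in whichever view presents that aspect.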

The Microsoft DSL Tools make it easy to implement graphical DSLs, and they enable Domain-Specific Development to be applied to a wide range of problems. The DSL Tools are part of the Visual Studio SDK, and may be downloaded from http://msdn.microsoft.com/vstudio/DSLTools.

Directions
A Language Renaissance
By Greg DeMichillie

Programmers building on the Microsoft platform have never had as many choices when it comes to programming languages as they do today. And recently, Microsoft added one more language to the mix -- IronRuby, a .NET version of the popular Ruby language frequently used on Web servers.

In fact, there's something of a programming language renaissance taking place on the Microsoft platform, and it's following the same basic course plied by the European Renaissance of the 14th through 17th centuries. For developers willing to look beyond Visual Basic or C#, a whole new set of options is available.

The 1970s and 1980s saw tremendous activity in the world of programming languages with languages such as LISP, Smalltalk and APL all being developed. But by the 1990s, the Microsoft programming world was dominated by just two languages: Visual Basic (VB) and Visual C++. Both had large customer bases -- in the corporate world for VB and among independent software vendors for C++. Microsoft had large teams of developers working on both products.

This combination of factors made it very difficult for other languages, whether from Microsoft or others, to gain a toehold. For third parties, the task of having to develop their own IDE -- in addition to a compiler and code libraries -- was daunting. For its part, Microsoft figured it had the market covered between VB and Visual C++, and that building yet another language would risk throwing away the large investments in its two flagship languages.

But the introduction of Java in 1995 changed everything. Java proved there was a market for a new programming language -- something Microsoft didn't think existed. What's more, Java slipped nicely between VB and C++, and threatened to steal Microsoft customers from both languages.

The Renaissance Arrives
When Microsoft finally responded to Java with the .NET Framework, it made two critical decisions that laid the foundation for the explosion of new languages that we're seeing today. First, it decided that support for multiple programming languages was more important than backward compatibility. Second, the company decided to turn Visual Studio into a platform for other languages.

Given the importance of VB to Microsoft, it's hard to imagine a bigger gamble than the decision to have the .NET Common Language Runtime (CLR) support as many programming languages as possible. Similarly, Visual Studio is a major competitive advantage for Microsoft -- it's years ahead of most other development environments, but Microsoft made the decision to open it up to third parties.

So a developer interested in creating a new programming language no longer faces the unenviable task of having to build a full set of developer tools and libraries. Now they can focus on just creating languages that solve interesting problems or make certain kinds of programming tasks easier. At first, with the set of languages for .NET consisting mostly of various incarnations of COBOL, it didn't look much like a renaissance. But since that time, things have gotten progressively more interesting.

The latest example is Microsoft's introduction of IronRuby, which has helped kick off a dynamic language revolution with the .NET Framework. IronRuby itself is based on the popular Ruby language commonly used with the open source Rails framework for Web server development. What's more, Microsoft at the same time introduced the Dynamic Language Runtime (DLR), an extension to the CLR to better support dynamic languages.

Dynamic languages are a rapidly growing category of programming languages that give developers tremendous flexibility by blurring the distinction between code and data, and by allowing new objects and types to be introduced at runtime.
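Both traits can be seen in a short Python sketch (Python being one dynamic language; this does not use IronRuby or the DLR). The helper make_record_type and the Customer example are made up for illustration.

```python
def make_record_type(name, field_names):
    """Build a new class at run time from plain data: a name and a field list."""

    def __init__(self, **values):
        for field_name in field_names:
            setattr(self, field_name, values.get(field_name))

    # type(name, bases, namespace) creates a brand-new class object.
    return type(
        name,
        (object,),
        {"__init__": __init__, "fields": tuple(field_names)},
    )


# A type that did not exist when the program was written.
Customer = make_record_type("Customer", ["name", "city"])
alice = Customer(name="Alice", city="Leeds")

# Code as data: source text held in a string is evaluated on demand.
greeting = eval("'Hello, ' + alice.name", {"alice": alice})
```

The field list could just as well come from a file or a user at run time, which is the kind of flexibility the paragraph above describes.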

Most important, the DLR has the potential to do for dynamic languages what the CLR did for more traditional languages: help developers with interesting programming languages reach a much broader audience with far less work. In fact, Microsoft is also working on a dynamic version of VB, code-named "VBx," which tries to bring some of these ideas to the millions of active VB developers.

Enjoying the Art
So what does this all mean for the average developer today? General-purpose languages, such as C#, aren't going away. It's precisely their general-purpose nature that makes them a good choice for many tasks. And C# itself is evolving to take on some dynamic characteristics.

But don't walk around with a "VB or C#" mindset. There are alternatives out there that don't require you to learn a new IDE or new database API. Take the time to have someone in your organization learn about Ruby, or Python, or Scheme, even if you have adopted C# as your default language.