How to Integrate Models And Code

While we all create models in one form or another, their combination with code has been challenging. As a result, models are usually thrown away once the implementation has progressed. The reason is partly in the modeling languages used and partly in the modeling tools. We describe proven practices for success in integrating models and code.

Don’t model the code

One clear reason for the trouble people have in aligning models and code is the modeling languages. UML is a clear example of this: class diagrams, the most used diagram type of UML, are used to specify exactly the same things that we have in the code too. Maintaining the same thing in two places, the code and the model, will obviously lead to trouble. Even worse, keeping class diagrams and code in sync is made harder because their mapping is not one-to-one. For example, we don’t have associations in a programming language or a switch-case structure in class diagrams - or in any other UML diagram. As a result, models become outdated or are simply thrown away at some point during the implementation. Code is king.
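To make the mismatch concrete, here is a minimal sketch of what a single bidirectional association line in a class diagram can turn into when coded by hand. The classes Customer and Order are hypothetical and not taken from any figure in this article. The point is only that the language offers no association construct, so the code has to keep two separate references consistent itself.

```python
class Customer:
    def __init__(self, name):
        self.name = name
        self.orders = []          # one end of the association

    def add_order(self, order):
        # Both ends must be updated by hand; nothing in the language
        # enforces that they stay consistent.
        if order not in self.orders:
            self.orders.append(order)
            order.customer = self


class Order:
    def __init__(self, number):
        self.number = number
        self.customer = None      # the other end of the same association


alice = Customer("Alice")
order = Order(1)
alice.add_order(order)
```

This is only one of several possible mappings: the same diagram line could equally be coded as a one-way reference or a lookup table, so the diagram cannot be reliably recovered from the code.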

Model the problem

Domain-specific languages provide a solution. They raise the level of abstraction beyond code, yet can generate full code, removing the need to specify the same thing in both the models and the code. This is nothing new: we saw something similar when we moved from assembler to third-generation languages like C or Java. We did not seek to express the same things in both assembler and C, or to apply roundtrip engineering between the two languages. Today any attempt to do new development in that way would be regarded as comical. The reason compilers were successful was that they automated the mapping from a higher level specification to a lower level one, not vice versa. Domain-specific languages and generators offer the same benefits.

Sample domain-specific languages

So what do domain-specific models look like? Well, it depends on the domain. If we are designing interlocking systems for railways the models could look like this:

Figure 1. DSL for train interlocking

This language uses concepts that are familiar to all companies making railway systems: tracks, switches, signals etc. While we don’t see any programming concepts in the model, these models are still formal. More importantly they allow us to generate the interlocking code directly from the model. If the interlocking code has to change it is because the tracks, switches etc. have changed, so rather than working on the code level we can work directly on this higher level. With this DSL, the model is king.

Often a single DSL is not enough. Applications are large, they have various aspects, different developers have different views, and so no single modeling language can specify it all. In such cases we can use multiple, yet integrated, languages. For example, if we are building heating systems we can have one DSL to describe structure and another for behavior. The figure below shows these two languages. The diagram at the back shows the structure of the system: pumps, valves, sensors, and other instruments connected with various kinds of pipes.

Figure 2. DSLs for heating application: structure and behavior

The diagram at the front shows a model describing the behavior of part of the system, a pump controller. The behavior is described as a state machine enriched with the concepts of this problem domain: the data from sensors and other instruments are used as conditions and triggers for the transition. The language also includes actions for controlling the instruments, like turning the pumps on and off.

These two languages are integrated, allowing the elements of the system (like pumps and sensors) to be specified from different angles. Again, the models above do not use programming concepts but aim to raise the level of abstraction closer to the problem being solved. Yet they are not vague, hand-waving pictures, but formal and precise so that we can generate code from them.
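The idea of two integrated, formal models can be sketched in a few lines. The following is only an illustration under our own assumptions: the dictionary format, the element names pump1 and temp1, and the temperature limits are all hypothetical and do not reproduce the languages in Figure 2. The behavior model refers to elements of the structure model by name, a check confirms that those references resolve, and a small generator turns the behavior model into executable code (Python here, purely to keep the sketch runnable).

```python
# Structure model: instruments of the heating system (hypothetical names).
structure = {
    "pump1": {"type": "pump"},
    "temp1": {"type": "sensor", "unit": "C"},
}

# Behavior model: a state machine for the pump controller. Conditions and
# actions refer to elements of the structure model by name.
behavior = {
    "initial": "Idle",
    "transitions": [
        {"from": "Idle", "to": "Heating",
         "when": ("temp1", "<", 18), "do": ("pump1", "on")},
        {"from": "Heating", "to": "Idle",
         "when": ("temp1", ">", 22), "do": ("pump1", "off")},
    ],
}


def check(structure, behavior):
    """Return names used in the behavior model but missing from the structure."""
    missing = set()
    for t in behavior["transitions"]:
        for name in (t["when"][0], t["do"][0]):
            if name not in structure:
                missing.add(name)
    return sorted(missing)


def generate(behavior):
    """Emit Python source for a step function from the behavior model."""
    lines = ["def step(state, readings, commands):"]
    for t in behavior["transitions"]:
        sensor, op, limit = t["when"]
        device, action = t["do"]
        lines.append(f"    if state == {t['from']!r} and readings[{sensor!r}] {op} {limit}:")
        lines.append(f"        commands.append(({device!r}, {action!r}))")
        lines.append(f"        return {t['to']!r}")
    lines.append("    return state")
    return "\n".join(lines)


source = generate(behavior)
namespace = {}
exec(source, namespace)          # load the generated controller
step = namespace["step"]

commands = []
state = step(behavior["initial"], {"temp1": 15}, commands)
```

Because the models are precise, a change such as a new temperature limit is made once in the model and the controller code is regenerated; nobody edits the generated step function by hand.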

Integration among the tools

Typically, not all aspects of the application can be described in models alone. There are third-party libraries, frameworks and legacy code, which may not need to be edited at all. Similarly, there can be unique business code that is not easy to describe in models - or for which it does not make sense to create a DSL yet. This code will be edited in a normal IDE, which may also be used for building and debugging. All this calls for integration between modeling tools and IDEs.

In the past, the difficulties in integrating models and code lay in the tools. They were closed, in particular on the modeling tool side, making it difficult to access the models and generators from other tools. Recently, tools have evolved from closed environments to open ones that provide various integration mechanisms among the different platforms: they have programmable interfaces, import/export formats, and enable plug-ins for integration. In particular, the most powerful modeling tools are flexible, enabling engineers to easily modify both the modeling languages and their related generators.

Integration example

Let’s look at tool integration with an example of Android application development. We will use Eclipse as the coding IDE and MetaEdit+ as the modeling tool, but the principles are the same for any tools.

With Eclipse and its Android Development plug-in we can access the libraries and frameworks, write unique business logic, and manage the build process. To link the IDE with the modeling tool, MetaEdit+ provides an Eclipse plug-in, Graph Browser (bottom left view in Eclipse), which shows the various models from MetaEdit+. With the Graph Browser, the developer can inspect models and their hierarchies, and create and open models directly from the IDE. The modeling tool then provides editors to work with the models in their domain-specific modeling languages. Since the level of abstraction is raised the models do not deal with implementation details, such as if statements, Java classes, or the Android framework.

Figure 3. Eclipse with a plug-in for modeling tool integration.

The figure below shows such a high-level model opened from the IDE. In this example, the domain is digital watches and the modeling language uses the concepts of alarm, icon, button, time etc. Any new application functionality will be designed by using these domain-specific concepts – with the implementation details hidden. The high-level models also offer better support for analyzing and understanding the problem domain, communicating with customers and other team members, producing testing data, documentation etc.

Figure 4. Accessing models from IDE

Generating code and project data, and integrating with the existing code

Most importantly, the models are not used only for analysis and design but also to generate the application code. Since the code generator is defined together with the modeling language, it knows how to extract the information in the models and produce the code, linking it with the framework and other existing code. While the code generators are defined in the modeling tool, they can be called directly from the Eclipse plug-in. The plug-in automatically imports the generated code into the Eclipse project, and builds and runs it in the Android emulator (see Figure 5).

Figure 5. The generated code is run in the Android emulator

Since this workflow is completely automated the path from high-level models to a running application is seamless for the application developer. Models are now the source of the application and its logic, and the generators automatically produce Java code that expresses the same things at a lower level. Eclipse links the generated code with fixed framework code for time calculations etc., and the Android framework for the UI widgets and operating system.
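As a rough illustration of what such a generator does, the sketch below walks a small model and writes out Java source text that plugs into hand-written framework code. It is not the generator language of any particular modeling tool. The model format, the base class AbstractWatchApplication and the method addTransition are hypothetical names standing in for the fixed framework code mentioned above.

```python
# Hypothetical model of one watch application: states and button transitions.
model = {
    "name": "Stopwatch",
    "states": ["Stopped", "Running"],
    "transitions": [
        {"from": "Stopped", "button": "Mode", "to": "Running"},
        {"from": "Running", "button": "Mode", "to": "Stopped"},
    ],
}


def generate_java(model):
    """Return Java source text for the model.

    AbstractWatchApplication and addTransition are assumed names standing in
    for the hand-written framework code that the generated class links to.
    """
    out = [f"public class {model['name']} extends AbstractWatchApplication {{"]
    for state in model["states"]:
        out.append(f'    static final String {state.upper()} = "{state}";')
    out.append("")
    out.append(f"    public {model['name']}() {{")
    for t in model["transitions"]:
        out.append(
            f'        addTransition({t["from"].upper()}, "{t["button"]}", {t["to"].upper()});'
        )
    out.append("    }")
    out.append("}")
    return "\n".join(out)


java_source = generate_java(model)
print(java_source)
```

The generated class contains only what varies between applications. Everything that stays the same, such as time calculations or widget handling, remains in the framework and is written once by hand.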

Debugging with models

Using both models and code also raises the question of debugging. While we can always use the IDE’s debugging functionality, it usually does not make sense to debug the generated code itself. The developer is more familiar with the model than with the code, and if a problem is found we want to correct it in the model, not in the code.

Generated code does not have typos, off-by-one errors, or the other kinds of errors that we are used to finding in manually written code. Instead we can use models to debug the functionality at a higher level: is the application doing what it was designed to do? Consider the example below: as the application is executed in a debug session, the model is animated by highlighting elements in models that are active in the code. In the screenshot below the ‘Running’ state has just been entered, and so it is highlighted with a thick red border.

Figure 6. Debugging at model level

While breakpoints and other debugging constraints can be set in the IDE at the code level, it is also possible to add such functionality to the DSL. For example, the modeler could choose an element like ‘Running’ in the model and set it as a breakpoint, and when execution reached that element the generated code would stop running and raise a breakpoint interrupt signal.
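One way this could work is for the generator to add a small hook call wherever the generated code enters a model element. The sketch below assumes exactly that; the names ModelDebugger, enter and BreakpointHit are ours, and raising an exception stands in for whatever interrupt mechanism a real debugger would use. The same hook records the active element, which is the information needed to highlight it in the model.

```python
class BreakpointHit(Exception):
    """Raised when execution reaches a model element marked as a breakpoint."""


class ModelDebugger:
    def __init__(self):
        self.breakpoints = set()   # names of model elements, e.g. states
        self.active = None         # element the animation would highlight

    def enter(self, element):
        # Generated code calls this hook each time it enters a model element.
        self.active = element
        if element in self.breakpoints:
            raise BreakpointHit(element)


def run(debugger, events):
    """Stand-in for generated state machine code with debug hooks."""
    transitions = {("Stopped", "start"): "Running",
                   ("Running", "stop"): "Stopped"}
    state = "Stopped"
    debugger.enter(state)
    for event in events:
        new_state = transitions.get((state, event), state)
        if new_state != state:
            state = new_state
            debugger.enter(state)
    return state


debugger = ModelDebugger()
debugger.breakpoints.add("Running")
try:
    run(debugger, ["start", "stop"])
    stopped_at = None
except BreakpointHit as hit:
    stopped_at = str(hit)
```

Since the hooks are produced by the generator, they can be left out entirely when generating release code.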

Use cases for integration

The integration of models and code offers ideal support for team work. Using the Android application development as an example, there can be different, yet integrated, modeling languages for different tasks, e.g. for interaction specialists and application developers. Interaction specialists would use a ‘navigation DSL’ to design user interface logic and page navigation. Generators would produce prototypes from these models for concept testing, or list all the user interface text in the format needed by the localization team.
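The localization list is a good example of a generator whose output is not program code at all. As a sketch, assume a navigation model stored as a dictionary of pages with titles and buttons. The page names, the key scheme and the CSV output format are all hypothetical; a real localization team would dictate its own format.

```python
import csv
import io

# Hypothetical navigation model: pages, their visible texts, and links.
navigation = {
    "Home": {"title": "Welcome", "buttons": {"Settings": "Settings", "Alarms": "Alarms"}},
    "Settings": {"title": "Settings", "buttons": {"Back": "Home"}},
    "Alarms": {"title": "Alarms", "buttons": {"Back": "Home"}},
}


def localization_rows(navigation):
    """Collect every user-visible text with a key showing where it is used."""
    rows = []
    for page, spec in navigation.items():
        rows.append((f"{page}.title", spec["title"]))
        for label in spec["buttons"]:
            rows.append((f"{page}.button.{label}", label))
    return rows


def to_csv(rows):
    """Format the collected rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["key", "text"])
    writer.writerows(rows)
    return buffer.getvalue()


rows = localization_rows(navigation)
report = to_csv(rows)
```

The same model could feed a second generator that produces a clickable prototype, so the prototype and the text list cannot drift apart.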

Application developers would use other, richer languages to add behavior details to the interaction designs. Since the DSLs are integrated, application developers can utilize the UI definitions and extend them if needed. The final code is then generated from both kinds of DSLs into the same Eclipse project.

Because the code is taken care of by the generators, interaction specialists can use the familiar domain concepts in the modeling tool, rather than having to use an IDE. If the modeling tool uses a repository rather than simple files, it can also provide support for simultaneous team work: multiple interaction specialists and developers can work with the same models concurrently, supporting tight collaboration and a fast feedback loop.

Final remarks

Tools that are open for integration and enable support for domain-specific languages offer an ideal way to integrate models and code. Modelers can work directly with domain concepts in the modeling tools, and use the IDE for the automated build process and to work on the parts of the system that are still hand-coded. Since the modeling tool and IDE are intelligently integrated, the overall development process becomes easier and faster – in stark contrast to old round-trip engineering attempts, which made the process more complex and less reliable.

This integration approach has low coupling between the modeling tool and the IDE, and high cohesion of the modeling tool functionality. Other integration approaches, which rely on the modeling tool being implemented within the IDE as several different plug-ins, suffer from problems of incompatible version requirements between the plug-ins and the IDE. The low coupling and high cohesion also mean that the integration is not limited to Eclipse: MetaCase have implemented the same functionality in a Visual Studio extension. The plug-ins are open source and come with guides for using and modifying them for your own integration needs; customers have already created their own plug-ins for other IDEs.

Rather than the outdated “code is king”, or the lumbering CASE tool “model is king”, integration like this means “the developer is king”: build each part of your application in the technology, language and tools best suited to that part, with lightweight integration bringing all this together into a cohesive whole.

About the Author

Juha-Pekka Tolvanen, PhD, is the CEO of MetaCase. He has been involved in domain-specific languages, model-driven approaches and related tools since 1991. He has acted as a consultant worldwide on modeling language development, authored a book (Domain-Specific Modeling, Wiley 2008) and written over 70 articles in journals and conferences.

In my mind, UML models are a DSL for writing code. Models and code, in all forms, are a representation of the same conceptual paradigm entity, shown from different viewpoints. Personally, I think modern DSLs are an excellent addition and long overdue, but the value of modelling the statics of a system (the entities, their associations) and the behaviour (how they collaborate to get a job done) is definitely nothing new. It formed the base of the GoF patterns series, which introduced the logical separation between the behaviour of a system and its structural elements, and of course, for those of us with somewhat mathematical backgrounds, the idea of statics and dynamics is very old hat indeed.

The idea of an 'association' not existing in code is only correct at a semantic level (which is still very important, as explained later) as obviously, any use of a class by another class not falling into a generalisation (inheritance), a composition (contains at class level) or an aggregation (shared object reference) constitutes an association (such as local variable declarations, which I appreciate are 'contained' within methods but don't compose the class per se). It is the meaning and the translation of that meaning that we keep getting wrong in the IT industry. The mapping of one model type directly to another model type is, as you say, ludicrous. A UML model, for example, needs to go through a higher level meta-model (mapping an association to some form of interaction) before being translated into code, which is where the comparison with a compiler really comes in. This is because a compiler translates to a 'meta-model' (in this case, maybe a syntax tree, or validates through some other representation such as BNF or whatever) and this is what is translated into code. The same is true of Java (into bytecode) and .NET (into MSIL and even CodeDOM). You see the same idea in different guises in enterprise integration (using the normaliser pattern) as well as a number of other fields within IT and outside.

It has been the holy grail of MDA for a long time but I think the approach has traditionally been flawed and hasn't really appealed to those whose king is not the model. In any case, to be mildly facetious, whether the king is code or model, there is only one king. We need to stop worshipping the various kings that we do and realise that the one king (the concept of the domain) is everyone's common king. Whether we show them through code or model is immaterial to the concept itself, it just helps us realise that concept.

Thanks for the feedback. Rather than having integration alone, a key is to decide what to model and what to code as it does not make sense to visualize in the models what is already presented in code. A hands-on session recording at www.infoq.com/presentations/MDD-MetaEdit shows how to define the modeling languages and part 2 at www.infoq.com/presentations/MDD-MetaEdit-2 shows how to raise the level of abstraction: how to do less work to get the same code/functionality.