Template-Based Code Generation with Apache Velocity, Part 1

Most real-world code generators, both commercial and open source, use templates and template engines. In this article I'll discuss template-based code generation, explain the basic concepts behind templates and transformations, and demonstrate the benefits they bring to code generation. Then I'll implement a simple code generator in Java that uses Velocity, an open source template engine provided by Apache. The generator takes an XML representation of classes and data members and generates the Java code that defines them. The generation process is driven by a template that encapsulates the syntax of the target programming language. You will see how to change the template to generate different types of source code.

Templates and Transformations

Template-based transformations are ubiquitous in the software development world. A large number of tools use templates to transform data from one format to another. For example, XSLT is the standard way to transform XML documents by means of templates written in the eXtensible Stylesheet Language. Figure 1 shows the four components involved in a template-based transformation process. They are:

Data Model: Contains the data to be transformed, organized in a specific structure.

Template: Formats the data model into the output code. It contains references to entities belonging to the data model.

Template Engine: The application that performs the transformation. It takes the data model and the template as input, and produces output by replacing the template's internal references with real data coming from the model.

Target: The result of the transformation process.

Figure 1. Template-based transformation

Let's consider the following simple example of a data model:

#person.txt
$name=John
$surname=Smith

This is a text file containing a first name and a surname: "John Smith." We want to transform those data according to the following template:

#person.template
Hello, my name is $name and my surname is $surname

The template is also a text file; it contains two references, $name and $surname, to data present in the model.

If the template engine is an application called transform, then you may execute it, passing the data model and the template described above:

> transform person.txt person.template

The result will be:

Hello, my name is John and my surname is Smith

The template engine replaced the labels $name and $surname with the real data coming from the model: "John Smith."

Note: The most important aspect of a template-based transformation is that you can change the final representation without touching the application that performs the transformation. The only thing you have to modify is the template. For proof, let's consider the following template:
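As a minimal sketch of this point, the Java code below imitates the hypothetical transform application with plain string substitution. The class name TransformSketch, its two methods, and the second template ("$surname, $name") are assumptions made for illustration; they are not part of the original example. The same model and the same engine code produce a different result as soon as the template changes.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in for the "transform" application described above.
class TransformSketch {

    // A label is "$" followed by letters, digits, or underscores.
    private static final Pattern LABEL = Pattern.compile("\\$(\\w+)");

    // Parses lines of the form "$name=John" into a label-to-value map.
    // Lines that do not start with "$" (such as "#person.txt") are skipped.
    static Map<String, String> parseModel(String modelText) {
        Map<String, String> model = new LinkedHashMap<>();
        for (String line : modelText.split("\\R")) {
            line = line.trim();
            int eq = line.indexOf('=');
            if (line.startsWith("$") && eq > 1) {
                model.put(line.substring(1, eq), line.substring(eq + 1));
            }
        }
        return model;
    }

    // Replaces every known label in the template; unknown labels stay as they are.
    static String transform(Map<String, String> model, String template) {
        Matcher m = LABEL.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = model.getOrDefault(m.group(1), m.group());
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> person = parseModel("#person.txt\n$name=John\n$surname=Smith");

        // The template from person.template.
        System.out.println(transform(person, "Hello, my name is $name and my surname is $surname"));

        // An assumed alternative template: only the template changes.
        System.out.println(transform(person, "$surname, $name"));
    }
}
```

Only the template string differs between the two calls in main; the model and the transform method are untouched.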

Template-based Code Generation

The code generation process is itself a transformation process. The data model contains information about the system entities you want to generate, and the template represents the syntax of the target programming language.

Another small example will show the language-portability benefits of code generation. Here is a data model:

#student.txt
$name=Student
$base=Person

It represents the name of a class (Student) and its base class (Person). In order to generate Java code to declare the Student class, you can use this template:

#javaclass.template
public class $name extends $base {
}

Given that data model and template as input, the transform application returns:

public class Student extends Person {
}

which is the definition of the Student class. Thus, we generated code starting from a data model and a template.

To build an interface, we alter the template slightly:

#javainterface.template
public interface $name extends $base {
}

That template drives the engine to produce:

public interface Student extends Person {
}
}

The engine now generates Student as an interface rather than a class. The good news is that we didn't touch the data model or template engine.

And to show the language portability aspects, we will change the template to build C++:

#cpp.template
class $name : public $base
{
};

The result will be:

class Student : public Person
{
};

This is the definition of the Student class in C++. Without altering the data model or the template engine, we generated code in another language! That's a very important aspect -- a well-designed, template-based code generator can generate code for different languages by using different types of templates. This is also why a well-designed object model contains no language-specific details.

Apache Velocity

In the previous examples, I used a trivial hypothetical template engine, the transform application, which replaces the $ labels of the template with data coming from the model. Now, it is time to use a real template engine in order to generate code.

The Apache Velocity template engine is a Jakarta open source tool. It works with templates written in VTL (Velocity Template Language), a very simple template-style language. Velocity was developed mainly for use with Java; in fact, it uses ordinary Java classes, associated with a specific context, as the data model. The transformation process takes a template and a context as input, and produces output in the format specified by the template. During the transformation, labels are replaced by data coming from the context.

I have written and tested the sample code with J2SE 1.4.2 and Apache Velocity 1.4. In order to compile and execute it, the CLASSPATH environment variable must include references to velocity-1.4-rc1.jar and velocity-dep-1.4-rc1.jar.

Velocity in Action

The next example is a simple code generator using Velocity, which builds Java classes with data members and the related accessor methods (getXxx()/setXxx()). The data model will be an XML document containing the names of the classes and their attributes. Here is the input model:

The XML structure is very straightforward -- each <Class> element describes a class and each <Attribute> element describes a data member. The example defines the Customer class (with code and description) and the Order class (with number, date and customer). The code generator will read that document in order to create the Customer and Order Java classes.

Next we implement the generator. The first step is to design an internal structure to host the XML data. We need two classes, called descriptors, representing <Class> and <Attribute>. The descriptor related to the <Class> element is the following:

It defines the ClassDescriptor class, which can host the data described in the <Class> element. The name attribute represents the name of the class, while attributes is an ArrayList containing AttributeDescriptor objects. That class, which is the descriptor for the <Attribute> element, is implemented as follows:

The duty of the ClassDescriptorImporter class, which extends the SAX default handler, is to transfer data from the XML to the descriptors. As you can see, each time a <Class> element is processed, the class creates a new instance of ClassDescriptor and inserts it into the classes ArrayList. When the parser processes an <Attribute> element, the class creates a new instance of AttributeDescriptor and adds it to the parent ClassDescriptor. At the end of the parsing process, classes will contain the descriptors for all of the classes found within the XML document. The getClasses method returns that ArrayList.
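The descriptors and the importer described above could look roughly like the following. This is a minimal sketch under stated assumptions, not the article's listing: the root element <Content>, the XML attribute names name and type, the member types in the sample document, and the wrapper class ImporterSketch are hypothetical. Only the element names <Class> and <Attribute>, the descriptor class names, and the getClasses method come from the text.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class ImporterSketch {

    // Descriptor for an <Attribute> element: a data member's name and type.
    public static class AttributeDescriptor {
        private final String name;
        private final String type;
        public AttributeDescriptor(String name, String type) { this.name = name; this.type = type; }
        public String getName() { return name; }
        public String getType() { return type; }
    }

    // Descriptor for a <Class> element: a class name plus its attributes.
    public static class ClassDescriptor {
        private final String name;
        private final List<AttributeDescriptor> attributes = new ArrayList<>();
        public ClassDescriptor(String name) { this.name = name; }
        public String getName() { return name; }
        public List<AttributeDescriptor> getAttributes() { return attributes; }
    }

    // SAX handler that copies the XML data into descriptors.
    public static class ClassDescriptorImporter extends DefaultHandler {
        private final List<ClassDescriptor> classes = new ArrayList<>();

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            if ("Class".equals(qName)) {
                classes.add(new ClassDescriptor(atts.getValue("name")));
            } else if ("Attribute".equals(qName) && !classes.isEmpty()) {
                // The parent is the most recently opened <Class> element.
                ClassDescriptor parent = classes.get(classes.size() - 1);
                parent.getAttributes().add(
                        new AttributeDescriptor(atts.getValue("name"), atts.getValue("type")));
            }
        }

        public List<ClassDescriptor> getClasses() { return classes; }
    }

    // Assumed input document; root element, attribute names, and types are illustrative.
    static final String SAMPLE_XML =
            "<Content>"
          + "<Class name=\"Customer\">"
          + "<Attribute name=\"code\" type=\"int\"/>"
          + "<Attribute name=\"description\" type=\"String\"/>"
          + "</Class>"
          + "<Class name=\"Order\">"
          + "<Attribute name=\"number\" type=\"int\"/>"
          + "<Attribute name=\"date\" type=\"Date\"/>"
          + "<Attribute name=\"customer\" type=\"Customer\"/>"
          + "</Class>"
          + "</Content>";

    // Parses the XML text and returns one descriptor per <Class> element.
    static List<ClassDescriptor> importClasses(String xml) {
        try {
            ClassDescriptorImporter importer = new ClassDescriptorImporter();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), importer);
            return importer.getClasses();
        } catch (Exception e) {
            throw new IllegalStateException("Could not import class descriptors", e);
        }
    }

    public static void main(String[] args) {
        for (ClassDescriptor cd : importClasses(SAMPLE_XML)) {
            System.out.println(cd.getName() + " has " + cd.getAttributes().size() + " attribute(s)");
        }
    }
}
```

The sketch relies on <Attribute> elements always appearing inside a <Class> element, so the last descriptor in the list is the current parent. A document with nested classes would need an explicit stack instead.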

At this point, Velocity enters into the picture. The VTL template to generate a Java class with data members and getXxx/setXxx methods is the following:

The $class label is a reference to a ClassDescriptor instance; therefore, $class.Name is the result of the ClassDescriptor.getName invocation (actually, $class.Name is a shortcut for $class.getName() -- in general, a similar shortcut can be applied to all of the methods starting with get). The #foreach statement executes a loop over all of the elements contained in $class.Attributes, which represents all of the AttributeDescriptor instances associated with the current ClassDescriptor. The statements inside of the loop define the data members and the getXxx/setXxx methods according to the values of $att.Name and $att.Type.

The $utility.firstToUpperCase label invokes a user-defined method that returns its input string with the first character converted to upper case. That method is useful, for example, to obtain getNumber from the number data member.
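A helper of this kind takes only a few lines. The following is a minimal sketch, assuming a class named GeneratorUtilitySketch (a hypothetical name) and an instance method, so that an object of the class can be handed to the template engine; the handling of null and empty strings is also an assumption.

```java
// Hypothetical helper exposing the method the template calls through $utility.
class GeneratorUtilitySketch {

    // Returns the input with its first character in upper case.
    // Null and empty strings are returned unchanged (an assumed policy).
    public String firstToUpperCase(String text) {
        if (text == null || text.isEmpty()) {
            return text;
        }
        return Character.toUpperCase(text.charAt(0)) + text.substring(1);
    }

    public static void main(String[] args) {
        GeneratorUtilitySketch utility = new GeneratorUtilitySketch();
        // Builds an accessor name from a data member name.
        System.out.println("get" + utility.firstToUpperCase("number")); // prints "getNumber"
    }
}
```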

The Code Generator

What is left to implement is the main application. It reads the XML by means of the ClassDescriptorImporter, associates the resulting descriptors with the template, and calls Velocity to perform the transformation.

You can find the complete code of the generator (the ClassGenerator class) among the resources for this article. The most important method of that class is start(). Here is the implementation:

The method takes the XML and template filenames as input. It imports the data by using the xmlReader object, which has previously been associated with the cdImporter object, a ClassDescriptorImporter instance. As you can see, the getClasses method returns the descriptors of all of the classes that have to be generated (the classes ArrayList). The context object created inside of the loop is particularly important because it represents the connection between the descriptor instances and the template. In fact, the Context.put method associates a Java object with a template label. After executing these statements:

context.put("class", cl);
context.put("utility", utility);

The cl object (the current class descriptor) can be referred to by the $class label of the template, and the utility object by $utility. That last object is an instance of GeneratorUtility, containing the firstToUpperCase() method.

After creating the context and invoking put, the start method creates a Template object from the input template file and calls its merge method. That method performs the template-driven transformation, taking the data from the context object, and writes the output to the stream referenced by writer.
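The context-and-merge step can be sketched in a self-contained form as follows. Several things here are assumptions rather than the article's code: the wrapper class VelocitySketch, the simplified descriptors, the in-memory VTL template (getters only), and the use of VelocityEngine.evaluate on a template string, which is swapped in for the Template object and merge call so that no template file is needed. The descriptor and utility classes are declared public so that the engine's reflection-based lookup can reach their methods. The Velocity jars must be on the classpath.

```java
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import org.apache.velocity.VelocityContext;
import org.apache.velocity.app.VelocityEngine;

class VelocitySketch {

    public static class AttributeDescriptor {
        private final String name;
        private final String type;
        public AttributeDescriptor(String name, String type) { this.name = name; this.type = type; }
        public String getName() { return name; }
        public String getType() { return type; }
    }

    public static class ClassDescriptor {
        private final String name;
        private final List<AttributeDescriptor> attributes = new ArrayList<>();
        public ClassDescriptor(String name) { this.name = name; }
        public String getName() { return name; }
        public List<AttributeDescriptor> getAttributes() { return attributes; }
    }

    public static class GeneratorUtility {
        public String firstToUpperCase(String text) {
            return text.isEmpty() ? text : Character.toUpperCase(text.charAt(0)) + text.substring(1);
        }
    }

    // Assumed VTL template: one field and one getter per attribute.
    static final String TEMPLATE =
            "public class $class.Name {\n"
          + "#foreach($att in $class.Attributes)\n"
          + "    private $att.Type $att.Name;\n"
          + "#end\n"
          + "#foreach($att in $class.Attributes)\n"
          + "    public $att.Type get$utility.firstToUpperCase($att.Name)() { return $att.Name; }\n"
          + "#end\n"
          + "}\n";

    // Binds the descriptor and the utility to template labels, then renders the template.
    static String generate(ClassDescriptor cl) {
        try {
            VelocityEngine engine = new VelocityEngine();
            engine.init();

            VelocityContext context = new VelocityContext();
            context.put("class", cl);                        // reachable as $class
            context.put("utility", new GeneratorUtility());  // reachable as $utility

            StringWriter writer = new StringWriter();
            // "class-sketch" is only a tag used in log messages.
            engine.evaluate(context, writer, "class-sketch", TEMPLATE);
            return writer.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Template evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        ClassDescriptor customer = new ClassDescriptor("Customer");
        customer.getAttributes().add(new AttributeDescriptor("code", "int"));
        customer.getAttributes().add(new AttributeDescriptor("description", "String"));
        System.out.println(generate(customer));
    }
}
```

A file-based generator would instead load the template by name and call merge with the same context and a file writer, as the start method described above does.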

By launching the code generator with order.xml (the data model) and class.vm (the template) as input, you obtain Customer.java and Order.java, implemented as follows:

Conclusion

After developing the template-based code generator, you can use it in two ways:

To alter the data model to generate other classes.

To replace the template to produce different language syntax, or even code for a different programming language.

Neither of those operations requires any change to the generator itself; therefore, a template-based code generator is more flexible than one that embeds the target language syntax.

In part two of this article we'll look at a more complex scenario for template-based code generation. In particular, I'll show how to use templates along with the Internal Model Object generator discussed in [7] below, and a design pattern to decouple that language-independent internal model from a language-dependent Velocity context.

Giuseppe Naccarato
has a degree in computer science and works as a software developer for an IT company based in Glasgow (UK). His main interests are J2EE- and .Net-related technologies. Contact Giuseppe at http://www.giuseppe-naccarato.com.