Keynote

Recent years have seen a surge of interest in staging and generative programming, driven by the increasing difficulty of making high-level code run fast on modern hardware. While the mechanics of program generation are relatively well understood, we have only begun to understand how to develop systems in a generative way. The Lightweight Modular Staging (LMS) platform forms the core of a research agenda to make generative programming more widely accessible, through powerful libraries and a growing selection of case studies that illuminate design patterns and crystallize best practices for high-level and effective generative programming. This talk will reflect on the foundations of LMS, on applications, achievements, challenges, as well as ongoing and future work.

Generative Programming I

Mainstream programming languages like Java have limited support for language extensibility. Without mechanisms for
syntactic abstraction, new programming styles can only be embedded in the form of libraries, limiting expressiveness.
In this paper, we present Recaf, a lightweight tool for creating Java dialects; effectively extending Java with new language constructs and user defined semantics. The Recaf compiler generically transforms designated method bodies to code that is parameterized by a semantic factory (Object Algebra), defined in plain Java. The implementation of such a factory defines the desired runtime semantics. We applied our design to produce several examples from
a diverse set of programming styles and two case studies: we define i) extensions for generators, asynchronous computations and asynchronous streams and ii) a Domain-Specific Language (DSL) for Parsing Expression Grammars (PEGs), in a few lines of code.

This paper presents an OO style without classes, which we call interface-based object-oriented programming (IB). IB is a natural extension of closely related ideas such as traits. Abstract state operations provide a new way to deal with state, which allows for flexibility not available in class-based languages. In IB state can be type-refined in subtypes. The combination of a purely IB style and type-refinement enables powerful idioms using multiple inheritance and state. To introduce IB to programmers we created Classless Java: an embedding of IB directly into Java. Classless Java uses annotation processing for code generation and relies on new features of Java 8 for interfaces. The code generation techniques used in Classless Java have interesting properties, including guarantees that the generated code is type-safe and good integration with IDEs. Usefulness of IB and Classless Java is shown with examples and case studies.

Many model-driven development (MDD) tools employ specialized frameworks and modeling languages, and assume that the semantics of models is provided by some form of code generation. As a result, programming against models is cumbersome and does not integrate well with ordinary programming languages and IDEs. In this paper we present MD4J, a modeling approach for embedding metamodels directly in Java, using plain interfaces and annotations. The semantics is provided by data managers that create and manipulate models. This architecture enables two kinds of extensibility. First, the data managers can be changed or extended to obtain different base semantics of a model. This allows a kind of aspect-oriented programming. Second, the metamodels themselves can be extended with additional fields and methods to modularly enrich a modeling language. We illustrate our approach using the example of state machines, discuss the implementation, and evaluate it with two case-studies: the execution of UML activity diagrams and an aspect-oriented refactoring of JHotDraw.

Generative Programming II

Nowadays, many virtual execution environments benefit from concurrency offered by the actor model. Unfortunately, while actors are used in many applications, existing profiling tools are not much effective in analyzing the performance of applications using actors. In this paper, we present a new instrumentation-based technique to profile actors in virtual execution environments. Our technique adopts platform-independent profiling metrics that minimize the perturbations induced by the instrumentation logic and allow comparing profiling results across different platforms. In particular, our technique measures the initialization cost, the amount of executed computations, and the messages sent and received by each actor. We implement our technique within a profiling tool for Akka actors on the Java platform. Evaluation results show that our profiling technique helps performance analysis of actor utilization and communication between actors in large-scale computing frameworks.

It is common practice to bootstrap compilers of programming languages. By using the compiled language to implement the compiler, compiler developers can code in their own high-level language and gain a large-scale test case. In this paper, we investigate bootstrapping of compiler-compilers as they occur in language workbenches. Language workbenches support the development of compilers through the application of multiple collaborating domain-specific meta-languages for defining a language's syntax, analysis, code generation, and editor support. We analyze the bootstrapping problem of language workbenches in detail, propose a method for sound bootstrapping based on fixpoint compilation, and show how to conduct breaking meta-language changes in a bootstrapped language workbench. We have applied sound bootstrapping to the Spoofax language workbench and report on our experience.

In today’s web applications asynchronous requests to remote services using callbacks or futures are omnipresent. The continuation of such a non-blocking task is represented as a callback function that will later be called with the result of the request. This style of programming where the remainder of a computation is captured in a continuation function is called continuation-passing style (CPS). This style of programming can quickly lead to a phenomenon called “call- back hell”, which has a negative impact on the maintain- ability of applications that employ this style. Several alter- natives to callbacks are therefore gaining traction within the web domain. For example, there are a number of frameworks that rely on automatically transforming sequential style code into the continuation-passing style. However, these frame- works often employ a conservative approach in which each function call is transformed into CPS. This conservative approach can sequentialise requests that could otherwise be run in parallel. So-called delimited continuations can remedy, but require special marks that have to be manually inserted in the code for marking the beginning and end of the continuation. In this paper we propose an alternative strategy in which we apply a delimited CPS transformation that operates on a Program Dependence Graph instead to find the limits of each continuation.We implement this strategy in JavaScript and demonstrate its applicability to various web programming scenarios.

Code Generation and Synthesis

We present a method for synthesizing regular expressions for introductory automata assignments. Given a set of positive and negative examples, the method automatically synthesizes the simplest possible regular expression that accepts all the positive examples while rejecting all the negative examples. The key novelty is the search-based synthesis algorithm that leverages ideas from over- and under-approximations to effectively prune out a large search space. We have implemented our technique in a tool and evaluated it with non-trivial benchmark problems that students often struggle with. The results show that our system can synthesize desired regular expressions in 6.7
seconds on the average, so that it can be interactively used by students to enhance their understanding of regular expressions.

This paper introduces typy, a statically typed programming language embedded by reflection into Python. typy features a fragmentary semantics, i.e. it delegates semantic control over each term, drawn from Python's fixed concrete and abstract syntax, to some contextually relevant user-defined semantic fragment. The delegated fragment programmatically 1) typechecks the term (following a bidirectional protocol); and 2) assigns dynamic meaning to the term by computing a translation to Python.
We argue that this design is expressive with examples of fragments that express the static and dynamic semantics of 1) functional records; 2) labeled sums (with nested pattern matching a la ML); 3) a variation on JavaScript's prototypal object system; and 4) typed foreign interfaces to Python and OpenCL. These semantic structures are, or would need to be, defined primitively in conventionally structured languages.
We further argue that this design is compositionally well-behaved. It avoids the expression problem and the problems of grammar composition because the syntax is fixed. Moreover, programs are semantically stable under fragment composition (i.e. defining a new fragment will not change the meaning of existing program components.)

Concern-Oriented Reuse (CORE) proposes a new way of structuring model-driven software development, where models of the system are modularized by domains of abstraction within units of reuse called concerns. Within a CORE concern, models are further decomposed and modularized by features. This paper extends CORE with a technique that enables developers of high-level concerns to reuse lower-level concerns without unnecessarily committing to a specific feature selection. The developer can select the functionality that is minimally needed to continue development, and reexpose relevant alternative lower-level features of the reused concern in the reusing concern's interface. This effectively delays decision making about alternative functionality until the higher-level reuse context, where more detailed requirements are known and further decisions can be made. The paper describes the algorithms for composing the variation (i.e., feature and impact models), customization, and usage interfaces of a concern, as well as the concern's realization models and finally an entire concern hierarchy, as is necessary to support delayed decision making in CORE.

Mobile robots often use a distributed architecture in which software components are deployed to heterogeneous hardware modules. Ensuring the consistency with the designed architecture is a complex task, notably if functional safety requirements have to be fulfilled. We propose to use a domain-specific language to specify those requirements and to allow for generating a safety-enforcing layer of code, which is deployed to the robot. The paper at hand reports experiences in practically applying code generation to mobile robots. For two cases, we discuss how we addressed challenges, e.g., regarding weaving code generation into proprietary development environments and testing of manually written code. We find that a DSL based on the same conceptual model can be used across different kinds of hardware modules, but a significant adaptation effort is required in practical scenarios involving different kinds of hardware.

Configurable systems typically use #ifdefs to denote variability. Generating and compiling all configurations may be time-consuming. An alternative consists of using variability-aware parsers, such as TypeChef. However, they may not scale. In practice, compiling the complete systems may be costly. Therefore, developers can use sampling strategies to compile only a subset of the configurations. We propose a change-centric approach to compile configurable systems with #ifdefs by analyzing only configurations impacted by a code change (transformation). We implement it in a tool called CHECKCONFIGMX, which reports the new compilation errors introduced by the transformation. We perform an empirical study to evaluate 3,913 transformations applied to the 14 largest files of BusyBox, Apache HTTPD, and Expat configurable systems. CHECKCONFIGMX finds 595 compilation errors of 20 types introduced by 41 developers in 214 commits (5.46% of the analyzed transformations). In our study, it reduces by at least 50% (an average of 99%) the effort of evaluating the analyzed transformations by comparing with the exhaustive approach without considering a feature model. CHECKCONFIGMX may help developers to reduce compilation effort to evaluate fine-grained transformations applied to configurable systems with #ifdefs.

Today’s competitive marketplace requires the industry to understand unique and particular needs of their customers. Product line practices enable companies to create individual products for every customer by providing an interdependent set of features. Users configure personalized products by consecutively selecting desired features based on their individual needs. However, as most features are interdependent, users must understand the impact of their gradual selections in order to make valid decisions. Thus, especially when dealing with large feature models, specialized assistance is needed to guide the users in configuring their product. Recently, recommender systems have proved to be an appropriate mean to assist users in finding information and making decisions. In this paper, we propose an advanced feature recommender system that provides personalized recommendations to users. In detail, we offer four main contributions: (i) We provide a recommender system that suggests relevant features to ease the decision-making process. (ii) Based on this system, we provide visual support to users that guides them through the decision-making process and allows them to focus on valid and relevant parts of the configuration space. (iii) We provide an interactive open-source configurator tool encompassing all those features. (iv) In order to demonstrate the performance of our approach, we compare three different recommender algorithms in two real case studies derived from business experience.

The development of variable software, in general, and feature models, in particular, is an error-prone and time-consuming task. It gets increasingly more challenging with industrial-size models containing hundreds or thousands of features and constraints. Each change may lead to anomalies in the feature model such as making some features impossible to select. While the detection of anomalies is well-researched, giving explanations is still a challenge. Explanations must be as accurate and understandable as possible to support the developer in repairing the source of an error. We propose an efficient and generic algorithm for explaining different anomalies in feature models. Additionally, we achieve a benefit for the developer by computing short explanations expressed in a user-friendly manner and by emphasizing specific parts in explanations that are more likely to be the cause of an anomaly. We provide an open-source implementation in FeatureIDE and show its scalability for industrial-size feature models.

A software product line comprises a family of software products that share a common set of features. It enables customers to compose software systems from a managed set of features.
Testing every product of a product line individually is often infeasible due to the exponential number of possible products in the number of features.
Several approaches have been proposed to restrict the number of products to be tested by sampling a subset of products achieving sufficient combinatorial interaction coverage.
However, existing sampling algorithms do not scale well to large product lines, as they require a considerable amount of time to generate the samples.
Moreover, samples are not available until a sampling algorithm completely terminates.
As testing time is usually limited, we propose an incremental approach of product sampling for pairwise interaction testing (called IncLing), which enables developers to generate samples on demand in a step-wise manner.
Furthermore, IncLing uses heuristics to efficiently achieve pairwise interaction coverage with a reasonable number of products.
We evaluated IncLing by comparing it against existing sampling algorithms using feature models of different sizes.
The results of our approach indicate efficiency improvements for product-line testing.

Testing a software product line such as Linux implies building the source with different configurations. Manual approaches to generate configurations that enable code of interest are doomed to fail due to the high amount of variation points distributed over the feature model, the build system and the source code. Research has proposed various approaches to generate covering configurations, but the algorithms show many drawbacks related to run-time, exhaustiveness and the amount of generated configurations. Hence, analyzing an entire Linux source can yield more than 30 thousand configurations and thereby exceeds the limited budget and resources for build testing.
In this paper, we present an approach to fill the gap between a systematic generation of configurations and the necessity to fully build software in order to test it. By merging previously generated configurations, we reduce the number of necessary builds and enable global variability-aware testing. We reduce the problem of merging configurations to finding maximum cliques in a graph. We evaluate the approach on the Linux kernel, compare the results to common practices in industry, and show that our implementation scales even when facing graphs with millions of edges.

Collection data structures in standard libraries of programming languages are designed to excel for the average case by carefully balancing memory footprint and runtime performance. These implicit design decisions and hard-coded trade-offs do constrain users from using an optimal variant for a given problem. Although a wide range of specialized collections is available for the Java Virtual Machine (JVM), they introduce yet another dependency and complicate user adoption by requiring specific Application Program Interfaces (APIs) incompatible with the standard library.
A product line for collection data structures would relieve library designers from optimizing for the general case. Furthermore, a product line allows evolving the potentially large code base of a collection family efficiently. The challenge is to find a small core framework for collection data structures which covers all variations without exhaustively listing them, while supporting good performance at the same time.
We claim that the concept of Array Mapped Tries (AMTs) embodies a high degree of commonality in the sub-domain of immutable collection data structures. AMTs are flexible enough to cover most of the variability, while minimizing code bloat in the generator and the generated code. We implemented a Data Structure Code Generator (DSCG) that emits immutable collections based on an AMT skeleton foundation. The generated data structures outperform competitive hand-optimized implementations, and the generator still allows for customization towards specific workloads.

Most software systems are designed to provide custom functionality using configuration options.
Testing such systems is challenging as running tests of a single configuration is often not sufficient, because defects may appear in other configurations.
Ideally, all configurations of a software system should be tested, which is usually not applicable in practice due to the combinatorial explosion with respect to the configuration options.
Multiple sampling strategies aim to reduce the set of tested configurations to a feasible amount, such as T-wise sampling, random configurations, and user-defined configurations.
However, these strategies are often not applied in practice as they require manual effort or a specialized testing framework.
Within our tool FeatureIDE, we integrate all aforementioned strategies and reduce the manual effort by automating the process of generating and testing configurations.
Furthermore, we provide support for unit testing to avoid redundant test executions and for variability-aware testing.
With this extension of FeatureIDE, we aim to make recent testing techniques for configurable systems applicable in practice.

Testing and Verification

Regression testing is a form of software quality assurance (QA) that involves comparing the behavior of a newer version of a software artifact to its earlier correct behavior, and signaling the QA engineer when deviations are detected. Given the large potential in automated generation and execution of regression test cases for business process models in the context of running systems, powerful tools are required to make this practically feasible, more specifically to limit the potential impact on production systems, and to reduce the manual effort required from QA engineers.
In this paper, we present a regression testing automation framework that implements the capture & replay paradigm in the context of BPMN 2.0, a domain-specific language for modeling and executing business processes. The framework employs parallelization techniques and efficient
communication patterns to reduce the performance overhead of capturing. Based on inputs from the QA engineer, it manipulates the BPMN2 model before executing tests for isolating the latter from external dependencies (e.g. human actors or expensive web services) and for avoiding undesired side-effects. Finally, it performs a regression detection algorithm and reports the results to the QA engineer.
We have implemented our framework on top of a BPMN2-compliant execution engine, namely jBPM, and performed functional validations and evaluations of its performance and fault-tolerance. The results, indicating 3.9% average capturing performance overhead, demonstrate that the implemented framework can be the foundation of a practical regression testing tool for BPMN 2.0, and a key enabler for continuous delivery of business process-driven applications and services.

Today's programmers face a false choice between creating software that
is extensible and software that is correct. Specifically, dynamic
languages permit software that is richly extensible (via dynamic code
loading, dynamic object extension, and various forms of reflection),
and today's programmers exploit this flexibility to "bring their own
language features" to enrich extensible languages (e.g., by using
common JavaScript libraries). Meanwhile, such library-based language
extensions generally lack enforcement of their abstractions, leading
to programming errors that are complex to avoid and predict.
To offer verification for this extensible world, we propose online
verification-validation (OVV), which consists of language and VM
design that enables a "phaseless" approach to program analysis, in
contrast to the standard static-dynamic phase distinction. Phaseless
analysis freely interposes abstract interpretation with concrete
execution, allowing analyses to use dynamic (concrete) information to
prove universal (abstract) properties about future execution.
In this paper, we present a conceptual overview of OVV through a
motivating example program that uses a hypothetical database library.
We present a generic semantics for OVV, and an extension to this
semantics that offers a simple gradual type system for the database
library primitives. The result of instantiating this gradual type
system in an OVV setting is a checker that can progressively type
successive continuations of the program until a continuation is
fully verified. To evaluate the proposed vision of OVV for this
example, we implement the VM semantics (in Rust), and show that this
design permits progressive typing in this manner.

The intensive use of generative programming techniques provides an elegant engineering solution to deal with the heterogeneity of platforms and technological stacks. The use of domain-specific languages for example, leads to the creation of numerous code generators that automatically translate highlevel system specifications into multi-target executable code. Producing correct and efficient code generator is complex and error-prone. Although software designers provide generally high-level test suites to verify the functional outcome of generated code, it remains challenging and tedious to verify the behavior of produced code in terms of non-functional properties. This paper describes a practical approach based on a runtime monitoring infrastructure to automatically check the potential inefficient code generators. This infrastructure, based on system containers as execution platforms, allows code-generator developers to evaluate the generated code performance. We evaluate our approach by analyzing the performance of Haxe, a popular high-level programming language that involves a set of cross-platform code generators. Experimental results show that our approach is able to detect some performance inconsistencies that reveal real issues in Haxe code generators.