Software producers often need to create different variants of their systems to cater for different customer requirements or hardware specifications.

While there are systematic product line engineering methodologies that support variability (e.g., preprocessor, deltas, aspects, modules), software variants are often developed using clone-and-own (aka copy-paste) since it is a low-cost mechanism without a steep learning curve [1].

Recent collaboration tools such as Github and Bitbucket have made this clone-and-own process more systematic by introducing fork-based development. In this development process, users fork a repository, which creates a traceability link between the two repositories, make changes on their own fork and push changes back to the repository from which they forked (upstream) via pull-requests.

Combining this with a powerful version control system such as Git enables better and more efficient variant management. The question is: how is this done in practice, and what difficulties does this process entail? To answer this question, we analyzed Marlin, a four-year-old Github-hosted project that combines clone-and-own with traditional variability mechanisms.

Marlin is firmware for 3D printers, written in C++, that employs variability both through preprocessor annotations in its core code and through its clones. Started in August 2011, it has been forked by more than 2,400 people, many of whom contribute changes. This is unusual for such a small, recent project in a relatively narrow and new domain.

We looked at Marlin to understand how forking supports the creation of product variants.

We sent a survey to 336 fork owners and received answers from 57 of them. Among other things, we asked (1) what criteria they use for integrating changes from the main code base (i.e., upstream) and (2) how they deal with variability. We use these answers to gain a user-side perspective on the development and maintenance of the forks.

To merge or not to merge?

Fork owners: Most fork owners indicated that it is difficult to merge upstream changes because their firmware becomes unstable and produces undesired results. In addition, configuring the software is a meticulous task due to the large number of features and parameters that need to be set properly; a slight change in these parameters has consequences for the end user. Therefore, many fork owners rarely sync with the main Marlin branch. Another point they made is that only some of the upstream changes are interesting to them, and even though Git allows developers to selectively apply patches from upstream (cherry-picking), it is still difficult to decide what should be merged from upstream and what should not.

Marlin Maintainers: From the Marlin maintainers' perspective, more than 50% of the commits on the main Marlin branch came from forks, which suggests that merging changes in the opposite direction is more common. Forks allow users to innovate and bring new ideas to Marlin. However, integrating cloned variants is a difficult task for the maintainers. It is especially problematic when forks introduce new features and want to integrate them into the main codebase, as this may break other people's variants. To handle this issue, new feature and variant contributions are only integrated once they are guarded by preprocessor directives, and these directives have to be disabled by default in the main configuration of the firmware. This lowers the probability of affecting someone else's variant, with the added benefit of increasing the stability of the main codebase.

Additionally, maintainers need to ensure that the quality of the clone meets their requirements (implementation, tests, documentation, style adherence) and that they can handle the maintenance and evolution of that variant. In Marlin this is important, as there are many hardware devices that can be used, making it complicated for maintainers to test new variants (which very often means they need to run the printer and print an object). Here is where the community goes the extra mile: many users test different variants on their own hardware and printers and report any issues.

An important aspect is that when changes from a fork get integrated upstream, the fork becomes more popular and visible. For example, in one variant (jcrocholl's deltabot), only one pull request had been accepted by the fork owner. Once the variant got integrated, many more issues and pull requests dealt with that variant, and many more changes for it were accepted. Finally, jcrocholl's maintenance effort was reduced, as he no longer had to keep in sync with the main Marlin repository and push his changes back; any change related to deltabot was made directly on the main Marlin repository.

Forking vs Preprocessors

In embedded systems, computational resources are limited. Many survey respondents explained that memory limitations (Marlin runs on 8-bit ATmega microcontrollers, which have between 4 kB and 256 kB of flash memory) pushed them to use preprocessor directives, which allow excluding code at compilation time and experimenting with different ideas. On the other hand, we definitely see that forking is the way to go when fast prototyping is needed. It is also useful when changes are not relevant to the other people involved, or simply to store the configurations of a variant. The latter is heavily used in Marlin (around 200 forks only store configurations of the firmware), and it is a very lightweight and efficient mechanism.

Lessons learnt

Based on the above survey results as well as more detailed analysis of the Marlin project structure and development history, we derive the following guidelines for fork-based development.

Fork to create variants and to support new configurations. It is easy, efficient and lightweight, and using Github’s forking, we get traceability for free.

Use preprocessor annotations for flexibility and to tackle memory constraints when needed, both in a fork and in the main branch.

Keep track of variants by adding a description for each fork created and maintain that description.

Merge upstream often to reduce maintenance and evolution efforts of cloned variants.

Recent tools and techniques (Git, Github, forking) can deal to some degree with the complex task of variant development. With the wide adoption of Github, it seems that we are already heading in that direction. Adopting new tools and techniques is a long process, and many challenges still lie ahead, but we are one step closer to understanding how to offer better tool support for variant management.

Wednesday, December 9, 2015

In one of our research projects we looked at how reference architectures are used in agile projects. Software engineers often use existing reference architectures as “templates” when designing systems in particular contexts (such as web-based or mobile apps). Reference architectures (from a third party or designed in-house) provide architectural patterns (elements, relationships, documentation, etc.), sometimes partially or fully instantiated, and therefore allow us to reuse design decisions that worked well in the past. For instance, a web services reference architecture may describe how a web service is developed and deployed within an organization’s IT ecosystem. On the other hand, industry practice tends towards flexible and lightweight development approaches [1], and even though not all organizations are fully agile, many use hybrid approaches [2]. Since reference architectures shape the software architecture early on, they may constrain the design and development process from the very beginning and limit agility. Nevertheless, in case studies that we conducted with software companies that use Scrum as their agile process framework, engineers reported extra value when using reference architectures. That extra value goes beyond the typical reasons for using reference architectures (such as being able to use an architecture template, and supporting standardization and interoperability). This additional value comes from three things: architectural focus, less knowledge vaporization, and team flexibility.

Architectural focus. We found that reference architectures inject architectural thinking into the agile process. Architectural issues often get lost in agile projects and the architecture emerges implicitly. A reference architecture supports the idea of a system metaphor in agile development. The clear picture of core architectural issues helps communicate the shared architectural vision as a “reference point” within agile teams across sprints. Since reference architectures already confine the design space, they also help balance the effort spent on up front design. In fact, we have observed that this outweighs the effort required to learn about a reference architecture. Furthermore, reference architectures provide a “harness” for agile teams to try out different design solutions. This helps reduce the complexity of the design space and potentially limits the amount of architectural refactoring.

Less knowledge vaporization. Agile promotes working products over documentation. Reference architectures usually come with supporting artefacts and documentation, so large parts of the architecture don’t need to be documented and maintained separately. For example, if projects use NORA (a reference architecture for Dutch e-government), software engineers can focus on documenting product or organization-specific features rather than the whole architecture and all design decisions. In the example of NORA, this would include features and architecture artefacts implemented in individual municipalities.

Team flexibility. Reference architectures facilitate communication within and across Scrum teams since there is a shared understanding of common architectural ideas. We have found that this not only benefits individual teams, but also allows engineers to move across different projects and/or teams, and to work on more than one project at the same time (as long as the same reference architecture is used). This facilitates cross-functional teams, as promoted in agile practices.

The above list includes preliminary findings and there are certainly other benefits (and benefits related to software architecture in general), depending on a particular project situation. We also report more details in our paper “Understanding the Use of Reference Architectures in Agile Software Development Projects” published at the 2015 European Conference on Software Architecture.

Wednesday, December 2, 2015

We will expand on this story in an upcoming column in our Software Impact Series, by which time we should know more details. But in light of the rapidly unfolding story at VW, Germany's giant automotive company, we will add a new dimension to our original question “Software: what's in it and what's it in?” [1] by asking “What's hidden in it, and how many people knew?”.

At the time of writing, it would appear that aside from the normal and burgeoning functionality in the tens of millions of lines of code embedded in modern automobiles (for example, Mossinger [2]), there may in some cases be code intended to 'deceive'. The question, of course, is when does a feature cross the line from what lawyers refer to as harmless “advertiser's puff” all the way to deceit?

In the case of VW, the change appears to have been tiny – just a few lines of code in what might be millions. In essence, the software allegedly monitored steering movement whilst running. On a test harness, the car wheels move but the steering wheel doesn't, in contrast to normal running, where both are continually in motion. By doing this, the software could detect when the car was in test mode and therefore control the degree to which catalytic scrubbing was done on the emissions. Catalytic scrubbers inject a mixture of urea and water into the diesel engine emissions, converting harmful nitrogen oxides into the more benign molecules nitrogen, oxygen, water and small amounts of carbon dioxide. The trade-off in a diesel engine is quite simply one of emission toxicity against car performance. The software, now known as a 'defeat device', simply turned up the efficiency of the catalytic converters when it thought the car was under test. This is now believed to have been embedded in around 11 million VW cars and some 2 million Audis.
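As an entirely hypothetical illustration (this is not the actual VW code; the class, method names, and thresholds are all invented for exposition), the kind of logic described above might look something like this:

```java
// Hypothetical sketch of the alleged defeat-device logic, as described in
// press accounts. All names and numbers here are invented for illustration.
public class DefeatDeviceSketch {

    // On a test harness the drive wheels turn but the steering wheel does not.
    static boolean looksLikeTestBench(double wheelSpeedKmh, double steeringMovementDeg) {
        return wheelSpeedKmh > 0 && steeringMovementDeg < 0.1;
    }

    // Turn catalytic scrubbing up only when the car appears to be under test.
    static double ureaDoseRate(double wheelSpeedKmh, double steeringMovementDeg) {
        return looksLikeTestBench(wheelSpeedKmh, steeringMovementDeg)
                ? 1.0   // full emissions scrubbing while "under test"
                : 0.3;  // reduced scrubbing in normal running, favouring performance
    }
}
```

A few lines of conditional logic like this, buried in millions, is exactly what makes such a device so hard to spot by code inspection.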

A cynical observer would claim that if somebody can get away with something, then they will. But did the engineers responsible really believe that such a device would never be found? After all, unless you knew what you were looking for, finding it by inspecting the code is comparable to finding a needle in a haystack, and even if you did know, finding your way around a giant software system is not for the faint-hearted. However, you cannot defeat the laws of physics, or in this case, chemistry. The VW defeat device was basically discovered by independent monitoring of exhaust emissions, with glaring differences found between what was observed in normal running and what was being claimed, so it seems naïve to think it would not be discovered eventually. In which case, did the engineers responsible think that people wouldn't mind, or that the financial benefit of selling more cars would outweigh any potential downside? If they did, they are likely to be in for an unpleasant surprise, with VW already setting aside several billion dollars to deal with potential claims.

As of 30-Sept-2015, when this part was written, it appears that over 1 million cars and vans could be affected in the UK, Europe's second biggest diesel user after Germany, but VW do not know. In fact, they do not appear to know if the software is present or, if so, whether it is activated, and nobody seems to have considered the possibility of breaking something else if the software is removed, or even simply deactivated, during a software recall, due to unintentional side effects. These can occur through, for example, shared global variables, or one of a number of other mechanisms which will be familiar to professional software engineers. In short, its removal could introduce one or more defects.

Speaking of defects, let's raise an interesting question. Is this better or worse than releasing inadequately tested automotive software in general? One of the more recent examples of the latter is the Toyota unintended acceleration bug [3], and Toyota is not alone: there have been numerous recalls in the automotive industry due to software defects. When a car manufacturer releases such a bug whilst advertising how safe their cars are, are they not being similarly misleading? For example, contrast the following two more factually appropriate sentences to cover these eventualities.

“We have adjusted the catalytic converter to behave more efficiently if you drive at constant speed without moving the steering wheel, so your emissions will be much lower. If you depart from this, you will get better performance but your emissions will be very considerably more noxious.”

and

“We believe that software innovation is vital in automotive development; however, the systems we release to you are so complicated that they will have defects in them which might sometimes prejudice your safety. Most of the time, however, we believe they will not.”

Would you still buy the car? It could, of course, be argued that these questions arise from different ethical viewpoints, but any software engineer worth their salt will know that the chances of releasing a complicated defect-free software system are effectively negligible.

We await the answers to several obvious questions. Are any other companies doing this, or if we take a more cynical standpoint, how many? If not, are they using software practices almost as dubious? How do we decide what is reasonable, given the extraordinary ability of software to give hardware its character? The CEO has already been replaced, but what will happen to the engineers and the managers who were responsible?

We look forward to revisiting the story as more information comes to light.

Contextual Note: The Impact series in IEEE Software describes the impact of software on various industries. Around 30 columns have been published since 2010 by senior technical and business managers from companies such as Oracle, Airbus (software in the A380), Hitachi, Microsoft and Vodafone. Other columns were provided by CERN (the software behind the Higgs boson discovery) and JPL (software in the Mars lander). Michiel van Genuchten and Les Hatton are editors of the Impact column. The Impact column from Jan-Feb 2016 will contain an updated and more extensive version of this blog post.

Sunday, November 29, 2015

Back in the year 2000, during Christmas, I had some time to play. Some days earlier, by chance, I had read "Estimating Linux's Size", a study on the number of lines of code of all the software in Red Hat 6.2, one of the most popular Linux-based distributions of the age. I had noticed that the tool used in the study, sloccount, was FOSS (free, open source software), so I could use it in my playground. For some days, my laptop was busy downloading all of the Debian GNU/Linux 2.2 (Potato) source code, running sloccount on it, and producing all kinds of numbers, tables, and charts. At that time Debian was one of the largest, if not the largest, coordinated collections of software ever compiled. That fun led to a series of papers, starting with "Counting potatoes: the size of Debian 2.2", and to a pivot in my research career that put me on the track of mining software repositories.
This personal story illustrates two emerging trends that proved to be game-changers: the public availability of information about software development, including the source code, and the availability of FOSS tools to retrieve and analyze that information. Both were enabled by FOSS, but they have worked in very different, yet complementary, ways.

FOSS as a matter of study

Everybody knows nowadays that FOSS projects publicly release the source code they produce. This has tremendously lowered the barriers to studying software products when that study requires access to their source code. It is no longer compulsory to have special agreements with software companies to do research in this area. For the first time in software history, it is possible and relatively easy to run comparative studies on large and very large samples of products.
Of course, the impact of these new opportunities has been gradual. As FOSS has become more and more relevant, more "interesting", industry-grade products have become available for study. With time, researchers have developed new techniques and tools, and even a kind of new mindset, for analyzing this new, yet very rich, corpus of cases.
Among FOSS projects, there is a large subset using an open development model. These projects keep public repositories for source code management, issue tracking, mailing lists and forums, etc. For them, not only the source code but also a lot of very detailed information about the development process is available. From the researcher's point of view, this means a trove of data to explore.
Already in the late 1990s some researchers started to take advantage of this new situation, but it was in the early 2000s that this new approach started to take shape, with works such as "Evolution in open source software: a case study", "Results From Software Engineering Research Into Open Source Development Projects Using Public Data", "A Case Study of Open Source Software Development: The Apache Server", and "Open source software projects as virtual organizations".
Soon, specialized venues dealing mostly with the study of data from software development repositories emerged. Among them, the International Working Conference on Mining Software Repositories, held annually since 2004, was one of the first and is today probably the best known. During the last decade, this research approach has permeated all major conferences and journals in software engineering.

FOSS tools and public datasets as enablers of research

The availability of FOSS tools has been an enabler in many research fields, and software engineering is no exception. The availability of efficient and mature FOSS databases, analysis tools, etc., allows any researcher, even one with a very modest budget, to carry out ambitious studies with very large datasets. In addition, more and more tools specifically developed to analyze software or software repositories are available as FOSS. Many researchers now make it customary to publish, as FOSS, the tools they used to produce their results, letting others modify or adapt them to extend the research. Some of these tools are useful for practitioners too.
For example, several FOSS tool sets exist to retrieve information from software repositories and organize it in databases ready for analysis, such as MetricsGrimoire. They allow researchers to focus on the analysis of the data rather than on its retrieval, a task which may be very effort-consuming, especially when a large number of projects is involved. Practitioners are using them as well, to track KPIs (key performance indicators) of their projects.
One step further, in some cases the datasets resulting from large data retrieval efforts are made public, letting researchers jump directly to analyzing them. This is, for example, the case of the Boa Infrastructure, which allows queries to be executed against hundreds of thousands of software projects very efficiently, or FLOSSMole, a repository of FOSS project data coming from several software development forges.

The impact on software engineering research

These two trends are changing large areas in the field of software engineering research, by allowing researchers to produce results in ways closer to the scientific method than before. With relatively little effort, they can start from solid data, which is becoming more and more accessible, and rely on mature, adaptable tools, to produce results that take into account a large diversity of cases. Some concrete impacts are:

Availability of datasets. Researchers can work on the same dataset with different techniques, which eases the comparison of results and makes it possible to determine to what extent a study advances the state of the art. Public datasets also allow researchers to focus on analysis rather than on assembling the dataset itself, which is complex, error-prone, and effort-consuming.

Reproducibility of the studies. When the tools used by researchers are FOSS, and the datasets are public, reproducing previous results becomes possible and even easy. This is already improving the chances of validating results, which is fundamental to advance on solid ground.

Incremental research. It is now much easier to stand on the shoulders of giants by reusing FOSS tools and datasets produced by other research teams. Researchers no longer have to start from scratch; they can incrementally improve previous results.

In short, FOSS is making empirical software research much easier. Some clear examples are the advances in the detection of code clones, the impact of different release models, and the limits to software evolution. The next years will let us know to what extent this will translate into results useful for improving our knowledge of how software is developed, and for improving software development itself.

Wednesday, November 25, 2015

A Cost/Benefit Approach to Performance Analysis

by David Maplesden, The University of Auckland, Auckland, New Zealand (@dmap_nz)
Associate Editor: Zhen Ming (Jack) Jiang, York University, Toronto, Canada

Many large-scale modern applications suffer from performance problems [1] and engineers go to great lengths searching for optimisation opportunities. Most performance engineering approaches focus on understanding an application's cost (i.e., its use of runtime resources). However, understanding cost alone does not necessarily help find optimisation opportunities. One piece of code may take longer than another simply because it is performing more necessary work. For example, it would be no surprise that a routine that sorted a list of elements took longer than another routine that returned the number of elements in the list. The fact that the costs of the two routines are different does not help us understand which may represent an optimisation opportunity. However, if we had two different routines which output the same results (e.g., two different sorting algorithms), then determining which is the more efficient solution becomes a simple cost comparison.
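To make the cost-comparison point concrete, here is a small, illustrative sketch (our own example, not taken from the study): two routines that produce identical output, so the difference in their runtime cost is a direct measure of their relative efficiency.

```java
import java.util.Arrays;
import java.util.Random;

public class CostComparison {
    // A deliberately inefficient O(n^2) sort, used only for comparison.
    static int[] bubbleSort(int[] in) {
        int[] a = in.clone();
        for (int i = 0; i < a.length; i++)
            for (int j = 0; j < a.length - 1 - i; j++)
                if (a[j] > a[j + 1]) { int t = a[j]; a[j] = a[j + 1]; a[j + 1] = t; }
        return a;
    }

    // An efficient library sort producing exactly the same result.
    static int[] librarySort(int[] in) {
        int[] a = in.clone();
        Arrays.sort(a);
        return a;
    }

    public static void main(String[] args) {
        int[] data = new Random(42).ints(5000, 0, 100000).toArray();

        long t0 = System.nanoTime();
        int[] r1 = bubbleSort(data);
        long bubbleNs = System.nanoTime() - t0;

        t0 = System.nanoTime();
        int[] r2 = librarySort(data);
        long libraryNs = System.nanoTime() - t0;

        // Same value (identical output), different cost: a fair efficiency comparison.
        System.out.println("identical results: " + Arrays.equals(r1, r2));
        System.out.println("bubble/library cost ratio: " + (double) bubbleNs / libraryNs);
    }
}
```

Because both routines deliver the same value, comparing their costs is meaningful; comparing the cost of, say, a sort against a size query is not.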

The key is to understand the value provided by the code. It is then possible to find the superfluous activity that characterises poor performance.

Traditionally it has been left to the engineer to determine the value provided by a piece of code through experience, intuition or guesswork. However, the challenge of intuitively divining runtime value is difficult in large-scale applications. These applications have thousands of methods interacting to produce millions of code paths. Establishing the value provided by each method via manual inspection is not practical with such scale and complexity.

To tackle this challenge we are developing an approach to empirically measure runtime value. We can combine this measure with traditional runtime cost information to quantify the efficiency of each method in an application. This allows us to find the most inefficient methods in an application and analyse them for optimisation opportunities.

Our approach to quantifying value is to measure the amount of data created by a method that becomes visible to the rest of the application, i.e., the data that escapes the context of the method. Our rationale is that the value a method is providing can only be imparted by the visible results it creates. Intermediate calculations used to create the data but then discarded do not contribute to this final value. Intuitively two method calls that produce identical results (given the same arguments) are providing the same amount of value, regardless of their internal implementations.

Specifically, we track the number of object field updates that escape their enclosing method. An object field update is any assignment to an object field or array element (e.g., foo.value = 1 or bar[0] = 2). A field update escapes a method if the object it is applied to escapes the method, i.e., it is a global (static), a method parameter, or a returned value.
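The listing the next paragraph discusses is not reproduced here, so the following is a plausible, illustrative reconstruction: the method names come from the text, but the bodies are our own guess at what such code typically looks like.

```java
// Hypothetical reconstruction of the two methods discussed below;
// the original listing is not shown, so these bodies are illustrative.
public class ElapsedTimeFormatter {

    // Updates its StringBuilder *parameter* via sb.append(...):
    // these field updates escape through the parameter.
    static void formatTimePart(StringBuilder sb, long value, String unit) {
        if (value > 0) {
            sb.append(value).append(unit).append(' ');
        }
    }

    // Builds a local StringBuilder whose updates are *captured*
    // (sb itself never escapes); only the constructed String is
    // returned, so this method has returned field updates.
    static String formatElapsedTime(long millis) {
        StringBuilder sb = new StringBuilder();
        formatTimePart(sb, millis / 60000, "m");
        formatTimePart(sb, (millis / 1000) % 60, "s");
        formatTimePart(sb, millis % 1000, "ms");
        return sb.toString().trim();
    }
}
```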

The formatTimePart method updates the StringBuilder parameter (via calls to sb.append()) and so it has parameter-escaping field updates. The formatElapsedTime method has no parameter-escaping updates, but it does return a new String value (constructed via sb.toString()) and so has returned field updates. Note that the StringBuilder object sb does not escape formatElapsedTime, so the updates applied to it are actually captured by the method; it is only the subsequently constructed String which escapes. We have found captured writes such as these to be a strong indicator of inefficient method implementations.

We have evaluated our approach [2] using the DaCapo benchmark suite, demonstrating our analysis allows us to quantify the efficiency of the code in each benchmark and find real optimisation opportunities, providing improvements of up to 36% in our case studies. For example, we found over 10% of the runtime activity in the h2 benchmark was incurred by code paths such as JdbcConnection.checkClosed() that were checking assertions and not contributing directly to the benchmark result. Many of these checks were repeated unnecessarily and we were able to refactor and remove them.

Our proposed approach allows the discovery of new optimisation opportunities that are not readily apparent from the original profile data. The results of our experiments and the performance improvements we made in our case studies demonstrate that efficiency analysis is an effective technique that can be used to complement existing performance engineering approaches.

Wednesday, November 18, 2015

Community-based Open Source Software (OSS) projects leverage contributions from geographically distributed volunteers and require a continuous influx of newcomers for their survival, long-term success, and continuity. Therefore, it is essential to motivate, engage, and retain new developers in a project in order to promote a sustainable number of developers. Furthermore, recent studies report that newcomers are needed to replace older members leaving the project and are a potential source of new ideas and work procedures that the project needs [1].

However, new developers face various barriers when attempting to contribute [2]. In general, newcomers are expected to learn the social and technical aspects of the project on their own. Moreover, since delivering a task to an OSS project is usually a long, multi-step process, some newcomers may lose motivation and even give up contributing if there are too many barriers to overcome during this process. These barriers affect not only those interested in remaining project members, but also those who wish to submit a single contribution (e.g., a bug correction or a new feature). And, as Karl Fogel says in his book Producing Open Source Software: "If the project does not cause a good first impression, newcomers rarely give it a second chance."

To better support newcomers, it is necessary to identify and understand the barriers that prevent newcomers from contributing. With a better understanding of these barriers, it is possible to put effort towards building or improving tools and processes, ultimately leading to more contributions to the project. Therefore, we conducted a study to identify the barriers faced by newcomers [2][3]. We collected data from several sources: a systematic literature review; open-question responses gathered from OSS project contributors; students contributing to OSS projects; and interviews with experienced members and newcomers in OSS projects. Based on the analysis of this data, we organized 58 barriers into a model with 7 categories. Figure 1 depicts the categories and subcategories of this barriers model.

Figure 1. Barriers Model: Categories and Subcategories

Based on the barriers model, we built FLOSScoach, a portal to support the first steps of newcomers to OSS projects. The portal has been structured to reflect the categories identified in the barriers model. Each category was mapped onto a portal section which contains information and strategies aimed at supporting newcomers in overcoming the identified barriers. To populate the portal, we collected already-existing strategies and information from interviews with experienced members and from manual inspection of the project web pages.

In the portal, newcomers find information on the skills needed to contribute to a project, a step-by-step contribution flow, the location of features (such as source code repository, issue tracker and mailing list), a list of newcomer friendly tasks (if provided by the project) and tips on how to interact with the community. Preliminary studies have shown that FLOSScoach helps newcomers, guiding them in their first steps and making them more confident in their ability to contribute to a project [4]. When we compared students' performance with and without FLOSScoach, we found a significant drop in terms of self-efficacy among students in the control group (not using FLOSScoach) while the self-efficacy of students using the tool remained at a high level. In addition, by analyzing diaries written during the contribution process, we found evidence that FLOSScoach made newcomers feel oriented and more comfortable with the process, while those who did not have access to FLOSScoach repeatedly reported uncertainty and doubt on how to proceed.

Identifying and organizing the barriers and the development of FLOSScoach are the first steps towards supporting newcomers to OSS projects. A smooth first contribution may increase the total number of successful contributions made by a single contributor and, hopefully, the number of long-term contributors. According to the results of our studies conducted so far, the points that deserve more attention are facilitating local workspace setup and providing ways to find the correct set of artifacts to work on once a task is selected.

Monday, November 9, 2015

It gives me great pleasure to welcome all of you to the IEEE Software Blog. The goal of the blog is to present recent advances in the different research areas of software engineering via sharp, to-the-point, easily accessible blog posts. Furthermore, we will strive to not use our typical academic jargon, but distill the important takeaway messages from the research projects we are blogging about. Since most academic journals are not open access, it becomes nontrivial for practitioners to get their hands on the latest research, so this blog will discuss some of the great content in IEEE Software. Readers will also be able to discuss each post in the comments section. At the end of the day, we want practitioners to be able to easily access and apply the latest research advancements. Additionally, we will blog on well-informed opinions, new and disruptive ideas, book reviews, and future directions. We will also disseminate the posts via our social media accounts.

To this end, I have assembled a diverse, international team of young, up-and-coming researchers in different software engineering areas:

This team of blog editors will not only blog themselves but will also reach out to researchers and practitioners to solicit articles in their corresponding areas of expertise. The current plan is to have at least 6 blog posts every month, each in a different area of software engineering research.

So, if you have a new research finding or an opinion about some existing idea, please contact the appropriate blog editor above. Also, if you have feedback on how we can improve the blog please drop me a note. This blog cannot succeed without your participation!