Category Archives: Software Reengineering

Last week I attended a presentation by Brendan Murphy from Microsoft Research – Cambridge. Dr. Murphy presented on his research regarding software development processes at Microsoft. This post contains a summary of the presentation and my thoughts on the material presented.

The discussion focused on the impact of particular architectural and processes decisions. For example, how does the choice of architecture effect the deployment of new source code. Consider a micro-service architecture in which several independent components are working in concert; it is likely that each of these components has a unique code base. Deployment of updates to one component must consider the effects on other components, i.e. a loss or change of a particular functionality may cause problems in other components. Perhaps using a more monolithic architecture reduces this concern however software monoliths are known to have their own problems.

The way that we, as software engineers and developers, manage variability in source code is through branching. Code repositories often consist of a number of related branches that stem from a “master” or main version of the code, branches often contain a specific feature or refactor effort that may eventually be merged back into the main branch. This concept is familiar for anyone who uses contemporary version control methods/tools such as Git.

In his presentation, Dr. Murphy discussed a number of different branching models, some of which have been experimented with at Microsoft at scale. Different approaches have pros and cons and will support different architectural and deployments in different ways. Of course, when considering the effect of different models it is important to be able to measure properties of the model. Murphy presented several properties: productivity (how many often in time files are changed), velocity (how quickly changes/features move through the entire process), quality (number defects detected and corrected in the process), and coupling of artifacts (which artifacts often change together). These can be used to evaluate and compare different branching models in practice.

The most interesting part of the presentation, from my point of view, was the discussion of the current work based on branching models and these properties. In theory, a control loop can be used to model an evolving code repository. At every point in time the properties are measured and then feed into a decision making unit which attempts to optimize the branching structure based on the aforementioned properties. The goal is to optimize the repository branching structure. Murphy indicated that they are currently testing the concept at Microsoft Research using repositories for various Microsoft products.

The main takeaways from the presentation were:

Deployment processes must be planned for as much, if not more, than the software itself.

Architecture will impact how you can manage software deployment.

Repository branching models have an effect on version management and deployment and need to be designed with care and evaluated based on reasonable metrics.

A control loop can theoretically be used to evolve a repository such that there is always an optimal branching model.

From my personal experience working as both a contractor and employee the point of planning for deployment is entirely correct. If software isn’t deployed or unaccessible to its users it is as if it doesn’t exist. Deployment must be considered at every step of development! The presentation did cause me to pause and consider the implications of architecture affecting the patterns of deployment and required maintenance. As we continue to move towards architectures that favour higher levels of abstraction, in my opinion, one of the primary concepts every software engineer should embrace, we will need to find ways to manage increasing variability between the abstracted components.

The notion of the control loop to manage a branching model is quite interesting. It seems that we could, in theory, use such a method to optimize repositories branching models, but in practice the effects of this might be problematic. If an optimum is able to be found quickly, early on in a project, then it seems that this is a good idea. However, if weeks and months go by and the branching model is still evolving this might cause problems wherein developers spend more time trying to adapt to a continually changing model, rather than leveraging the advantages of the supposed “optimum” model. However, a continually changing model may also force development teams to be more aware of their work which has benefits on its own. Really this idea needs to be tested at scale in multiple environments before we can say more, an interesting concept nonetheless.

Bidirectional Transformations (BX) are a specific type of transformations of particular interest for many applications in software and information system engineering. This Winter I co-organized a one week seminar on BX theory and applications at the Banff International Research Station (BIRS). BIRS was an excellent venue and the seminar was quite worthwhile, as it provided a way of getting leading researchers from different communities to exchange their ideas and theories (despite arctic temperatures of -20 to -40 C) . A report on the seminar is now published at the BIRS Web site.

The next BX workshop will be coming up in Athens as part of the EDBT/ICDT joint conference. There I will be co-presenting a paper in the application of BX in support of information system reengineering.