Thursday, October 27, 2011

Software deployment complexity

In this blog post, I'd like to talk about the software deployment discipline in general. In my career as a PhD student and while visiting academic conferences, I have noticed that software deployment is not (and has never been) a very popular research subject within the software engineering community.

Furthermore, I have encountered many misconceptions about what software deployment is supposed to mean, and some people are even surprised that research is done in this field. I have also received vague complaints from certain reviewers saying that the things we do are not novel, and comments such as: "hmm, what can the software engineering community learn from this? I don't see the point..." and "this is not a research paper".

What is software deployment?

So what actually is software deployment? One of the first software deployment papers in academic research, by Carzaniga et al. [1], describes this discipline as follows:

Software deployment refers to all the activities that make a software system
available for use

Some of the activities that may be required to make a software system available for use are:

Building software components from source code

Packaging software

Transferring the software from the producer site to consumer site

Installation of the software system

Activation of the software system

Software upgrades
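To make the activities above concrete, here is a minimal sketch of such a pipeline as a shell script. The component name, paths and archive format are all made up for illustration; a real deployment process involves far more machinery than this.

```shell
#!/bin/sh
set -e

# Hypothetical producer and consumer sites, modelled as two directories.
mkdir -p producer consumer

# 1. Build: turn "source code" into an executable artifact.
printf '#!/bin/sh\necho "hello from myapp"\n' > producer/myapp
chmod +x producer/myapp

# 2. Package: bundle the component into a distributable archive.
tar -C producer -czf producer/myapp.tar.gz myapp

# 3. Transfer: move the package from the producer site to the consumer site.
cp producer/myapp.tar.gz consumer/

# 4. Install: unpack the package in the consumer environment.
tar -C consumer -xzf consumer/myapp.tar.gz

# 5. Activate: run the installed program.
./consumer/myapp
```

Even in this toy example, each step can fail independently (a missing tool, a corrupt archive, a transfer error), which hints at why the real process is so hard to get right.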

An important thing to point out is that the activities described above are all steps to make a software system available for use. I have noticed that many people mistakenly think that software deployment is just the installation of a system, which is not true.

Essentially, the point of software deployment is that a particular software system is developed with certain goals, features and behaviour in mind by the developers. Once this software system is to be used by end-users, it typically has to be made available for use in the consumer environment. It is important that the software system behaves exactly the way the developers have intended. It turns out that, for many reasons, this process has become very complicated nowadays, and that it is also very difficult to give any guarantees that a software system operates as intended. In some cases, systems may not work at all.

Relationship to software engineering

So what does software deployment have to do with software engineering? According to [2], software engineering can be defined as:

Software Engineering (SE) is the application of a systematic, disciplined, quantifiable approach to the development, operation, and maintenance of software, and the study of these approaches; that is, the application of engineering to software.

Within the software engineering research community, we investigate techniques to improve and/or study software development processes. The deployment step is usually the last phase of a software development project, once the development of the software system is completed and it is ready to be made available to end-users.

In traditional waterfall-style software development projects, the deployment phase is not performed very frequently. Nowadays, most software development projects are iterative: the features of the software are extended and improved in each cycle, so the system has to be redeployed for every cycle. Especially in Agile software projects, which have short iterations (of about two weeks), it is crucial to be able to deploy a system easily.

Because of the way we develop software nowadays, the deployment process has become much more of a burden, which is why it is also important to have systematic, disciplined, quantifiable approaches for software deployment.

Apart from delivering systems to end-users, we also need to deploy a system in order to test it. To run a test suite, all necessary environmental dependencies must be present and correct. Without a reliable and reproducible deployment process, this becomes a burden, and it is difficult to guarantee that tests succeed under all circumstances.

Why is software deployment complicated?

Back in the old days, software was developed for a specific machine (or hardware architecture), stored on a disk or tape, and delivered to the customer. The customer then loaded the program from the tape/disk into memory and ran it. Apart from the operating system, all the required parts of the program were stored on the disk. Basically, my good old Commodore 64/128 worked like this. All software was made available on either cassette tapes or 5.25 inch floppy disks. Apart from the operating system and BASIC interpreter (which were stored in the Commodore's ROM), everything required to run a program was available on the disk.

Some time later, component-based software engineering (CBSE) was introduced and received wide acceptance. Its advantages were that software components could be obtained from third parties, without having to develop them yourself, and that components with the same or similar functionality could be shared and reused across multiple programs. CBSE greatly improved the quality of software and the productivity of developers. As a consequence, however, software products were no longer delivered as self-contained products, but became dependent on the components already residing on the target systems.

Although CBSE provides a number of advantages, it also introduced additional complexity and challenges. In order to run a software program, all its dependencies must be present and correct, and the program must be able to find them. All kinds of things can go wrong while deploying a system: a dependency may be missing, or a program may require a newer version of a specific component. Newer components may also be incompatible with a program (sometimes intentionally, but sometimes accidentally, e.g. because the program relies on a bug).

For example, in Microsoft Windows (but also on other platforms), this led to a phenomenon called the DLL hell. Apart from Windows DLLs, this phenomenon occurs in many other contexts as well, such as the JAR hell for Java programs. Even the good old AmigaOS suffered from the same weakness, although the problems were not as severe as they are now, because library versions did not change that frequently.

In UNIX-like systems, such as Linux, you will notice that the degree of sharing of components through libraries is raised to almost a maximum. For these kinds of systems, it is crucial to have deployment tooling to properly manage the packages installed on a system. In Linux distributions, the package manager is a key aspect and also a distinct feature that sets one distribution apart from another. There are many package managers around, such as RPM, dpkg, portage, pacman, and Nix (which we use in our research as a basis for NixOS).

Apart from the challenges of deploying a system from scratch, many systems are also upgraded, because (in most cases) it is too costly and time-consuming to redeploy them from scratch for every change. Upgrading is typically a risky process, because files get modified and overwritten: an interruption or crash during an upgrade may have disastrous results, and an upgrade may not always give the same results as a fresh installation of a system.
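One way to mitigate these upgrade risks, and roughly the idea behind the Nix approach mentioned above, is to never overwrite an installed version in place: each version lives in its own directory, and activation is just switching a link. A minimal sketch follows; the directory names are made up for illustration, and `ln -sfn` is close to, though not strictly, an atomic replacement.

```shell
#!/bin/sh
set -e

# Each version is installed in its own directory; nothing is overwritten.
mkdir -p app-1.0 app-2.0
echo "version 1.0" > app-1.0/VERSION
echo "version 2.0" > app-2.0/VERSION

# Activate version 1.0 through a 'current' symlink.
ln -sfn app-1.0 current

# Upgrading means repointing the link; a crash leaves us with either the
# old or the new version, never a half-overwritten mixture.
ln -sfn app-2.0 current

# Rolling back would simply repoint the link to app-1.0 again.
cat current/VERSION
```

Because the old version stays intact on disk, an upgrade performed this way also behaves the same as a fresh installation of the new version, sidestepping the divergence problem described above.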

Importance of software deployment

So why is research in software deployment important?

First of all, (not surprisingly) software systems are becoming bigger and increasingly more complex. Nowadays, some software systems are not only composed of many components, but these components are also distributed and deployed on various machines in a network, working together to achieve a common goal. Service-oriented systems, for example, are composed this way. Deploying these kinds of systems manually is a very time-consuming, complex, error-prone and tedious process, and the bigger the system gets, the more likely it becomes that an error occurs.

Second, we have to be more flexible in reacting to events. For example, in a cloud infrastructure, if a machine breaks, we must be able to redeploy the system in such a way that services remain available to end-users, limiting the impact as much as possible.

Third, we want to push changes to a system in a production environment faster. Because systems are becoming increasingly complex, an automated deployment solution is essential. In Agile software development projects, a team wants to generate value as quickly as possible, for which it is essential to have a working system in a production environment as soon as possible. To achieve this goal, it is crucial that the deployment process can be performed without much trouble. A colleague of mine (Rini van Solingen), who is also a Scrum consultant, has covered this importance in a video blog interview.

Research

What are software deployment research subjects?

Mechanics. This field concerns the execution of the deployment activities: how can we make these steps reproducible, reliable and efficient? Most of the research that I do covers deployment mechanics.

Deployment planning. Where to place a component in a network of machines? How to compose components together?

Empirical research covering various aspects of deployment activities, such as: How to quantify build maintenance effort? How much maintenance is needed to keep deployment specifications (such as build specifications) up to date?

Where are software deployment papers published? Currently, there is no subfield conference about software deployment. In the past (a few years before I started my research), there were three editions of the Working Conference on Component Deployment, which has not been held since 2005.

Most deployment papers are published at various conferences, ranging from the top general ones to subfield conferences on software maintenance, testing and cloud computing. The challenging part is that (depending on the subject) I have to adapt my story to the conference where I want my paper to be published. This requires me to explain the same problems over and over again, and to integrate them with the given problem domain, such as cloud computing or testing. This is not always trivial to do, nor will every reviewer understand what the point is.

Conclusion

In this blog post, I have explained what software deployment is about and why research in this field is important. Systems are becoming much bigger and more complicated, and we want to respond to changes faster. In order to manage this complexity, we need research into automated deployment solutions.

16 comments:

Thank you for a wonderful blog post. This rings all too familiar and sounds a lot like my own experiences with getting my papers on the release, delivery, deployment, activation and usage processes published.

I too received stupid questions all too often, like "but soon everything will be offered as a service, so why worry about deployment?"

I think that the area of software deployment is not by a long shot completely defined at the moment, and there are still multiple questions that remain open. You've mentioned some already, but I would like to add the field of cluster deployments, such as upgrading a server or component in a cluster of servers that has to provide a service 24x7, like Facebook and Google do, without any downtime or performance degradation.

Don't lose faith, and please note that I have been successful at conferences like CSMR and ICSM. Also, why not get courageous and organize your own component deployment conference?

- When writing papers, it might help to sell the idea by pointing out how some companies deal with this. For example, Google has entire roles devoted to this: site-reliability engineers whose responsibilities include release management.

- Noticing that "alpha" environments, "build servers" and the like are becoming ESSENTIAL to the modern software development teams I'm working with...

- Data migration is a HUGE factor. When you have hundreds of thousands of users who have been using your software over several years, and you have to deploy an update, making sure all the old data migrates into your new system (new datatables, object models, etc) is a CRAZY problem.

Sad but true: most researchers (and many developers!) have never had anyone use their software (especially over a lifetime of more than a year), so they have no perspective on this.

Indeed, I have raised some questions, but you're absolutely right! There are many more questions unanswered.

"but soon everything will be offered as a service, so why worry about deployment?", is a good one :-)

I mean, this is totally untrue. Although with web applications/services consumers are no longer bothered by installing software on their machines (except for the fact that they may need an up-to-date and compatible browser), the components still need to be deployed somewhere.

I'd say that the consumer environment for these kinds of systems has shifted to the data centers, with all their associated complexities. I think new problems arise in these deployment scenarios, of which you have mentioned a few.

Organizing my own conference? Hmm... that sounds interesting, although I have absolutely no experience with this, and it is also going to take some time, I guess. I'm open to all suggestions :-)

Usually when I write papers, I try to use as many concrete examples as possible and convince readers that it's useful for their problem domain as well. Sometimes it works, sometimes it does not.

Data migration is indeed a very important aspect I haven't covered here. So far, this is actually one of the remaining open issues in our deployment tooling and research. I have some plans for dealing with this as well, but that's (unfortunately) still work in progress.

You're absolutely right about the fact that many researchers and developers write software that almost nobody uses, nor do they have sufficient experience with the final phase of a development cycle in software projects.

LISA (Large Installation System Administration Conference) devotes a bit of time to this topic. They are the folks who need to deploy it in the cloud! Their conference is going to be held in Boston this year, December 4–9, 2011. http://www.usenix.org/events/lisa11/index.html

Cool to see that research is being done in this field. I am the CTO of XebiaLabs, a vendor of enterprise deployment automation software (and a graduate of the University of Amsterdam :-)), and as such have some experience in this field. Just as you mention, I also see agile development as one of the drivers behind deployment becoming a more important topic. One can no longer get away with a sloppy deployment process if you are doing it again and again. The "hidden costs of deployment" are becoming ever more visible. That's also why a lot of people are talking about DevOps right now.

One of the interesting things about the subject is that it is still not clear what "deployment" is. Some people are more focused on the process (release management) side of things, while others are more into the technical (configuration management) side. And then there's the whole being in between dev and ops side of things!

If you want to meet up and talk some more about deployment, let me know! I am based in Amsterdam.

Interesting article. I have become so fed up with the complexities of deploying software that I actually wrote a commercial deployment tool from scratch to help ease the pain. Please check it out and let me know what you think. http://www.laneware.net/DSE/DeploymentStudioExpress.aspx

Nice article. I am a PhD student working in this domain as well, mainly in the area of deployment planning with requirement modeling. I have read both Slinger's articles (deployment using feature models) and yours (Disnix). They are very interesting articles. I hope to exchange more experiences with you guys. Thanks.

Good to hear that there are still people around doing research in software deployment.

The research I'm doing is more about the "mechanics" than about deployment planning, although I have one paper (the SEAMS paper) in which I integrate several deployment planning techniques, using a framework that allows you to integrate your own algorithms, which can be used to perform the entire deployment process.

Have you published any papers so far? And do you have some pointers to your research? I'm interested.

Thanks for your quick reply. It is good to hear some experiences from you.

Precisely; my research is related to several domains: distributed deployment, context-awareness and deployment requirements. I am trying to answer the question "why" when deploying/allocating application services in a distributed environment: based on what information (requirements), and how can we use this information for matching application services (most likely component-based) with execution nodes? So, I believe it is more about the deployment planning areas. Sadly, my publication has been rejected twice and I am still seeking my first successful one.

I do have some questions about your research. In Disnix, the bindings between application services and node infrastructures are described in a file called "distribution.nix"; it seems that they are decided manually, am I right? I know that Nix expressions allow some constraint descriptions, but I can't find a complete requirement model for describing deployment requirements; are there any related considerations? Please correct me if I am wrong. :)

I am sure that your experience in the domain could provide some valuable pointers for my research; it would be good if we could have further discussions. Thanks for your interest.

As you may have noticed after reading this blog post, publishing software deployment papers is challenging as it is a very uncommon research subject and generally not understood by the academic world (apart from some exceptions, of course :-) ).

Furthermore, we have no subfield conferences specialized in our discipline, so typically we have to publish our papers elsewhere. I have had a number of successes publishing at "other" conferences, as well as a number of failures. Apart from that, I struggled with some other annoyances in my research for a while (as I have described in a few more recent blog posts, such as: Software engineering fractions, collaborations and "The System").

Something that helped me a lot is starting a blog. For me, it helps to structure my mind, to divide bigger problems into smaller ones and to solve them separately. Also, blog posts are typically not bound to all these silly "rules" such as page limits, word limits, subjects, deadlines and so on.

You should also keep trying, and dive into your target domain as much as possible, integrating your "solution" with it. While doing my research, I also learned a thing or two about cloud computing, testing, reliability engineering and self-adaptive systems. It is not always easy to do, but it is the best way to get your paper accepted somewhere.

If you would like to discuss deployment further, or if you want me to take a closer look at what you're doing, perhaps it is a good idea to send me an e-mail.

About Disnix: distribution.nix indeed maps services to machines statically. However, in the SEAMS paper I have implemented "dydisnix", which is a layer on top of the "regular" Disnix. This toolset dynamically generates the infrastructure.nix model (using a discovery service) and the distribution.nix model (using deployment planning algorithms), based on technical and non-functional properties.
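For readers unfamiliar with the model being discussed: a Disnix distribution model maps services to machines defined in the infrastructure model. A minimal hypothetical example is shown below; the service and machine names are made up, so consult the Disnix documentation for the exact structure.

```nix
{infrastructure}:

{
  # Each attribute maps a service to one or more target machines.
  StaffService = [ infrastructure.test1 ];
  StaffTracker = [ infrastructure.test1 infrastructure.test2 ];
}
```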

Thanks for your opinions. Yes, I read your comments about the bias of some reviewers. And yes, it is interesting that there is a lot of ongoing research on software deployment, but a lack of dedicated conferences or workshops for the domain.

For writing down my research explorations, I always use paper and pencil. And yes, I think it is time for me to start a technical blog.

Also, I read the SEAMS article after my previous reply, and I found that there are two more file types (qos.nix and augment.nix) for describing deployment requirements. It will be very interesting to explore these further.

And here is my email: anthony.lee@telecom-bretagne.eu. I am looking forward to sharing and discussing with you. Many thanks.