Back to Square One

I grew up playing quite a few different sports, both team and individual. For me, there was little else that I would rather have done than compete on the field, court, diamond, or course. I loved sport, and I loved to compete (still do as a matter of fact), so it was a great fit. Partially motivated by getting me to focus on something other than annoying the crap out of my little brother, my parents strongly encouraged my involvement in sports of all kinds. For that, I will always be grateful. Not because I parlayed my athletic experience into a seven-figure contract, flashy cars, and a private yacht (I am still open to those things though), but because sports taught me many, many lessons. These lessons went far beyond how to make a shot, hit a ball, or return a serve. Many of these lessons were equally applicable to sports and 'real life', even if I did not know it then.

Now, there's a chance you grew up in a similar fashion, but if you have gotten this far in the post (I'm sure I dropped some readers after that opening act), you are probably asking yourself what this has to do with anything remotely related to cloud computing. Well, it all goes back to those lessons that sports taught me. When I look at ongoing work, which in many cases is laying the foundation for cloud-based environments, one of those old lessons jumps out at me: Sometimes, you just have to go back to the basics!

While this lesson is probably applicable to many things going on in the cloud space now, I want to home in on virtual image construction in particular. Virtual images are nothing new, and many companies have been making use of them for quite some time. Given that, you may be thinking that users and image providers have mastered the art of image construction. If that is your belief, I can only tell you that you are not seeing the same thing as I am.

In a significant number of cases, users I talk with who are using virtual images as a basis for their cloud or enterprise-wide virtualization efforts are flat-out struggling to manage their virtual image inventory. Virtual images make environments far easier to deploy and consume, and relatively speaking, they are easy to create. This has been the perfect combination for an explosion in the volume of images a company needs to manage and maintain. Over time, this kind of virtual image sprawl can cripple or completely derail a company's cloud or virtualization efforts.

Now, you may be asking: if virtual image sprawl is the eventual outcome, why would I even want to adopt the use of virtual images? The answer is that sprawl does not have to be the outcome. If you go back to the basics, the basics of effective virtual image construction that is, you can put your company in a good position to avoid a potentially crippling increase in virtual image inventory.

There is an important realization when building a virtual image. You cannot capture every piece of configuration for an environment and preserve it in a virtual image. This may seem basic, but trying to do so is often the fundamental mistake users make when constructing virtual images. For example, if a user is constructing a virtual image containing a web server, their initial reaction may be to preserve configuration information down to the level of proxy directives in the virtual image. It may make that virtual image highly consumable in that it requires zero configuration actions after deployment, but it also restricts its use to cases where those proxy directives apply. If someone wants to deploy that image with a different set of proxy directives, they have to deploy and perform manual updates, or worse yet, they follow the example set by the author of the initial image and create a new image with the proxy directives they need. Now the company has two images that provide the same basic functionality. Clearly, we have a problem.

With that said, the first step in constructing a virtual image should be deciding what to install and capture directly into the image. These things are often obvious: large binaries, software with long-running installations, content common to most classes of the image's users, etc. The key here is fighting the temptation to stuff more and more content into an image because usually all that does is restrict its applicability to a constrained set of use cases.

The next step is a bit trickier and takes a little more design work on the part of the image author. Based on what you install into an image, you need to decide what variations of content configuration image deployers may need. For example, going back to our web server virtual image, different deployers may need different proxy directives in their deployed environment. This amount of variance does not warrant the creation of a unique virtual image, but you do not want to push that configuration work on the user either.

In order to allow variations of configuration for the deployed environment, you need to identify input parameters that deployers should be able to pass into the image deployment process. Once identified, you need a set of scripts that run at image activation time, act on those input parameters, and apply the desired configuration to the deployed environment. This is not a radical idea. In fact, it is the kind of activation framework model enabled by the Open Virtualization Format via its OVF envelope.
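To make the activation idea a bit more concrete, here is a minimal sketch in Python of how an activation script might read deployer-supplied parameters from an OVF environment document. The property keys (webserver.proxy_rules, webserver.cache_enabled), the sample document, and the function name read_ovf_properties are all hypothetical and chosen only for illustration. How the document actually reaches the guest depends on the platform, so the sketch simply parses a string.

```python
import xml.etree.ElementTree as ET

# Namespace used by OVF environment documents.
OVF_ENV_NS = "http://schemas.dmtf.org/ovf/environment/1"


def read_ovf_properties(xml_text):
    """Return the key/value pairs found in Property elements of an OVF environment document."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter(f"{{{OVF_ENV_NS}}}Property"):
        key = prop.get(f"{{{OVF_ENV_NS}}}key")
        value = prop.get(f"{{{OVF_ENV_NS}}}value")
        if key is not None:
            props[key] = value or ""
    return props


# Hypothetical environment document with two illustrative properties.
SAMPLE_ENV = """<Environment xmlns="http://schemas.dmtf.org/ovf/environment/1"
             xmlns:oe="http://schemas.dmtf.org/ovf/environment/1">
  <PropertySection>
    <Property oe:key="webserver.proxy_rules"
              oe:value="/app=http://backend.example.com:8080/app"/>
    <Property oe:key="webserver.cache_enabled" oe:value="true"/>
  </PropertySection>
</Environment>"""

if __name__ == "__main__":
    for name, value in read_ovf_properties(SAMPLE_ENV).items():
        print(f"{name} = {value}")
```

The point of the sketch is the separation of concerns: the image carries the script, while the values arrive at deployment time, so one image can serve many configurations.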

For completeness, let's look at what a web server virtual image may look like if constructed according to these concepts. First, we start by installing the operating system and web server binaries. We may extend this to include other necessary components (e.g., enterprise-wide firewall software), but we do not capture much beyond basic binaries (little to no configuration). Once we have the basic components installed, we identify input parameters deployers should be able to specify. These may include proxy directives, cache directives, authentication configuration, and more. Once identified, we write up a few simple scripts that act on the input and configure the web server. We then wrap all of this up in a framework (like one enabled by OVF). The framework's job is to automatically call our scripts during image activation and ensure user input flows down to that execution process.
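As a rough illustration of one of those "simple scripts", the following Python sketch turns a deployer-supplied parameter into proxy directives. It assumes, purely for the sake of example, that the web server is Apache HTTP Server with mod_proxy (so the output uses ProxyPass and ProxyPassReverse), that parameters arrive as a plain dictionary, and that proxy rules are encoded as "path=url" pairs separated by semicolons under a hypothetical key named webserver.proxy_rules. None of these choices come from a particular product; a real image would use whatever its web server and activation framework expect.

```python
def parse_proxy_rules(raw):
    """Parse 'path=url;path=url' into a list of (path, url) pairs."""
    rules = []
    for item in raw.split(";"):
        item = item.strip()
        if not item:
            continue
        # Split on the first '=' only, so URLs may contain '=' themselves.
        path, sep, url = item.partition("=")
        if not sep or not path.strip() or not url.strip():
            raise ValueError(f"malformed proxy rule: {item!r}")
        rules.append((path.strip(), url.strip()))
    return rules


def render_proxy_config(params):
    """Build a proxy configuration fragment from activation parameters."""
    lines = ["# Generated at activation time; do not edit by hand."]
    for path, url in parse_proxy_rules(params.get("webserver.proxy_rules", "")):
        lines.append(f"ProxyPass {path} {url}")
        lines.append(f"ProxyPassReverse {path} {url}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical input a deployer might supply at deployment time.
    example = {
        "webserver.proxy_rules": "/app=http://backend.example.com:8080/app"
    }
    print(render_proxy_config(example), end="")
```

Two deployers who need different proxy rules now pass different values to the same image instead of each capturing an image of their own.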

That is admittedly a very simple look at the process, but I think it provides a nice overview of an effective methodology for virtual image construction. If you are out there creating virtual images, take precautions against the curse of sprawl. I hope that these tips provide you with some ammo in that effort!

Dustin Amrhein joined IBM as a member of the development team for WebSphere Application Server. While in that position, he worked on the development of Web services infrastructure and Web services programming models. In his current role, Dustin is a technical specialist for cloud, mobile, and data grid technology in IBM's WebSphere portfolio. He blogs at http://dustinamrhein.ulitzer.com. You can follow him on Twitter at http://twitter.com/damrhein.