Rapid7 Blog

Metric-driven Smart Deploys

Automated deployment isn’t just a wonderful thing — it’s a necessity when it comes to providing consistent, error-free delivery without eating up all of your team’s time and resources. You create a set of scripts to cover standard, known deployment cases (including the most likely contingencies), then stand back and let them do all of the heavy lifting. It doesn’t just save time compared to manual deployment — it also reduces errors, while at the same time providing you with a perfect breadcrumb trail for tracing bugs when they do occur.

And for continuous deployment, automation is a must: the new, QA-approved build goes into one end of the automated-deployment pipeline and comes out the other, installed, configured, and live.

Now, consider taking deployment to the next level by making it smarter. What do I mean by smart deployment? The question is whether you can examine log information from previous deployments and their aftermath, over time, and use what you find to adjust your deployment scripts.

The idea is that smart deployment has two phases of its own that parallel the manual and automated phases of standard deployment. During the first phase, you examine the relevant logs after each deploy, looking at key metrics such as process loads, hanging requests, and the other standard operations figures. Any deployment script has some basic assumptions written into it about such things as the deployment environment and the responses from the target systems. It also includes some implicit (if not explicit) assumptions about the post-deployment operating environment: the likely load under peak operating conditions, for example. When you study both the deployment and operating logs over time, you can build a picture of how well those assumptions hold up, then adjust your deployment script accordingly.
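As a rough illustration of checking a script's assumptions against what the logs show, here is a minimal Python sketch. The function name `check_assumptions`, the metric names, and the 10% tolerance are all hypothetical choices made for this example, not part of any particular tool.

```python
def check_assumptions(assumed, observed, tolerance=0.10):
    """Return metrics whose observed value exceeds the assumed value
    by more than the given fractional tolerance.

    Both arguments are plain dicts of metric name -> number. The metric
    names are illustrative; use whatever your own logs report.
    """
    violations = {}
    for name, assumed_value in assumed.items():
        if name not in observed:
            continue  # nothing logged for this metric, so nothing to compare
        actual = observed[name]
        if actual > assumed_value * (1 + tolerance):
            violations[name] = {"assumed": assumed_value, "observed": actual}
    return violations


# Values the deployment script was written around (hypothetical).
assumed = {"peak_requests_per_sec": 500, "hung_requests": 5}
# Values pulled from post-deployment logs (hypothetical).
observed = {"peak_requests_per_sec": 640, "hung_requests": 3}

violations = check_assumptions(assumed, observed)
```

In this example only `peak_requests_per_sec` is flagged, which is the kind of finding that would prompt a manual adjustment to the script during the first phase.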

At this point, you’re manually examining the deployment logs, and manually adjusting the scripts, much like the process of manual deployment itself; but you are also developing a clearer and more subtle understanding of what goes on in the target system, both during deployment and post-deployment. And that is a key factor in the next phase — smart automated deployment.

Smart automated deployment is adaptive automated deployment. During this phase, you don’t simply take what you’ve learned from examining the deployment and operating logs and write it into your deployment script. Now that you understand what you’re looking at, and what you need to look for, you place that process in the script itself, so that the deployment system examines the logs, and makes decisions based on what it finds.

You begin, of course, by automating your log analysis in a way that brings together data from the entire range of development, deployment, and operations logs. (This is, as we discussed in a previous post, something which you should be doing anyway.) Your automated log analysis system picks out key metrics and passes them along to the deployment system, which makes decisions about the next deployment based on those metrics, orchestrating the various deployment tools under its control accordingly.
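To make the hand-off concrete, here is a minimal sketch of a log analysis step that reduces raw log lines to a small digest of key metrics. It assumes, purely for illustration, that each log line is a JSON object with `metric` and `value` fields; real log formats will differ, and `build_digest` is a hypothetical name.

```python
import json
from collections import defaultdict
from statistics import mean


def build_digest(log_lines):
    """Reduce JSON log lines to a per-metric summary (mean, max, count).

    Lines that are not valid metric records are skipped rather than
    treated as errors, since real logs mix many kinds of entries.
    """
    samples = defaultdict(list)
    for line in log_lines:
        try:
            record = json.loads(line)
            samples[record["metric"]].append(float(record["value"]))
        except (ValueError, KeyError, TypeError):
            continue
    return {
        name: {"mean": mean(values), "max": max(values), "count": len(values)}
        for name, values in sorted(samples.items())
    }


lines = [
    '{"source": "ops", "metric": "requests_per_sec", "value": 100}',
    "plain text line that is not a metric record",
    '{"source": "deploy", "metric": "requests_per_sec", "value": 300}',
    '{"source": "ops", "metric": "hung_requests", "value": 2}',
]
digest = build_digest(lines)
```

The deployment system then needs to read only the digest, not the raw logs, which keeps the decision logic small and easy to test.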

This means, for example, that if the logs show that a server is starting to approach the maximum number of user requests on a consistent basis, this information would get passed to the deployment system, which could then add a load balancer.
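A decision rule of that kind might be sketched as follows. The 80% threshold, the 75% "consistent" fraction, and the action names are assumptions chosen for the example; the function only returns a recommendation and does not call any real provisioning tool.

```python
def plan_capacity(request_samples, max_requests, threshold=0.8, min_fraction=0.75):
    """Recommend adding a load balancer when load is consistently near the limit.

    request_samples: recent request counts, one per sampling interval.
    max_requests:    the most requests the server can handle per interval.
    threshold:       fraction of max_requests that counts as "near the limit".
    min_fraction:    share of samples that must be near the limit.
    """
    if not request_samples:
        return "no_change"
    near_limit = sum(1 for r in request_samples if r >= threshold * max_requests)
    if near_limit / len(request_samples) >= min_fraction:
        return "add_load_balancer"
    return "no_change"


busy = plan_capacity([850, 900, 820, 870], max_requests=1000)
spiky = plan_capacity([300, 950, 320, 310], max_requests=1000)
```

Here `busy` comes out as `"add_load_balancer"` and `spiky` as `"no_change"`, because a single peak does not amount to a consistent pattern.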

To do this in a way that genuinely is smart, of course, the log analysis tool or the deployment scripting system (or perhaps both) would need to not only recognize user requests as a key metric, but also recognize the difference between a pattern of high user request levels and intermittent request level peaks. Ideally, it would also be able to pick up a pattern of steadily increasing user requests over time, and to make some intelligent projections based on that pattern.
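One simple way to tell these patterns apart is sketched below, using an ordinary least-squares trend line for the projection. This is an illustrative approach rather than the only reasonable one; the function names, category labels, and cutoff values are all hypothetical.

```python
def fit_trend(samples):
    """Least-squares line through (0, s0), (1, s1), ...; returns (slope, intercept)."""
    n = len(samples)
    if n < 2:
        return 0.0, (float(samples[0]) if samples else 0.0)
    x_mean = (n - 1) / 2
    y_mean = sum(samples) / n
    num = sum((i - x_mean) * (y - y_mean) for i, y in enumerate(samples))
    den = sum((i - x_mean) ** 2 for i in range(n))
    slope = num / den
    return slope, y_mean - slope * x_mean


def classify_load(samples, limit, high=0.8, sustained=0.75, rising_slope=0.01):
    """Label a series of request counts.

    sustained_high:     most samples are at or above high * limit.
    rising:             the trend climbs by at least rising_slope * limit per interval.
    intermittent_peaks: some samples are high, but not consistently.
    """
    if not samples:
        return "normal"
    high_fraction = sum(1 for s in samples if s >= high * limit) / len(samples)
    if high_fraction >= sustained:
        return "sustained_high"
    slope, _ = fit_trend(samples)
    if slope >= rising_slope * limit:
        return "rising"
    return "intermittent_peaks" if high_fraction > 0 else "normal"


def intervals_until_limit(samples, limit):
    """Project how many more intervals until the trend line reaches the limit."""
    slope, intercept = fit_trend(samples)
    if slope <= 0:
        return None  # flat or falling trend: no projected crossing
    current = intercept + slope * (len(samples) - 1)
    return max(0.0, (limit - current) / slope)


growing = [400, 450, 500, 550, 600, 650]
label = classify_load(growing, limit=1000)
remaining = intervals_until_limit(growing, limit=1000)
```

A straight-line fit is a deliberately crude projection. It is enough to separate steady growth from isolated spikes, but seasonal or bursty traffic would call for a better model.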

A smart deployment system might also be able to pick up on such things as increased demands on system resources due to other applications, or shifts in user behavior which change the demands placed on various resources. It should be able to detect changes over time in the environment and in system and user behavior, to make some basic projections based on those changes, and to adjust configuration and resources in accordance with those projections.

In practice, of course, not all of these decisions can be automated, but once you’ve spent enough time studying the log analysis output, you should have a pretty clear idea of which ones can be handled by the scripting system, and which ones should be left in human hands.
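One way to encode that split is an explicit allowlist of decisions the script may apply on its own, with everything else queued for a person. The decision names below are hypothetical placeholders for whatever your own analysis turns up.

```python
# Decisions the team has agreed the script may apply unattended (hypothetical).
AUTO_APPROVED = frozenset({"add_load_balancer", "scale_out"})


def route_decisions(decisions, auto_approved=AUTO_APPROVED):
    """Split proposed decisions into (automated, needs_review), preserving order."""
    automated, needs_review = [], []
    for decision in decisions:
        if decision in auto_approved:
            automated.append(decision)
        else:
            needs_review.append(decision)
    return automated, needs_review


automated, needs_review = route_decisions(
    ["scale_out", "change_database_engine", "add_load_balancer"]
)
```

Anything not on the list falls through to review by default, so a new kind of decision is never automated by accident.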

None of this, by the way, needs to conflict with the basic principles of good automated deployment, and in fact, when done the right way, smart automated deployment should strongly support those principles. A smart, adaptive deployment script is the embodiment of infrastructure as code; the deployment system itself becomes an analytic and decision-making tool, doing what software is supposed to do – handling the kind of definable, programmable, regular, and frequently repetitive tasks that can be easily automated.

Does an adaptive deployment script violate the dictum that deployments should be repeatable? It shouldn’t — if it makes its decisions based on a digest of relevant metrics passed to it by the log analysis system, it should make the same decisions every time that it’s fed the same values for those metrics. Given the same initial conditions, it should produce the same results, just like any other well-behaved deployment script.
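A sketch of what that looks like in code: the planning step is a pure function of the digest, with no clock, randomness, or network access, and its output can be fingerprinted to confirm that two runs agree. The digest keys, the 80% target utilization, and the function names are assumptions made for this example.

```python
import hashlib
import json
import math


def make_plan(digest, target_utilization=0.8):
    """Derive a deployment plan from the metrics digest alone.

    Because the result depends only on the arguments, the same digest
    always yields the same plan.
    """
    usable = digest["max_requests_per_server"] * target_utilization
    servers = max(1, math.ceil(digest["mean_requests"] / usable))
    return {"servers": servers, "add_load_balancer": servers > 1}


def plan_fingerprint(plan):
    """Stable hash of a plan, for comparing runs or recording in the deploy log."""
    canonical = json.dumps(plan, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


first = make_plan({"mean_requests": 2500, "max_requests_per_server": 1000})
second = make_plan({"max_requests_per_server": 1000, "mean_requests": 2500})
same = plan_fingerprint(first) == plan_fingerprint(second)
```

Recording the digest and the plan fingerprint alongside each deploy also makes it possible to replay a past decision and check that the script still reaches it.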

When would you want to use this kind of smart automated deployment?

High-volume websites, web-based applications, server-based applications, and other systems with a large user base are generally dynamic rather than static, so the demands on them keep changing. Any environment characterized by rapidly evolving system or user behavior requires continuous adaptation.

If a deployment environment is near-static, or changes only slowly and predictably over time, a static, non-adaptive deployment script may be good enough. But for a dynamic environment, particularly one with an expanding user base, rapid changes in user behavior, or shifting demands on system resources, a smart, adaptive deployment script may be just the ticket.