SaltStack for Flexible and Scalable Configuration Management

Configuration management is the foundation that makes modern infrastructure possible. Tools that enable configuration management are required in the toolbox of any operations team, and many development teams as well. Although all the tools aim to solve the same basic set of problems, they adhere to different visions and exhibit different characteristics. The issue is how to choose the tool that best fits each organization's scenarios.

This InfoQ article is part of a series that aims to introduce some of the configuration tools on the market, the principles behind each one and what makes them stand out from each other. You can subscribe to notifications about new articles in the series here.

The state of the union

A couple of months ago at a combined meeting of the Montreal Python and DevOps user groups I gave a presentation on the various tooling options available for configuration management. Most systems administrators, developers and IT operations pros utilize some sort of tooling, whether home grown, from open source or commercially sourced, to automate infrastructure and configure all the things that keep our technology working as expected. In this presentation I attempted to provide an objective view of the most popular options. Depending on who you ask, how they think, and what they are managing, almost everybody has a favorite.

In the olden days of computing, when a company only had a handful of servers to maintain, IT people manually and frequently tuned and re-tuned these servers to keep their LOLcats website available and running smoothly.

A lot has changed since then, and a lot has changed in just the last few years. Massive data centers and scaled-out server farms now dominate and it is no longer reasonable to manage individual servers by hand. Various configuration management tools are now available, but even some of the tools built within the last decade were not designed to accommodate the levels of scale prevalent today.

And this is exactly why SaltStack was created. While we have hobbyists running SaltStack masters on a Raspberry Pi, or using it to manage a home network with a couple of servers in a basement, SaltStack is built for speed and scale. This is why it is used to manage large infrastructures with tens of thousands of servers at LinkedIn, Wikimedia and Google. The market didn’t need just another configuration management tool; it needed configuration management built for modern, Web-scale infrastructure. In this article I will focus on the SaltStack approach to configuration management.

A fast remote execution platform comes first

SaltStack was originally created to be an extremely fast, scalable and powerful remote execution engine for efficient control of distributed infrastructure, code and data. Later, configuration management was built on top of the remote execution engine, and it leverages the same core capabilities.

"SaltStack can do that," is a common refrain in the Salt community. To Salt users, it is their data center Swiss Army knife. The persistent, yet lightweight SaltStack master / minion topology provides extensive control and functionality within any environment all from a single platform without third-party dependencies.

The Salt master is a daemon that controls the Salt minions. The Salt minion is also a daemon; it runs on the controlled system and receives commands from the Salt master. SaltStack is designed to handle ten thousand minions per master, and that is a very conservative figure. This scale is possible through asynchronous, parallel command and control for real-time management and communication within any data center system.

For lighter-weight use cases that don’t require real-time control or extreme speed or scale, SaltStack also offers Salt SSH, which provides “agentless” systems management.

Also, remote execution and configuration management are better together. Each can only go so far before needing the other to provide true infrastructure automation and control. The SaltStack platform includes the Salt Reactor system for event-driven activities like continuous code deployment or autoscaling resources.

A basic understanding of the SaltStack event system is required to understand the SaltStack reactor system. The event system is a local ZeroMQ PUB interface which fires SaltStack events. This event bus is an open system used for sending information that notifies SaltStack and other systems about operations. The event system fires events according to very specific criteria. Every event has a tag, and event tags allow for fast top-level filtering of events. In addition to the tag, each event has a data structure: a dictionary that contains information about the event.

The SaltStack reactor system consumes SaltStack events and executes commands based on a logic engine that allows events to trigger actions. SaltStack can consume and react to its own events or to events from other systems like Jenkins.
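To make the mapping from event tags to actions concrete, here is a minimal sketch of a reactor setup. It assumes the master configuration accepts a reactor section and that a starting minion fires an event tagged salt/minion/<id>/start. The path /srv/reactor/start.sls and the ID highstate_on_start are hypothetical names chosen for illustration, and the exact reactor function syntax may differ between Salt releases.

```yaml
# /etc/salt/master (excerpt): map an event tag pattern to reactor SLS files.
reactor:
  - 'salt/minion/*/start':
    - /srv/reactor/start.sls   # hypothetical path

# /srv/reactor/start.sls (a separate file, rendered with Jinja):
# "data" is the event's dictionary; data['id'] is the minion that started.
highstate_on_start:            # hypothetical ID
  local.state.highstate:
    - tgt: {{ data['id'] }}
```

With something like this in place, a newly started minion would be brought to its defined state without anyone running a command by hand.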

The recent Heartbleed vulnerability is a good example of how our customers are using SaltStack to control all the bits and pieces of an infrastructure. SaltStack was used to diagnose and remediate Heartbleed in milliseconds across large infrastructures. For example, these tweets from WebPlatform.org and WikiMedia highlight how easy SaltStack made the fix:

Infrastructure as data, not code

SaltStack has reduced the learning and adoption curve for configuration management by implementing an “infrastructure as data” approach, which is substantially less complex than traditional “infrastructure as code” methods while not sacrificing any functionality or capability. “Infrastructure as code” typically requires users to understand a complex programming language or a domain-specific language. The SaltStack approach is human readable, and of course the machines easily consume it as well.

While written in Python, SaltStack configuration management is language agnostic and utilizes simple, human-readable YAML files and Jinja templates.
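As a rough illustration of how YAML and Jinja combine, the sketch below picks a package name based on the operating system family and then declares that the package should be installed. The file name webserver.sls and the ID install_apache are hypothetical; the os_family grain is assumed to be available on the minion.

```yaml
# /srv/salt/webserver.sls (hypothetical file name)
# Jinja is rendered first, producing plain YAML for Salt to consume.
{% set apache = 'httpd' if grains['os_family'] == 'RedHat' else 'apache2' %}

install_apache:
  pkg.installed:
    - name: {{ apache }}
```

The rendered result is just data: an ID, a state function and its arguments.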

DevOps and Web-scale IT require speed, agility and communications. The smaller the learning curve, the bigger the competitive advantage available. Significant, yet unnecessary, investments in "infrastructure as code" hinder innovation and deployment in what should be a very fast-moving discipline of getting servers and the software that runs on them into a stable, reusable, production-ready state as quickly as possible. Why take the Space Shuttle to get to the corner market when it is easier to walk or ride a bike?

Extreme flexibility

SaltStack is composed of many modular layers, all leveraging the same fast communication bus, which allows parallel communication with as many servers as need to be told what to do. These layers of commands, routines and functions provide expansive control over a computing infrastructure and everything else in the data center. Because SaltStack is a highly efficient remote execution engine at its core, many of our users use it to tell Puppet manifests what to do, just as it can be used to manage any other piece of software, cloud or virtualization platform.

For the past few years there has been a holy war of sorts between folks who prefer either a declarative or an imperative approach to configuration management. We say, “stick a fork in that debate.” SaltStack can be used for either declarative or imperative configuration management depending on how your brain works or on how your systems need to be managed.

SaltStack configuration management can either execute in an imperative fashion where things are executed in the order in which they are defined, or in a declarative fashion where the system decides how things are executed with dependencies mapped between objects.

Imperative ordering is finite and generally considered easier to write. Declarative ordering is much more powerful and flexible, but it is generally considered more difficult to create.

SaltStack has been created to get the best of both worlds. States are evaluated in a finite order, which guarantees that they always execute in the same order regardless of the system that is executing them, while the state runtime is declarative, making Salt fully aware of dependencies. In addition, the state_auto_order option was recently added to SaltStack so that states are evaluated in the order in which they are defined in Salt state files.

The evaluation order makes it easy to know what order the states will be executed in, but it is important to note that the requisite system will override the ordering defined in the files. The order option, when set on a state, will also override the order in which states are defined in Salt state files.
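A small sketch may help show how requisites and the order option interact with definition order. The file path, the salt:// source and the epel-release package are illustrative assumptions, not part of any particular formula.

```yaml
# The file is defined first, but the require requisite forces the
# package to be installed before the file is managed.
/etc/httpd/conf/httpd.conf:
  file.managed:
    - source: salt://apache/httpd.conf
    - require:
      - pkg: httpd

httpd:
  pkg.installed:
    - name: httpd

# The order option pins a state to an explicit position in the run,
# regardless of where it appears in the file.
epel_repo:
  pkg.installed:
    - name: epel-release   # hypothetical example package
    - order: 1
```

Here the declared dependency, not the position in the file, decides that the package comes before the configuration file.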

There is power in the construct when SaltStack provides hooks (such as “on fail” or “on change”) into a declarative routine. These hooks allow a routine to fork in the middle of a configuration management run if a step fails at first but might work if tried a different way. The SaltStack prerequisite is another example of this. A prerequisite does a thing to a system, but only if something else is going to happen in the future. It is in-line predictive analysis in an idempotent way. It asks the question, “Am I about to deploy code? Yes? Then let’s take this server out of the load balancer or shut down Apache, but only if I’m going to make a change to the system.” The SaltStack “fail hard” flag gives further power to imperative configuration management by altering the flow of how things get deployed, instead of just bailing on the routine.
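The hooks described above might be expressed as follows. This is a sketch under assumptions: the IDs, paths and commands are hypothetical, and the requisite names prereq, onfail and failhard are assumed to be available in the Salt release in use.

```yaml
# prereq: restart Apache gracefully only if the code deployment
# below would actually change something.
graceful-down:
  cmd.run:
    - name: service httpd graceful
    - prereq:
      - file: site-code

site-code:
  file.recurse:
    - name: /opt/site_code          # hypothetical deployment path
    - source: salt://site/code      # hypothetical source

# onfail: run only when the deployment state fails.
notify-failure:
  cmd.run:
    - name: logger "site-code deployment failed"
    - onfail:
      - file: site-code

# failhard: if this state fails, stop the run immediately.
base-packages:
  pkg.installed:
    - name: httpd
    - failhard: True
```

Each requisite is plain data attached to a state, so the control flow stays visible in the state file itself.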

Using SaltStack to install a LAMP stack on Red Hat

A simple scenario for SaltStack configuration management is installing a LAMP stack. While more complete formulae exist in the SaltStack formulas organization on GitHub (https://github.com/saltstack-formulas), this example should sufficiently demonstrate the basics of a formula, which is a pre-written Salt state for configuration management of a specific resource. SaltStack formulas and states can be used for tasks such as installing a package, configuring and starting a service, setting up users or permissions, and many other common tasks.

Because this example is designed for Red Hat Linux-based environments, which include Python 2.6 as part of the current base installation, the Linux and Python parts are already finished. All that needs to be done is to set up an Apache web server, and a MySQL database server for the web application to use.

Before starting, ensure that the /srv/salt/ directory exists on the server.

Because a web application needs its database server to be installed before it can be functional, we will define the database server first.
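A minimal sketch of such a declaration, assuming the package is named mysql-server and the service mysqld (as is typical on Red Hat-based systems of this era), might look like this:

```yaml
# /srv/salt/mysql.sls
# Install the MySQL server package.
mysql-server:
  pkg.installed:
    - name: mysql-server

# Keep the service running and enabled at boot, once the package exists.
mysqld:
  service.running:
    - name: mysqld
    - enable: True
    - require:
      - pkg: mysql-server
```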

Because this example does not include any files other than the declaration, we can store it simply as /srv/salt/mysql.sls. However, the Apache installation is more complex, because it includes a configuration file. This file is copied up to the web server using the file.managed function, which supports enhanced functionality such as templating. To accommodate this, create an apache/ directory inside of /srv/salt/, with the following file:
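A minimal sketch of /srv/salt/apache/init.sls, assuming the managed configuration lives at /etc/httpd/conf/httpd.conf and that a copy of it is stored next to the state file, could look like this:

```yaml
# /srv/salt/apache/init.sls
httpd:
  pkg.installed:
    - name: httpd
  service.running:
    - enable: True
    - require:
      - pkg: httpd
    # Restart the service whenever the managed file changes.
    - watch:
      - file: /etc/httpd/conf/httpd.conf

/etc/httpd/conf/httpd.conf:
  file.managed:
    # salt://apache/httpd.conf resolves to /srv/salt/apache/httpd.conf
    - source: salt://apache/httpd.conf
    - user: root
    - group: root
    - mode: 644
    - require:
      - pkg: httpd
```

Adding a template option to file.managed is one way the configuration file could later be rendered through Jinja instead of being copied verbatim.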

Because more files are involved in this formula, we create an entire directory to store them in. This directory includes an init.sls, as well as a copy of the httpd.conf file that is being managed. These are now tied together with a top.sls file:

    # /srv/salt/top.sls
    base:
      web*:
        - apache
      db*:
        - mysql

This file is the glue that holds these states together, and defines which servers have which states applied to them. Note that this file does not refer to any specific paths. That is because SaltStack will look inside the same directory as the top.sls file for the names defined here. When SaltStack sees the name in the top.sls, it will look for either a .sls file corresponding with its name (for example, mysql.sls) or a directory corresponding with that name, which includes an init.sls file (for example, apache/init.sls).

This definition will ensure that any servers whose names start with “web” (such as web01 or even web01.example.com) will have the Apache state applied to them, and any servers whose names start with “db” (such as db01 or db01.example.com) will have the MySQL state applied to them.

To apply these states to all servers, you need to kick off a highstate run:

salt '*' state.highstate

A highstate is the combination of state data (packages, services, files, etc.) that will be applied to the target system.

However, this presents another challenge. As I mentioned above, these web servers are essentially useless without working database servers. This scenario works fine if both already exist, and we are just adding new servers to the mix, but what about a clean setup with no servers at all?

This is where the SaltStack orchestration engine comes in. It is possible to define the order in which machines are deployed, by defining the order in which states are to execute:
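A minimal sketch of such an orchestration file, assuming it is stored as /srv/salt/myorchestration.sls and reuses the web* and db* targets from top.sls (the IDs db_setup and web_setup are hypothetical), might be:

```yaml
# /srv/salt/myorchestration.sls
# Run a highstate on the database servers first.
db_setup:
  salt.state:
    - tgt: 'db*'
    - highstate: True

# Run a highstate on the web servers only after db_setup has finished.
web_setup:
  salt.state:
    - tgt: 'web*'
    - highstate: True
    - require:
      - salt: db_setup
```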

This defines that the web state that we have defined will not be allowed to execute until the database state that we defined has finished running. In order to kick off this state, run:

salt-run state.orchestrate myorchestration

Note: in Salt 0.17.x, this command would be:

salt-run state.sls myorchestration

Addressing configuration drift

The above scenario is fine for initially provisioning a group of servers. If run on a schedule, it will also mitigate issues with configuration drift: if the httpd.conf file gets changed on the server, SaltStack will set it right back where it needs to be, and report to the user what changes were made to enforce the correct state. But what about package versions?

When a pkg.installed state is declared, SaltStack will check with the underlying package manager to see if that package is already installed. If it is, then the state has been achieved, and no further action is performed. However, if it is not installed, it will tell the package manager to install that package, which (depending on the environment) will normally search for the latest available version of that package and install that.

Over time, this can leave servers running different versions of a package, which can cause issues that are difficult to troubleshoot. One solution is to use the pkg.latest state instead, to make sure that all servers are always running the latest version of a package:

    httpd:
      pkg.latest:
        - name: httpd

However, this can also be problematic. As soon as a new version is made available, all of the servers will try to download and install it. If you’re not expecting a new version, and you haven’t had time to perform your own internal testing, this can cause serious problems. It’s much better to lock down packages to a specific version:

    httpd:
      pkg.installed:
        - name: httpd
        - version: 2.2.15

This ensures that packages will not be upgraded until the state declaration has been explicitly updated to do so.

SaltStack test mode

Another important feature involves knowing before changes are made, which changes are about to be made. This is as easy as adding another option to the highstate command:

salt '*' state.highstate test=True

When running in test mode, states that are already where they need to be will be displayed in green, whereas states that are not yet applied will be displayed in yellow. Many users find it critical to be able to see what changes need to be performed, before they are actually performed.

Of course, test mode is also available with the orchestration engine:

salt-run state.orchestrate myorchestration test=True

This will evaluate the highstate, in the order defined in the myorchestration.sls file, and display in the same manner what changes would be made if the command were run outside of test mode.

Conclusion

SaltStack has a distinct second-mover advantage in configuration management, but don’t take our word for it. It is very easy to get up and running with SaltStack and we have an extremely vibrant and helpful community to help do-it-yourselfers along the way, or we have SaltStack Enterprise for organizations looking for assistance from the SaltStack services and support team.

About the Author

Joseph Hall has been around the block. He has worn hats for technical support, QA engineer, web programmer, Linux instructor, systems administrator, director and cloud computing engineer. Somewhere along the line he also became a trained chef and bartender. He was the second person to commit code to the Salt project, and currently works at SaltStack as a core developer leading Salt Cloud development efforts.
