Security From the Ground Up, Part 1: Choosing the Right Operating System

If you’re anticipating a move to the cloud — or you’re continuing to grow and scale in the cloud — it’s a best practice to include security in your thinking from the outset. Now’s the time to incorporate security into your cloud strategy so you can build it in from the ground up.

Over the coming weeks, I will be discussing how to think about important decisions that should be made early on to ensure an easy-to-maintain, long-term security posture, and how to use Threat Stack to build a secure base for your applications before you deploy systems to production.

In this first installment, I want to discuss some important aspects of cloud operating systems. I will be talking specifically about AWS in certain cases, but my recommendations apply to any cloud provider and also to both Windows and Linux.

Choosing an operating system for your platform is a complex and important decision. Compatibility with existing tooling, ease of management, and, ultimately, your familiarity with the platform are often the determining factors when you choose the operating system. For reasons we are all familiar with, these (sometimes) near-sighted choices often become long-term commitments for operations professionals. In the cloud, this can be further complicated by the source of the operating system as well as the way those initial cloud images are built and deployed.

When choosing an operating system to launch your cloud products on, you need to think of three things before you even start considering role-based access, authentication, or network security systems:

The provenance of the cloud images: Who built them? What is already installed and configured? What processes are configured to launch, and what services start automatically on these systems? Have you considered building your own cloud images? If you are not using official sources or building your own AMIs, why not?

The support status of the OS versions you choose: Is the operating system you are using currently in support? Is the OS End-of-Life’d (EOL)? Is it still receiving important security updates?

Test environment and process: Do you have an environment or process to test new or updated operating systems for your cloud images?

Provenance

Provenance is a term that is most often related to art, and it means the place of origin or the earliest known history of something. It also means a record of ownership of a work of art or an antique, and as such is used as a guide to authenticity or quality when art is being purchased. By displaying a clear chain of ownership, a record of provenance can guarantee, to a degree that is acceptable for insurers, that the art is both original and has not been stolen or purchased on the black market. For the purpose of cloud images, by knowing the provenance, you can be assured that you are running an official source and not, perhaps, a third-party-built OS with potentially compromised software preinstalled.

While history is important, simply relying on information that is available in the cloud marketplaces is insufficient because the provenance is often obscured or completely unavailable. Fortunately, there are sources you can consult for reliable information about the available official images, as well as information about which operating systems and versions are regularly supported with security updates. The official sources of some popular Linux distributions are as follows:

RHEL: Hard to tell… on Amazon Marketplace, the official Red Hat images are marked as “Sold by Amazon Web Services.”

Whether you are building your own images using a tool like Packer or launching raw images from a cloud marketplace, you should always use official sources at base and skip third-party images that may have a tool or platform prepackaged for you. If you want to prepackage tools or platforms you use on these images, look into leveraging your config management system or the awesome shell scripting skills of your developers to build your own cloud images with the things you want installed.
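As a minimal sketch of what a provenance check could look like, the snippet below splits image records into those published by an account you have verified and those that are not. It assumes records shaped like the output of EC2's DescribeImages call (`ImageId`, `OwnerId`, `Name`). The allowlist entry, the account IDs, and the `audit_images` helper are all hypothetical placeholders, not real publisher accounts.

```python
# Hypothetical allowlist: replace the placeholder with publisher account IDs
# you have verified against each distribution's own documentation.
TRUSTED_OWNERS = {
    "111111111111": "example-distro-publisher",  # placeholder, not a real account
}


def audit_images(images, trusted_owners=TRUSTED_OWNERS):
    """Split image records into (trusted, untrusted) lists by owner account ID.

    Records with no OwnerId at all are treated as untrusted.
    """
    trusted, untrusted = [], []
    for image in images:
        owner = image.get("OwnerId")
        if owner in trusted_owners:
            trusted.append(image)
        else:
            untrusted.append(image)
    return trusted, untrusted


if __name__ == "__main__":
    # Made-up sample records for illustration only.
    sample = [
        {"ImageId": "ami-aaaa1111", "OwnerId": "111111111111", "Name": "official-base"},
        {"ImageId": "ami-bbbb2222", "OwnerId": "999999999999", "Name": "prepackaged-stack"},
    ]
    _, unverified = audit_images(sample)
    for image in unverified:
        print(f"Unverified publisher for {image['ImageId']} ({image['Name']})")
```

A check like this is only as good as the allowlist behind it, so the account IDs should come from the distribution's own documentation rather than from the marketplace listing.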

Support Lifecycles

The official sources of these operating systems also provide information about long-term support and security updates. Support and maintenance windows usually extend for 3 to 10 years, depending on your platform, but not all versions of an operating system are supported for the same length of time. Once a version falls out of support, manually patching security fixes becomes a painful, impractical, and very likely impossible requirement of maintaining the security of your infrastructure.

Traditional IT would regularly say they are picking a platform and, once everything works, locking down the infrastructure, development, and operations as part of a long-term decision to stick to that platform. We regularly hear about organizations running 10+ year old operating systems and technology. Yet these support windows allow long periods of time for operations teams to migrate and stabilize on new versions. The following image is of the Ubuntu lifecycle, but nearly all operating system distributions have these long periods of support overlap, which are still often ignored for the ease of simply maintaining deprecated versions.

Once these organizations move into the cloud, however, the risk evaluation needs to change. This is the impetus behind cloud initiatives like Amazon’s Shared Responsibility Model. The ROI from building once and never upgrading diminishes greatly because the cloud ecosystem changes at a more rapid pace and is fundamentally more connected than a traditional datacenter.

The easiest way to take advantage of the long-term cost savings of the evolving cloud platforms is to make sure you are staying up to date in the environment. And by this I simply mean update your operating systems.

Operating systems nearly always have support timeframes, or end-of-life (EOL), or end-of-support dates, and these timeframes matter dramatically in maintaining the security of your systems because security updates are tied directly to these lifecycles. It is either impossible or cost-prohibitive to go outside these support windows and still maintain a viable security posture. I often say “You cannot use outdated operating systems unless you have an excellent team of OS contributors and kernel maintainers on staff.” This, of course, is an absurd proposition, but it’s absurd to be outside of operating system support in the first place, unless, of course, you hire a skilled team of OS contributors and kernel maintainers, which is also absurd (and even more absurd if you are on Windows and do not work for Microsoft)!
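Because security updates are tied to these dates, it can help to track them explicitly. The following is a rough sketch, assuming you maintain your own table of end-of-support dates taken from each vendor's published lifecycle page. The distribution name, versions, dates, and the `support_status` helper are hypothetical placeholders for illustration only.

```python
from datetime import date

# Hypothetical end-of-support dates; fill these in from each vendor's
# published lifecycle information.
END_OF_SUPPORT = {
    ("exampleos", "6"): date(2019, 1, 1),
    ("exampleos", "7"): date(2024, 1, 1),
}


def support_status(distro, version, today, warn_days=365):
    """Classify a release as 'unsupported', 'expiring', 'supported', or 'unknown'."""
    eol = END_OF_SUPPORT.get((distro, version))
    if eol is None:
        return "unknown"  # no lifecycle data recorded: treat as a finding to resolve
    days_left = (eol - today).days
    if days_left < 0:
        return "unsupported"  # past end of support: no more security updates
    if days_left <= warn_days:
        return "expiring"  # time to plan and test the upgrade
    return "supported"


if __name__ == "__main__":
    for distro, version in END_OF_SUPPORT:
        print(distro, version, support_status(distro, version, date.today()))
```

Running a check like this against your inventory on a schedule turns the end-of-support date into an early warning rather than a surprise.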

In addition to the reduced security posture, there are greater direct and indirect costs associated with not updating your operating systems — not to mention potential missed opportunities that are available in a provided environment (for example, improved networking, lower hardware costs and easier upgrades, better CPU access, etc.).

The indirect costs associated with upgrading to a new version of an operating system are much more significant if there are no practical tools in place for more regular OS updates and upgrades (kernel or distribution). Obviously, regular maintenance to operating systems will reduce long-term costs associated with these upgrades.

Note: Check out the following links for excellent information on OS support lifecycles:

The Pain is Real: A Practical Example and a Debacle

The direct costs of not upgrading are real and almost completely avoidable. For example, CentOS, as a piece of community-maintained software, is not inherently less secure than Red Hat with a support contract. However, it is, by definition, less costly. The difference between paid support with Red Hat support contracts and a supported version of CentOS is negligible. But when the free CentOS is no longer backporting changes to libraries or shipping security updates, it makes sense to move to Red Hat if there is no desire or practice to update to a newer OS version. This is a real monetary cost associated with not having viable system upgrade practices for a free distribution, since subsequent versions of CentOS would likely be getting the same security updates that cost real money to backport on a Red Hat support contract. For example, if you need to support Ruby 1.8.7, both CentOS 6 and RHEL 6 will support security updates for this until 2019, but Red Hat would continue to ship security updates and library support for Ruby 1.8.7 beyond the close of the CentOS support lifecycle in 2019.

Windows XP is probably the greatest example of unnecessary long-term costs associated with not upgrading systems. Microsoft offered extended support contracts because enterprises believed that Windows XP would work in the long term, as if Internet Explorer and the Internet were not progressively diverging. These were contracts that only served to reinforce bad operations practices and non-realities about the effect of software on asset deprecation. Then last year Microsoft doubled the cost of these contracts to $400/machine and the service contract cap to $500,000 annually. These are charges that are simply added to vendor costs as a result of bad practices, nothing more. As such, they are completely avoidable. And even in this scenario it would have been cheaper for organizations to have upgraded to Windows 7 (which also would have improved their security posture in ways that could not have been backported to Windows XP). These same contracts also exist and are executed for Windows Server, but because of the smaller number of deployments, the problem isn’t on the same scale as the XP debacle (although the same misunderstanding of risk management could lead to the same scale of operational failure).
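To put those figures in perspective, here is a rough sketch of the arithmetic. It assumes the cap applies to an organization's total annual bill, which is my reading of the numbers above and not a statement of the actual contract terms. The function name is hypothetical.

```python
PER_MACHINE_USD = 400      # per-machine figure quoted above
ANNUAL_CAP_USD = 500_000   # annual cap quoted above


def annual_extended_support_cost(machines):
    """Estimate the yearly extended-support bill for a fleet of XP machines.

    Assumption: the cap limits the total annual charge per organization.
    """
    return min(machines * PER_MACHINE_USD, ANNUAL_CAP_USD)


if __name__ == "__main__":
    for fleet in (100, 1_000, 5_000):
        print(f"{fleet} machines: ${annual_extended_support_cost(fleet):,} per year")
```

Under that assumption, a fleet of 1,000 machines would pay $400,000 a year, and any fleet of 1,250 machines or more would hit the $500,000 cap, all for an operating system that is no longer improving.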

Security Really Can Be Free

Once in the cloud, there are two simple ways to make sure these costs are a thing of the past:

As easy as it is for IT or Operations to choose a platform they have the best tools for or the most familiarity with (or maybe your developers love an old version of a library), it is important that teams build on a solid, maintainable, and secure foundation. Even though it is easy to install a system in the cloud and then simply forget about it or freeze the version (which may make a “stable” platform for developers), the likelihood of introducing security issues over the long term dramatically increases as support cycles and security upgrades are ignored.

Looking Ahead . . .

In the coming weeks, I will provide information on a practical operations pattern using Threat Stack’s Vulnerability Management to regularly upgrade and test these operating systems over time rather than simply hope that new OS versions and upgrades will be available over the long term.

Apollo Catlin is a Senior Operations Engineer on Threat Stack’s Operations team. With 10 years of experience in quality assurance, automation, and build engineering, Apollo understands the challenges of building, deploying, and maintaining high-quality platforms and environments for large, distributed applications. His experience delivering software in a variety of industries (including financial, consumer electronics, ecommerce, health care) has given him a strong background for helping Threat Stack clients build stable and secure platforms for their applications.