Containers: Security and Speed

So I'm Jake, I work for CoreOS, I'm a product manager, and I run the Quay.io team here in New York. I always forget to say this at the end of my talks, so: we're hiring, if you're interested. We're looking for back-end and front-end people, so come find me at our booth. I'm here to talk a little bit about containers and security and speed.

I don't really have enough time to go into depth on either of those topics, so "gotta go fast" is not only the subtitle of my talk but also the presentation and talking speed. In case you're unfamiliar with Quay, this is a screenshot that I took yesterday of my Quay dashboard. Quay is a container image registry. You can think of it by analogy: GitHub stores your source code, and Quay stores the container images, which are the immutable binary artifacts that come out of your source code and that you actually end up deploying to your infrastructure.

We're a container image registry ++, and I put the ++ on there because we're like Docker Hub plus a whole slew of features that we've layered on top that are really important for enterprises and corporations to be able to run their infrastructure, and to have trust in what they've built and delivered. I'm going to touch on a few of our cooler features. For example, Time Machine is a feature where you can roll back a push that was unintended or that had negative consequences. I'm also going to talk about flattened images, that's the speed in the speed and security of my talk, and I'm going to talk about security and vulnerability scanning, which is obviously the security in the speed and security talk.

We also have an on-premise version of Quay, called Quay Enterprise. Some of you may have pulled down the open source Docker registry and may be running it in your infrastructure now; you can think of Quay Enterprise as that plus all of the things from the previous slide, including a comprehensive UI. We also have enterprise auth connectors such as LDAP and OAuth2, and we have a JWT plug-in framework where you can connect to your own wacky auth service that isn't LDAP, if you're running one of those.

We also have the capability to automatically replicate your images around the world. That's really important for people who are running multiple data centers, for example. We're also going to start shipping the security scanning tool, which I'll talk about, on premises. That's something that our competitors plan to never ship, and you'll see why that's so important in just a minute. In addition to being a place to store all of your images, though, we're also kind of a place where you get to centralize all of your problems. So one of the problems I'll talk about is efficient distribution. The average size of a Docker container image is in the hundreds of megabytes range, right? That's obviously not ideal, and people are doing great work to try and slim those down with smaller base images, or building from scratch, but that's the average that we see across the images stored on Quay today.

So the problem becomes when I need to deploy, here, I'm showing an example deploy where I'm deploying to 16 hypothetical machines. So the problem is how do I efficiently get this several hundred megabyte image down to these machines? So the registry protocol is just HTTP or HTTPS, and if anybody here has dealt with sending several large binaries to several machines, it can actually be kind of a tricky problem.
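As a rough back-of-the-envelope sketch of what the registry has to serve here (the image size and fleet size are the illustrative numbers from the example, not measurements):

```python
# Toy estimate of the data a registry serves for one deploy.
image_size_mb = 300   # "several hundred megabytes", per the average above
machines = 16         # the hypothetical fleet from the example deploy

total_mb = image_size_mb * machines
print(total_mb, "MB pushed over HTTP(S) for a single rollout")
```

Multiply that by frequent deploys across many services and the distribution problem compounds quickly.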

So we actually have to send this layer, this image data down to each one of the machines and because Docker is composed of layers, we have to actually send several large binary blobs per machine. Of course this isn't the whole picture, so we're horizontally scaled so this is a little bit more what it looks like but we've actually done some work to improve the performance even beyond the standard Docker registry protocol.

So just a quick primer on layers in case anybody is unfamiliar. A Docker container image is composed of independent layers that get built up as you run additional commands, and it may be surprising to people to find out that each of these layers is transmitted independently during a deploy or distribution phase, and that the image layers are append-only. So I have an example, just an example Dockerfile which builds up a container image: I'm starting from a base Ubuntu image, which is extremely common, then I'm running a script which adds our build system, then I run a script which actually builds our application, so this is generating the binary that I actually care about that I want to get out onto my machines, and then I run another script which removes the build system.

Obviously this isn't what our actual build looks like, but it's very similar in spirit to what we do. So the somewhat surprising thing, to people who are unfamiliar with the way Docker actually works and the way these images get composed, is that the last layer, where I remove the build system, actually does nothing to alleviate the burden of distributing that binary data to all of these machines.

So the build system is still delivered in layer two, and layer four just comes along and says, oh, by the way, don't worry about what layer two did, because it's not really important. So if any of you are familiar with the registry V2 protocol, which started shipping with Docker 1.5, the intention was to fix this distribution problem, and the way that they intended to fix it was to pull all of the layers in parallel.
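To make the append-only behavior concrete, here's a minimal sketch (layer sizes are invented for illustration): every layer's added data crosses the wire, so a later layer that deletes files saves nothing on transfer.

```python
# Toy model of append-only image layers: each layer records data added;
# a "delete" only masks files from earlier layers, it never removes
# their bytes from the transfer. Sizes are illustrative, not real.
layers = [
    {"step": "FROM ubuntu",         "added_mb": 180, "deletes": []},
    {"step": "add build system",    "added_mb": 250, "deletes": []},
    {"step": "build application",   "added_mb": 40,  "deletes": []},
    {"step": "remove build system", "added_mb": 0,   "deletes": ["build system"]},
]

# Every layer is shipped independently, so the delete layer saves nothing:
transfer_mb = sum(layer["added_mb"] for layer in layers)
print(transfer_mb)  # 470 -- the 250 MB build system still crosses the wire
```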

This is a great idea if you actually end up being network bound, and if serial layer pulls are actually your limiting factor. So I decided to actually test: how effective was this? This test came from doing a test deploy to 50 EC2 m3.medium machines. I did a V1 pull on one fleet and then a V2 pull on a completely separate fleet, so there's no interplay and no disk space problems happening here, and as you can see, the median pull time actually only changed by about a second or so.

This was actually so surprising to me that I went back and I hand checked all of the machines to make sure that the V2 protocol was active when it should have been and the V1 protocol was active when it should have been, and the data just didn't lie. Now I will say that these are EC2 machines and the disk is stored on the network, so these could actually just be disk bound or network disk bound as it were.
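One way to see why parallel pulls might not help much: if unpacking layers to disk dominates, parallelizing the downloads only shaves the network portion of the pull. A toy model, with invented per-layer times and the simplifying assumption that unpacking stays sequential:

```python
# Toy model: a pull downloads layers (network) and unpacks them (disk).
# The per-layer times below are invented to illustrate a disk-bound host.
def pull_seconds(layer_count, net_per_layer, disk_per_layer, parallel_net):
    net = net_per_layer if parallel_net else net_per_layer * layer_count
    disk = disk_per_layer * layer_count  # unpacking assumed sequential
    return net + disk

v1 = pull_seconds(10, net_per_layer=1, disk_per_layer=5, parallel_net=False)
v2 = pull_seconds(10, net_per_layer=1, disk_per_layer=5, parallel_net=True)
print(v1, v2)  # 60 vs 51: disk time dominates, so V2's win stays small
```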

So that's something that I'll get back to after I talk a bit about the feature that we've implemented to alleviate this. So one idea we had way back, and I mean way back, this shipped in one of the first versions of Quay, was to actually flatten the images. What that means is we would take a look at all of the different layers that compose your container image, and we would look at what are the things that actually need to be in the final product in order for it to work properly. So when we ran that fourth step which removed our build system, we actually decided: look, they added the build system here and they removed it later on, so let's just not ship the build system in the final container image.
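The flattening idea can be sketched as replaying the layers in order and keeping only what survives into the final filesystem (file names and sizes here are invented for illustration):

```python
# Sketch of image flattening: replay layers in order, apply deletions,
# and ship only the files present in the final filesystem.
# Each entry maps a path to its size in MB; all values are illustrative.
layers = [
    ({"ubuntu-base": 180}, []),
    ({"build-system": 250}, []),
    ({"app-binary": 40}, []),
    ({}, ["build-system"]),          # the cleanup step
]

final_fs = {}
for added, deleted in layers:
    final_fs.update(added)
    for path in deleted:
        final_fs.pop(path, None)     # the deletion now actually saves bytes

print(sorted(final_fs))              # ['app-binary', 'ubuntu-base']
print(sum(final_fs.values()))        # 220 MB shipped instead of 470
```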

So we do this today for our Quay deploys. Quay is self-hosted, we run Quay off of Quay, obviously. So our 284.1 MB image, that's the compressed size that we send over the wire, actually goes down to 224.5 MB when it's flattened. That's a savings of around 20%, so we would expect to see around 20% savings if we were network bound, or network disk bound, when it came time to do a deploy. So I spun up yet another stack of 50 machines and pulled a flattened image instead of using either the V1 or V2 registry protocols, and the data was astounding: we actually saved more than 50% by compressing all of these layers down into a single layer.
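Checking the quoted figure: the two compressed sizes work out to roughly a 21% smaller payload, consistent with the "around 20%" expectation.

```python
# Verifying the savings from the compressed sizes quoted in the talk.
original_mb = 284.1    # compressed image size over the wire, unflattened
flattened_mb = 224.5   # compressed size after flattening

savings_pct = (1 - flattened_mb / original_mb) * 100
print(round(savings_pct, 1))  # 21.0 -- roughly the 20% quoted
```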

I'll leave it as an exercise for the reader, why it's actually better than the 20% savings that we get. But some places to consider looking for those performance benefits are the fact that we're only doing one large HTTP request and if people are familiar with TCP and HTTP, there's actually a huge benefit to doing one large request versus 20 smaller requests and then assembling all of that stuff on disk.

And so the final feature that I'll touch on is the way we do security flaw management, or as I like to call it, vulnerability whack-a-mole. I'm sure you guys recognize some of these cutesy icons that we've developed for the latest horrible attacks that affect all of our infrastructure. I don't know if making a cutesy icon for it makes the attack more or less scary.

It should probably be more, but this is the way we used to consider the split between the operating system and Ops people in our companies and the actual developers. As an example, if I'm a developer and I'm writing an app that depends on a specific version of Java, I would traditionally tell the Ops team: hey, I need Java version 7 on all of the machines that my application is going to live on, and, oh by the way, can you make sure that those machines are running all of the latest security patches that come from the vendor? The vendor in this case, the operating system vendor, would be Canonical for Ubuntu or Red Hat for RHEL, and then the Ops team would actually go through and make sure that all those machines were updated and that they were running non-vulnerable versions of all of the base software that my application was built on.

But containers changed all of that. Once we started building things from these common base images, and we started bundling all of our dependencies into these immutable container images, and the Linux kernel became our only shared piece of infrastructure, we actually found that the vulnerabilities were being pushed down into the container images, where traditionally they would be centrally managed by an Ops team or by our OS vendor.

And this might not seem like a problem at first, but when you're running hundreds of microservices, and each one of those is composed of hundreds of packages, you have to know which package, contributing to which service, is actually susceptible to a vulnerability. So in this case, package A, which presumably is OpenSSL, is vulnerable to Heartbleed, and packages B and C, or B and D, are vulnerable to their own set of CVEs, their own set of exploits. So we, the security-minded folks at CoreOS, decided: what can we do to help developers make sure that their packages and their container images stay up to date?

So we developed an automated system to detect vulnerabilities in container images, so what this means is every time you push a new image to Quay.io, we check it against all of the known vulnerabilities that we're tracking from all of the OS vendors that contribute these base images.

Additionally, we index all of the packages that these container images are composed of, and later on, when new vulnerabilities are discovered, we're able to retroactively go back and find the container images which are vulnerable to the newly discovered issues. And then of course we have a system to notify the developers when these things happen, right, because if we know it, it does nobody any good unless we share that information. And we do all of this without running anything, and without re-scanning anything, when new vulnerabilities are discovered.
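The retroactive lookup can be sketched as an inverted index from package versions to the images that contain them: when a new advisory names a package, the affected images fall out of a lookup with no re-scan. This is a simplified sketch of the idea, not Clair's actual data model, and the package and image names are invented:

```python
# Hedged sketch of package indexing for retroactive vulnerability checks.
# All image names, package versions, and the advisory are illustrative.
image_packages = {
    "web-frontend:v3": {"openssl-1.0.1f", "glibc-2.17"},
    "api-server:v9":   {"openssl-1.0.2h", "glibc-2.17"},
}

# Inverted index built once at push time: package version -> images.
package_index = {}
for image, pkgs in image_packages.items():
    for pkg in pkgs:
        package_index.setdefault(pkg, set()).add(image)

def affected_images(vulnerable_packages):
    """Find images hit by a newly published advisory, with no re-scan."""
    hit = set()
    for pkg in vulnerable_packages:
        hit |= package_index.get(pkg, set())
    return hit

# A new advisory names openssl-1.0.1f:
print(sorted(affected_images({"openssl-1.0.1f"})))  # ['web-frontend:v3']
```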

It's an open source project: github.com/coreos/clair. We have a variety of partners who are integrating Clair into their own offerings. We think this is great for the community in general, because as the software gets better, the total number of vulnerabilities goes down, and really, at the end of the day, that's CoreOS's core mission: to secure the internet.

So we're happy if everybody integrates this into their stack. This is just an example: all of the source control is on the left, that's where you might host your code; your code gets pushed or built by Quay; Quay sends the binary images off to our security vulnerability scanner, Clair; Clair reports back whether things are vulnerable or not; and then we send off notifications using the notification system built into Quay, for which we have a variety of different notification techniques: Slack being the common one in our company, email, and webhooks if you need to build your own shim to your own notification service.

And we have self hosted notifications in Quay as well. This is just an example of a notification that we got in the Slack channel, when we posted a known vulnerable image into Quay, and this is just a test but this is exactly what we would have gotten if one of our production images was found to be vulnerable.

And you may think, is this actually really a problem, how often does this actually happen? Luckily, since we've indexed all of the images on Quay, we can actually answer that question. So we ran an analysis across the millions of images that we're hosting on Quay. The first thing we looked for was GHOST, which is a gethostbyname vulnerability, CVE-2015-0235.

This has been out for over a year, so we wanted to find out how many images are still susceptible to GHOST. Oops: 66.6% of all the images that we're still hosting on Quay today are susceptible to GHOST. How about Heartbleed? It's been almost two years since Heartbleed; it can't be that common, people must have patched it by now. Oops.

So 80% of all the images on Quay are still vulnerable to Heartbleed. Now, this may be more shocking than the actual reality: a lot of these images may not actually be currently deployed in production, but some of them certainly are. In the future we want to actually build graphs to track this level of total vulnerability across all of the images that we index and manage.

But we haven't built that capability yet. So finally, I hope that I've conveyed a little bit about why security and speed are so critical to an efficient, stable container infrastructure, and why you should think of us at Quay as your partners for managing your container images. We're definitely trying to push the envelope on distribution speed.

We've got some new features that we're hoping to ship before the end of the month; if anybody wants to know what those are, you can come and talk to us at our booth outside. Additionally, we're helping to make sure that your software stays safe over time, so that even when you've developed a service and it's been running in production for months, it still stays safe and not vulnerable. And like everything else that CoreOS offers, this is available on-prem, in your own data centers, deployed using containers.

So thanks, that's all I had. If you have any questions for us today, you can come visit us at our booth out back, and if you think of any questions tomorrow, we're reachable in a variety of ways: you can send us an email, you can find us on freenode, or you can send us a tweet. That's all the time I had, thanks.