Twistlock’s Daniel Shapira speaks with Docker’s David Lawrence on the Modern Software Supply Chain at DockerCon EU 2017.

Transcript

What is a software supply chain?
I’ll use a metaphor to describe a software supply chain: a restaurant.

A restaurant starts with a chef who comes up with algorithms, in the form of recipes.

Chefs source their information from works of art such as “Mastering the Art of French Cooking”.
The output is a recipe – human-readable code that can be used to form a process, like building a plate of food.

The recipe book is the restaurant’s version control system. Because it is the first tangible thing produced, it is the first component in our supply chain.

The output is then passed off to our Continuous Integration (CI) service – the line cooks building the plate of food over and over again.

If CI is working properly, given the same recipe and the same dependencies (ingredients), it will produce identical plates of food every time.
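The "same recipe, same ingredients, same plate" property can be sketched as a digest comparison. This is a minimal illustration, not how Docker or any particular CI service builds artifacts: `build_artifact`, `artifact_digest`, and the sample recipe and version pins are all hypothetical names invented for the example.

```python
import hashlib
import json


def build_artifact(recipe: str, dependencies: dict[str, str]) -> bytes:
    """Toy 'build': deterministically combine the recipe and its pinned dependencies."""
    manifest = {"recipe": recipe, "dependencies": dependencies}
    # sort_keys ensures dictionary ordering cannot change the output bytes.
    return json.dumps(manifest, sort_keys=True).encode("utf-8")


def artifact_digest(artifact: bytes) -> str:
    """Content digest used to compare two build outputs."""
    return "sha256:" + hashlib.sha256(artifact).hexdigest()


recipe = "boil water; add pasta; cook 9 minutes"
pins = {"pasta": "1.2.0", "salt": "3.0.1"}
same_pins_reordered = {"salt": "3.0.1", "pasta": "1.2.0"}
different_pins = {"pasta": "1.3.0", "salt": "3.0.1"}

first = artifact_digest(build_artifact(recipe, pins))
second = artifact_digest(build_artifact(recipe, same_pins_reordered))
changed = artifact_digest(build_artifact(recipe, different_pins))

print(first == second)   # identical inputs give identical digests
print(first == changed)  # a different ingredient version changes the digest
```

Comparing digests across repeated builds is one way to notice when a build has stopped being repeatable, for example because a dependency was not pinned.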

Having created the plates, the line cooks put them into the distribution center (the pass), where the plates sit until the final piece of the supply chain (the server) pulls them and deploys them to the user.

The 4 steps/components of the supply chain
1. Version control system – where developers store code or where the chef stores recipes
2. Continuous Integration service – takes the code or recipe and builds out the applications over and over again
3. Distribution center – where the built artifacts are pushed; for Docker this is a Docker registry, whether it’s Docker Hub (open source registry) or the registry you use as a Docker Enterprise user
4. Servers – take the artifacts out of our distribution center and deploy them where relevant

Why do we care about software supply chain security?
It’s easy to focus on the applications that are running and on securing those – we talk a lot about secrets and encrypted overlay networks. However, it is also critical to secure the entire process that led to that application being created. Imagine in our restaurant: if our food is contaminated at build time, and our line cooks are taking it out and using it for the plates, that contamination is carried through the rest of the supply chain. So, the earlier in the supply chain something gets affected, the more pieces of your supply chain can potentially be compromised.

Looking at the attacks that have happened on software supply chains in the last decade or so, we see not only that they’re common, but that they’re actually increasing in frequency. They used to happen once or twice a year, but in 2017 there had already been 6, and one or two more since the slide was put together.

Software supply chain threat model
One of the ways to build a threat model is to actually break down your application into 3 sets of components.

The first is the entry points: how can someone actually interact with the components within your system? This matters because it is how they will actually affect operations.

For our Version Control System, these are the inputs that get into our code and the API for whatever our Version Control System is.

For our CI service, these are build-time dependencies. This includes things like base images, but also the software packages you’re going to be installing from Linux distributions and the APIs that enable you to trigger builds within your version control systems. Build-time dependencies are important because if one of them is compromised, you’ve already introduced vulnerabilities into your supply chain and applications early in the process.
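One common way to guard this entry point is to pin each build-time dependency to a known digest and refuse to build if the fetched bytes differ. The sketch below assumes that approach; the talk does not prescribe it here, and `verify_dependencies` and the dependency names and contents are hypothetical.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def verify_dependencies(fetched: dict[str, bytes], pinned: dict[str, str]) -> list[str]:
    """Return the names of dependencies that are unpinned or do not match their pin."""
    failures = []
    for name, content in fetched.items():
        expected = pinned.get(name)
        if expected is None or sha256_hex(content) != expected:
            failures.append(name)
    return sorted(failures)


# Hypothetical digests recorded when the dependencies were last reviewed.
pinned = {
    "base-image": sha256_hex(b"trusted base image layers"),
    "libexample": sha256_hex(b"trusted package contents"),
}

clean_fetch = {
    "base-image": b"trusted base image layers",
    "libexample": b"trusted package contents",
}
tampered_fetch = {
    "base-image": b"trusted base image layers",
    "libexample": b"package contents with a backdoor",
    "surprise-dep": b"something nobody reviewed",
}

print(verify_dependencies(clean_fetch, pinned))     # nothing to report
print(verify_dependencies(tampered_fetch, pinned))  # modified and unpinned names
```

A build script could call a check like this before any dependency is used and abort when the returned list is not empty.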

For the distribution center, the only way to interact with it is a simple API (whether it’s in the restaurant or a Docker registry). There are only two operations you can perform against it: you can put something into it or you can take something out of it. That’s it.
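The two-operation interface can be sketched as a tiny content-addressed store. This is an in-memory illustration only and not the Docker registry API; `ToyRegistry`, `push`, and `pull` are hypothetical names, and the digest check on pull is an added assumption about how one might detect tampering in storage.

```python
import hashlib


class ToyRegistry:
    """In-memory stand-in for a distribution center with only two operations."""

    def __init__(self) -> None:
        self._blobs: dict[str, bytes] = {}

    def push(self, content: bytes) -> str:
        """Put something in; the returned digest is its address."""
        digest = "sha256:" + hashlib.sha256(content).hexdigest()
        self._blobs[digest] = content
        return digest

    def pull(self, digest: str) -> bytes:
        """Take something out, re-checking that it still matches its digest."""
        content = self._blobs[digest]
        actual = "sha256:" + hashlib.sha256(content).hexdigest()
        if actual != digest:
            raise ValueError(f"content for {digest} was modified in storage")
        return content


registry = ToyRegistry()
plate_digest = registry.push(b"plate of pasta")
pulled = registry.pull(plate_digest)
print(pulled == b"plate of pasta")
```

Because the address is derived from the content, anyone pulling by digest can tell whether what came out is what went in.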

Finally, we have servers. If you are deploying these properly, you shouldn’t be allowing users to SSH in, and shouldn’t even have SSH set up on them. You should be deploying using something like LinuxKit to give you a minimal Linux distribution, and you should be deploying Docker on top of it, with Docker being the only mechanism for interacting with your servers.

Next is assets – the tangible things you want to protect within your system.

For our version control system, this is largely proprietary code.

For our CI service, if you’re utilizing it correctly you have infrastructure as code, so the actual build instructions for the CI service are checked into your version control system, making CI a relatively stateless process. However, you probably have some secrets in there: secrets that allow the CI service to access your proprietary code from your version control system and to push the built artifacts into your distribution center.

Speaking of the distribution center, in your Docker registry the images are the thing you want to protect. They are really the only thing it intentionally stores.

On your application servers we have a much broader scope of assets: users’ personally identifiable information, and secrets that allow your server to access other services within your system. Secrets are particularly important to protect because a lot of attacks, once they get a foothold, will move laterally within your infrastructure, which means moving to another service using information they obtained, or pulling permissions once they get past your initial firewall. Secrets, if not properly protected, can enable an attacker to compromise more of your system and then gain access to compute. A lot of attacks these days are simply looking for compute resources on which they can mine bitcoin or another cryptocurrency.

The last part of the threat model input is the data flow: how does information really move through the system? This is where we really see that a compromise early in our system affects all the subsequent pieces. There’s no way to remove a compromise once it’s been introduced and moved through deployment. We’ll look at some tools later that will help with that.
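The claim that an early compromise affects every subsequent piece can be illustrated with a small sketch. It models the supply chain as a strictly linear pipeline, which is a simplification, and `STAGES` and `affected_stages` are hypothetical names chosen for the example.

```python
# The four components of the supply chain, in data-flow order.
STAGES = ["version control", "continuous integration", "distribution center", "servers"]


def affected_stages(compromised: str) -> list[str]:
    """Return the compromised stage and everything downstream of it."""
    start = STAGES.index(compromised)
    return STAGES[start:]


for stage in STAGES:
    reach = affected_stages(stage)
    print(f"{stage}: {len(reach)} stage(s) affected")
```

A compromise introduced in version control reaches all four stages, while one introduced on the servers reaches only the servers, which is why the earlier stages deserve attention.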

Which component is the #1 concern today?
We’re all busy people. We have to focus our efforts on the thing that’s going to have the most impact. So, what is the #1?

Bit rot: software that has been introduced into your distribution center with vulnerabilities you haven’t discovered yet. It sits there and bit-rots, and if it sits there long enough, the vulnerabilities become exploitable.
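One rough way to spot candidates for bit rot is to flag artifacts that have sat in the distribution center past some age. The sketch below assumes that age is a useful proxy, which the talk does not state; the image names, push dates, and 90-day threshold are all hypothetical, and a real check would also rescan images for newly published vulnerabilities.

```python
from datetime import datetime, timedelta, timezone


def stale_images(pushed_at: dict[str, datetime], now: datetime, max_age: timedelta) -> list[str]:
    """Return the names of images that were pushed longer ago than max_age."""
    return sorted(name for name, pushed in pushed_at.items() if now - pushed > max_age)


# Hypothetical push timestamps for images sitting in a registry.
pushed_at = {
    "web:1.4": datetime(2017, 10, 1, tzinfo=timezone.utc),
    "worker:0.9": datetime(2017, 3, 15, tzinfo=timezone.utc),
    "legacy-api:2.0": datetime(2016, 6, 30, tzinfo=timezone.utc),
}

now = datetime(2017, 10, 17, tzinfo=timezone.utc)
stale = stale_images(pushed_at, now, max_age=timedelta(days=90))
print(stale)
```

Images on the resulting list are candidates for a rebuild on a current base image or for removal from the registry.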

So what’s the most vulnerable? Most attacks go after the distribution center. They’re not specifically focused on bit rot, but they do go after the part of the system that tends to be more front line, which is probably one of the reasons it gets attacked more: it’s more exposed.