This is the first in a series of short, ‘pragmatic’ cloud security posts. This time we’ll be covering AWS S3 Buckets.

Intro

AWS S3 is a very popular cloud-based file / object storage service. Operating since March 2006, this service (along with AWS EC2) is a key contributor to the public cloud computing revolution. However, the number of supported use cases and years of product evolution have made it challenging to understand, configure and properly assess data exposure in S3 buckets.

In this post, we’ll cover how to protect S3 buckets from human error that would allow our precious files to be read, modified or deleted by anyone on the internet. The topic is timely in light of recent news about data exposed through misconfigured S3 buckets.

The focus here is on access and exposure rather than data encryption. Encryption is an important data privacy control that we will address in a future post.

Background: The S3 Configuration Plane

S3 buckets have multiple configuration options. It is important to know which of them are relevant to our threat model and which aren’t.

Here are some configuration properties that, while important, are not relevant to the threat model we are discussing here:

Website – S3 can host a (public) static site. While it looks relevant, this configuration does not control the actual exposure of our data.

Versioning – might be relevant from a data protection perspective but not from an exposure perspective

CORS – a security setting/concept relevant to browser access of static websites hosted from S3, not to data exposure.

Logging – important from a compliance perspective and could assist with the detection of potential rogue access patterns, but this will not prevent our data from being exposed.

Encryption – the basic encryption promise is that your files will be stored encrypted on AWS storage (which is a good practice). However, unless you are using client-side encryption (or server-side encryption with customer-managed keys), this data could still be read by any S3 user if proper access controls are not in place. While encryption is an important part of the S3 security model, it is out of scope for this post.

There are two main security policy mechanisms that are relevant to access control: Access Control Lists (ACL) and IAM Policies.

Access Control Lists (ACLs)

This original access configuration method was designed in the era where XML was widely used, but is now regarded as the ‘legacy’ approach. Still, there are use cases that can only be addressed through ACLs – namely:

ACLs can be applied to buckets and to the internal individual objects (files)

ACLs can grant permissions to users in other accounts.

IAM Policies

IAM policies use the newer JSON-based AWS policy language. They can be applied to S3 buckets (as a resource policy) or to IAM entities (users, groups, roles).

At this point you may be asking yourself when to use each technology. The main dilemma here is whether to use resource-based policies (policies that are attached to buckets), or rather to assign access policies to IAM entities: users, groups and roles.

Personally I prefer IAM entity policies since requests like these are routinely encountered:

“Hey, what data could be accessed / altered by this employee / team?”

“Make sure that the 3rd party user can only access the X/Y/Z files (logs / financial / security data)”

Still, there are valid use cases for resource based policies in strict control / governance scenarios where we wish to assert a strong ‘prevention’ guarantee without it being tied to a user. For example:

“Allow this bucket to only be accessed from that VPC / subnet / S3 endpoint”
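As a rough illustration of such a resource-based policy, the sketch below builds a bucket policy that denies every request that does not arrive through a given VPC endpoint. The bucket name and endpoint ID are hypothetical placeholders, the function name is made up for this example, and `aws:SourceVpce` is the condition key AWS documents for matching a VPC endpoint. Treat it as a starting point rather than a vetted policy.

```python
import json


def vpc_endpoint_only_policy(bucket: str, vpce_id: str) -> dict:
    """Build a bucket policy that denies any request not made via vpce_id."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyUnlessFromVpcEndpoint",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                # Cover both the bucket itself and every object in it.
                "Resource": [
                    f"arn:aws:s3:::{bucket}",
                    f"arn:aws:s3:::{bucket}/*",
                ],
                # The deny applies whenever the request's endpoint differs.
                "Condition": {"StringNotEquals": {"aws:SourceVpce": vpce_id}},
            }
        ],
    }


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    policy = vpc_endpoint_only_policy("example-bucket", "vpce-0123456789abcdef0")
    print(json.dumps(policy, indent=2))
```

A deny like this also applies to administrators working from outside the VPC, so it is worth trying on a non-critical bucket first.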

Another (confusing) detail is that it is possible to own a bucket where you do *not* own / control the objects and their policies (remember, objects can only have ACLs). You might not even have the permissions to read the objects’ policies in your own buckets! (Do I hear a collective sigh? I thought so.)

This means that in order to verify that objects are not exposed we can:

Review each object in each bucket, hoping that we have permission from the object owner. We must do this continuously to account for new objects. This makes little sense in my opinion.

Make sure to properly lock / define the access patterns in the *bucket* policy to prevent its objects from being more permissive than the bucket’s original intent.

With this background let’s jump into the steps to manage access protection for S3 buckets.

Getting Started with S3 Access Protection

1. Assess – Understand your existing state

Before we start, let’s quickly review our S3 exposure.

As we are mostly focused on preventing random people from accessing our data, let’s pinpoint these misconfigurations.

You can write an app or script in any language to test for this; I’ll be using Dome9 Compliance Engine’s GSL (Governance Specification Language), as it is purpose-built for such tasks. You might author a rule that looks like this:
S3Bucket should not have (acl.grants contain [uri like 'http://acs.amazonaws.com/groups/global/%'] or policy.Statement contain [Effect='Allow' and (Principal='*' or Principal.AWS='*')])

Here we are testing that the bucket’s ACL config does not grant anything to a global AWS group (like ‘all users’), while also testing the bucket’s policy for any ‘Allow’ statement whose principal is a wildcard (Principal='*' or Principal.AWS='*'). A wildcard principal covers anonymous requests as well as users logged in to arbitrary AWS accounts, and the latter is something that many AWS users are not aware of.
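For readers without a rule engine, the same check can be approximated in a few lines of Python. This is a minimal sketch, not Dome9's implementation: it assumes the ACL is a dict shaped like the response of boto3's `get_bucket_acl` and the policy is the parsed JSON bucket policy, and all function names are invented for the example. Like the rule above, it ignores any `Condition` that might narrow an Allow statement.

```python
GLOBAL_GROUP_PREFIX = "http://acs.amazonaws.com/groups/global/"


def acl_has_global_grant(acl: dict) -> bool:
    """True if any ACL grant targets a global group such as AllUsers."""
    for grant in acl.get("Grants", []):
        uri = grant.get("Grantee", {}).get("URI", "")
        if uri.startswith(GLOBAL_GROUP_PREFIX):
            return True
    return False


def is_wildcard_principal(principal) -> bool:
    """True for Principal '*' or {'AWS': '*'} (string or list form)."""
    if principal == "*":
        return True
    if isinstance(principal, dict):
        aws = principal.get("AWS", [])
        if isinstance(aws, str):
            aws = [aws]
        return "*" in aws
    return False


def policy_allows_everyone(policy: dict) -> bool:
    """True if any Allow statement names a wildcard principal."""
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may be a bare object
        statements = [statements]
    return any(
        s.get("Effect") == "Allow" and is_wildcard_principal(s.get("Principal"))
        for s in statements
    )


def bucket_is_exposed(acl: dict, policy: dict) -> bool:
    """Mirror of the rule: flag global ACL grants or wildcard Allow statements."""
    return acl_has_global_grant(acl) or policy_allows_everyone(policy)
```

In practice the two inputs would come from whatever inventory or SDK calls you already use; passing an empty dict for a bucket without a policy works with this sketch.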

2. Remediate – fix issues and set proper policies in place

High level strategy:

Maintain positive, logical permission in IAM policies that are attached to IAM entities while preventing, at the bucket (resource) level, any unauthorized access to objects.

Let’s get even more practical with some concrete tasks:

2.1 Remove all ACL based ‘allow’ permissions

Do it for each bucket (use your favorite automation for this).

Next, verify that our system is properly configured:
S3Bucket should not have acl.grants

Interesting detail: removing all ACLs also removes access from the root user (which you should not use anyway). Still, IAM users with relevant permissions will be fine.

If you want to keep the root user’s permissions in place you can, instead, have something like:
S3Bucket should not have acl.grants contain [displayName != 'MY_ROOT_USER_NAME']
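A comparable check can be sketched in Python. Note the substitution: the rule above matches on the grantee's display name, whereas this sketch compares canonical IDs against the bucket owner's ID, which is an assumption of mine rather than something the rule does. The input is assumed to be a dict shaped like the response of boto3's `get_bucket_acl`, and the function name is hypothetical.

```python
def non_owner_grants(acl: dict) -> list:
    """Return every ACL grant whose grantee is not the bucket owner."""
    owner_id = acl.get("Owner", {}).get("ID")
    leftovers = []
    for grant in acl.get("Grants", []):
        grantee_id = grant.get("Grantee", {}).get("ID")
        # Group grantees carry a URI instead of an ID, so they never match
        # the owner. If the owner is unknown, treat every grant as suspect.
        if owner_id is None or grantee_id != owner_id:
            leftovers.append(grant)
    return leftovers
```

An empty result means the bucket ACL grants access to the owner only.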

This is the phase where you’ll think and design which group of users should get access to which buckets. This should also cover machines’ roles.

To assist, AWS provides S3 bucket access logging and CloudTrail S3 object activity tracking. These support the effort of understanding who needs what access by looking at previous access patterns.

It is good to know that in many cases you do not need to provide people with permanent access to objects. In these cases you might prefer to create a pre-signed S3 request (mostly via some internal system) that will generate a link which allows temporary access to objects without any need for AWS identity / permissions on the consumer side.

Note that the bucket-level ‘firewall’ policy (a Deny for requests that carry no AWS user identity) protects against anonymous user access, but it does not prevent access by a user logged in to a random AWS account. I’m still looking for AWS IAM policy ninjas to assist with differentiating between our legitimate users and random AWS users.

To test this take an arbitrary file (object) and create some havoc by opening it for all users to read. Then try to access it as an anonymous user.

Protip: if you are using the AWS CLI you could use the --no-sign-request flag.

This ‘firewall’ policy should prevent this.
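The anonymous test can also be scripted. The sketch below sends an unsigned HTTPS GET using only the Python standard library and reports the status code. It assumes the virtual-hosted-style URL form `https://<bucket>.s3.amazonaws.com/<key>`, the bucket and key in the usage line are hypothetical, and both function names are made up for this example.

```python
import urllib.error
import urllib.parse
import urllib.request


def object_url(bucket: str, key: str) -> str:
    """Virtual-hosted-style URL for an object, with the key percent-encoded."""
    return f"https://{bucket}.s3.amazonaws.com/{urllib.parse.quote(key)}"


def anonymous_status(bucket: str, key: str, timeout: float = 10.0) -> int:
    """Send an unsigned GET for the object and return the HTTP status code."""
    try:
        with urllib.request.urlopen(object_url(bucket, key), timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # Error responses (such as 403) are raised as exceptions; report the code.
        return err.code


if __name__ == "__main__":
    # Hypothetical bucket and key, for illustration only.
    print(anonymous_status("example-bucket", "reports/test-object.txt"))
```

A 403 typically means the request was refused, while a 200 means the object is readable by anyone.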

For our software defined validation let’s use the following rule:
S3Bucket should have policy.Statement contain [Effect='Deny' and Principal = '*' and Action='s3:*' and Condition.Null.aws:userid = 'true']
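Read back into policy form, the rule above implies a Deny statement along the following lines. This is a reconstruction from the rule's fields, not a policy taken from AWS or Dome9: the `Sid`, the `Resource` list and both function names are my own assumptions. The second function is a plain-Python approximation of the validation rule, and it compares field values literally, which is stricter than IAM itself.

```python
def deny_anonymous_statement(bucket: str) -> dict:
    """Deny every S3 action when the request carries no aws:userid."""
    return {
        "Sid": "DenyAnonymousAccess",  # assumed label, not part of the rule
        "Effect": "Deny",
        "Principal": "*",
        "Action": "s3:*",
        # Assumed scope: the bucket and all of its objects.
        "Resource": [f"arn:aws:s3:::{bucket}", f"arn:aws:s3:::{bucket}/*"],
        # Null/true matches requests where the key is absent (unauthenticated).
        "Condition": {"Null": {"aws:userid": "true"}},
    }


def has_anonymous_deny(policy: dict) -> bool:
    """True if the policy contains a statement matching the rule's fields."""
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may be a bare object
        statements = [statements]
    for s in statements:
        null_cond = s.get("Condition", {}).get("Null", {})
        if (
            s.get("Effect") == "Deny"
            and s.get("Principal") == "*"
            and s.get("Action") == "s3:*"
            and str(null_cond.get("aws:userid", "")).lower() == "true"
        ):
            return True
    return False
```

Because an explicit Deny overrides any Allow, a statement like this at the bucket level also covers objects whose ACLs were opened by mistake.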

Simple, right? What about buckets that should be public?

It is probable that some buckets are intended to be opened for public read. I have 2 recommendations:

Have a naming convention that will allow *humans* to immediately see this bucket and know it is public.

Use S3 tags to denote a well-defined *machine enforceable* convention. Example: public-bucket:"true". You can add additional tags for governance like ‘approved-by’ and ‘approval-time’.

Now it is trivial to exclude these buckets from such automated tests:
S3Bucket where tags contain-none [key='public-bucket' and value='true'] should ...
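The same exclusion is easy to express in a script. In this sketch each bucket is a hypothetical dict with a `name` and a `tags` list, where each tag uses the `Key` / `Value` shape that S3 bucket tagging returns. The inventory format and function names are assumptions for illustration.

```python
def is_marked_public(tags: list) -> bool:
    """True if the tag set carries public-bucket=true."""
    return any(
        t.get("Key") == "public-bucket" and t.get("Value") == "true"
        for t in tags
    )


def buckets_to_check(buckets: list) -> list:
    """Keep only the buckets that are not tagged as intentionally public."""
    return [b for b in buckets if not is_marked_public(b.get("tags", []))]


if __name__ == "__main__":
    inventory = [
        {"name": "public-website-assets", "tags": [{"Key": "public-bucket", "Value": "true"}]},
        {"name": "finance-reports", "tags": []},
    ]
    for bucket in buckets_to_check(inventory):
        print(bucket["name"])
```

Untagged buckets stay in scope by default, so forgetting a tag results in a stricter check, not a skipped one.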

Summary

The goal of this post is to provide you with new perspectives focused on controlling S3 bucket access. As a bonus we also briefly touched on software based verification of policies which can take your security and governance journey to the next level. I would like to personally thank Scott Ward from AWS for his great insights.