When operating Cloud Custodian, it is highly recommended to treat policy
files as code, much like Terraform or CloudFormation files. Cloud
Custodian has a built-in dryrun mode and policy syntax validation which, when
paired with an automated CI system, can help you release policies with confidence.

This tutorial assumes that you have working knowledge of GitHub, Git, Docker,
and a continuous integration tool (Jenkins, Drone, Travis, etc.).

To begin, check your policy files into a source control management
tool like GitHub. This lets you version your policies and collaborate through
pull requests and issues. In this example, we will be setting up a new repo
in GitHub.

First, set up a new repo in GitHub and grab the repository URL. You don’t
need to add a README or any other files to it first.
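The local side of this setup can be sketched as follows. The directory name `my-policies`, the helper name `init_policy_repo`, and the minimal `aws-vpcs` policy are assumptions made for illustration; the repository URL is whatever you copied from GitHub.

```shell
# Sketch: create a local policy repo and point it at the new GitHub repo.
# init_policy_repo is a hypothetical helper, not part of Cloud Custodian.
# The body runs in a subshell so the caller's working directory is untouched.
init_policy_repo() (
  repo_url="$1"
  dir="${2:-my-policies}"
  mkdir -p "$dir" && cd "$dir" || exit 1
  git init || exit 1
  git remote add origin "$repo_url" || exit 1
  # A minimal, illustrative policy: it selects VPCs and takes no action.
  cat > policy.yml <<'EOF'
policies:
  - name: aws-vpcs
    resource: aws.vpc
EOF
  git add policy.yml || exit 1
  git commit -m "Add initial Custodian policy"
)
```

Calling `init_policy_repo <repository-url>` and then running `git push -u origin master` from inside the new directory would publish the first commit.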

Next, enable a CI webhook back to your CI system of choice when pull requests
targeting your master branch are opened or updated. This allows us to continuously
test and validate the policies that are being modified.

This configuration will install Cloud Custodian and validate the policy.yml file
that we created in the previous step.
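Whatever CI system you use, the build step reduces to two commands. The following is a minimal sketch rather than a configuration for any specific CI product; the function names and the `.venv` directory are assumptions, while `c7n` is the PyPI package that provides the `custodian` command.

```shell
# Sketch of a CI build step: install Cloud Custodian, then validate a policy file.
# install_custodian and validate_policy are hypothetical helper names.

install_custodian() {
  # Install into a virtual environment so the CI host stays clean.
  python3 -m venv .venv &&
    . .venv/bin/activate &&
    pip install c7n
}

validate_policy() {
  # `custodian validate` exits non-zero when the file fails validation,
  # which is what causes the CI build to fail.
  custodian validate "${1:-policy.yml}"
}
```

A CI job would then run `install_custodian && validate_policy policy.yml` on every pull request.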

Finally, we can run the new policies against your cloud environment in dryrun mode.
In this mode, Custodian only queries the resources and applies the policy filters; no
actions are executed. This allows you to assess the potential blast radius of a given
policy change.
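A dryrun invocation might look like the sketch below. The helper name `dryrun_policy` and the default `output` directory are assumptions; `--dryrun` and `-s` are options of `custodian run`.

```shell
# Sketch: run a policy file in dryrun mode.
# dryrun_policy is a hypothetical helper name.
dryrun_policy() {
  local policy_file="${1:-policy.yml}"
  local output_dir="${2:-output}"
  # --dryrun: query and filter resources only, skip all actions.
  # -s: directory where Custodian writes its per-policy output.
  custodian run --dryrun -s "$output_dir" "$policy_file"
}
```

After `dryrun_policy policy.yml`, the resources each policy matched can be inspected in `output/<policy-name>/resources.json`.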

Setting up the automated dryrun of policies is left as an exercise for the user. It
requires either storing your cloud authentication tokens inside a CI system, or hosting your
own CI system and using Managed Service Identities (Azure) or Instance Profiles (AWS).

It’s important to verify that the results of the dryrun match your expectations. Custodian
is a very powerful tool that will do exactly what you tell it to do! In this case, you should
always “measure twice, cut once”.

To run Cloud Custodian against your account, you will need an IAM role with appropriate
permissions. These permissions depend on the scope of each policy and may differ from policy
to policy. As a baseline, the managed read-only policies in each of the respective cloud
providers will be enough to dryrun your policies. Actions will require additional IAM
permissions, which should be added at your discretion.

For serverless policies, Custodian will need the corresponding permissions to provision
serverless functions.

In AWS, you will need ReadOnly access as well as the following permissions:

Note: These are just the permissions to deploy Custodian Lambda functions. They are not
the permissions that are required to run Custodian _in_ the Lambda function. Those roles
are defined in the role attribute in the policy or with the assume role used in the CLI.

Now that your policies are stored and available in source control, you can
fill in the next pieces of the puzzle to deploy. The simplest way to operate
Cloud Custodian is to run it against a single account
on a virtual machine.

To start, create a virtual machine on your cloud provider of choice.
It’s recommended to execute Cloud Custodian in the same cloud provider
that you are operating against, to avoid a hard dependency of one cloud
on another.

Once you have Cloud Custodian installed, download the policies that you created
in the Compliance as Code section. If using git, simply clone the repository:

$ git clone <repository-url>

You now have your policies and Custodian available on the instance. Typically, policies
that query the existing resources in the account/project/subscription should be run
on a regular basis to ensure that resources stay compliant. To do this, you
can set up a cron job to run Custodian on a set cadence.
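As a rough sketch, a crontab entry for this could be generated as follows. The helper name `custodian_cron_entry`, the `output` directory, and the `custodian.log` file are assumptions for illustration. The full path to the `custodian` executable is passed in because cron jobs run with a minimal `PATH`; paths are assumed to contain no spaces.

```shell
# Sketch: build a crontab line that runs Custodian on a schedule.
# custodian_cron_entry is a hypothetical helper name.
#   $1: cron schedule, e.g. "0 * * * *" for hourly
#   $2: directory containing policy.yml (your cloned policy repo)
#   $3: absolute path to the custodian executable
custodian_cron_entry() {
  local schedule="$1" policy_dir="$2" custodian_bin="$3"
  printf '%s cd %s && %s run -s output policy.yml >> custodian.log 2>&1\n' \
    "$schedule" "$policy_dir" "$custodian_bin"
}
```

To install the entry alongside any existing jobs, the output can be appended to the current crontab, for example `(crontab -l 2>/dev/null; custodian_cron_entry "0 * * * *" <policy-dir> <path-to-custodian>) | crontab -`.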

For more advanced setups, such as executing Custodian against multiple accounts, we
distribute the tool c7n-org. c7n-org uses an accounts configuration file and
assumed roles to operate against multiple accounts, projects, or subscriptions in
parallel. More information can be found in c7n-org: Multi Account Custodian Execution.
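A minimal AWS sketch of this setup is shown below. The account name, the account ID `123456789012`, the role name `CloudCustodian`, and both helper names are placeholders for illustration; substitute the accounts and roles you actually use.

```shell
# Sketch: write an accounts file and run a policy file across all accounts in it.
# write_accounts_file and run_across_accounts are hypothetical helper names.

write_accounts_file() {
  # One entry per account; c7n-org assumes the listed role in each.
  cat > accounts.yml <<'EOF'
accounts:
  - name: account-1
    account_id: "123456789012"
    role: arn:aws:iam::123456789012:role/CloudCustodian
    regions:
      - us-east-1
EOF
}

run_across_accounts() {
  # -c: accounts file, -s: output directory, -u: policy file.
  # --dryrun keeps this sketch read-only; drop it to execute actions.
  c7n-org run -c accounts.yml -s output -u "${1:-policy.yml}" --dryrun
}
```

Running `write_accounts_file` once and then `run_across_accounts policy.yml` would dryrun the policies in every listed account and region.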

When policy files grow large, dryruns can take a long time to execute.
In most cases, only the policies that were changed actually need
to be tested.
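One coarse way to approximate this with plain git, independent of any Custodian tooling, is to dryrun only the policy files that differ from the base branch. This works at file granularity rather than per policy. The helper names are hypothetical, and the sketch assumes every `.yml` file in the repository is a policy file.

```shell
# Sketch: dryrun only the policy files changed relative to a base branch.
# changed_policy_files and dryrun_changed_policies are hypothetical helper names.

changed_policy_files() {
  # Three-dot diff: changes on HEAD since it diverged from the base branch.
  # --diff-filter=ACMR keeps added, copied, modified, and renamed files,
  # so deleted files are skipped.
  git diff --name-only --diff-filter=ACMR "${1:-master}...HEAD" -- '*.yml'
}

dryrun_changed_policies() {
  changed_policy_files "${1:-master}" | while IFS= read -r policy_file; do
    # Redirect stdin so the command cannot consume the file list.
    custodian run --dryrun -s output "$policy_file" < /dev/null || exit 1
  done
}
```

In a pull request build, `dryrun_changed_policies master` would stop at the first file whose dryrun fails and return a non-zero status.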

The following example will download the cloudcustodian/policystream image and
generate a policy file containing only the policies that changed between the most
recent commit and master.