RedShift Mega Maid

This project is a simple Docker container built from awslabs/amazon-redshift-utils. Its purpose is to VACUUM and ANALYZE a RedShift cluster as a background job.

Development

This project really depends on awslabs/amazon-redshift-utils. If you want to fix bugs & make changes to the script, fork that repo & send Pull Requests upstream!

Building

This container uses a single Makefile for ease of building & deployment. It also uses a basic "Build Tools" container that includes the standard gcc + make toolchain. This should make it easy to run the build in any CI environment that supports Docker.

make container-build

This spins up a clean docker container, mounting in the project directory, and calls make clean build.

make build

This runs docker build to install the script's dependencies & build the container.

make package

This uses Dockerfile to build the releasable Docker container.

make deploy

Deploy

We tag our images with the date and git hash, and also with latest. After a deploy, the image should be in ECR. You can list the images in ECR with aws ecr list-images --repository-name redshift-mega-maid.
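As a rough illustration of a date-plus-hash tag, here is a minimal sketch. The exact tag format used by the Makefile is not shown in this README, so the YYYYMMDD-hash layout and the make_tag helper below are assumptions for illustration only.

```shell
#!/bin/sh
# Hypothetical tag format (an assumption, not taken from the Makefile):
#   <YYYYMMDD>-<short git hash>
make_tag() {
    # $1 = date stamp, $2 = git hash
    printf '%s-%s\n' "$1" "$2"
}

# Fall back to "unknown" when not inside a git checkout.
git_hash=$(git rev-parse --short HEAD 2>/dev/null || echo unknown)
make_tag "$(date +%Y%m%d)" "$git_hash"
```

Check the Makefile for the real format before relying on a tag built this way.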

Running

TODO: Kubernetes Deployment option

The container takes environment variables and passes them to the analyze-vacuum-schema.py script as arguments. Most are given defaults in Dockerfile but can be overridden. As a general rule, the environment variable names follow the convention $MM_ARGUMENT_NAME, where the corresponding command line argument would be --argument-name: prefix the argument name with MM_, upper-case it, and replace all dashes (-) with underscores (_).
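The naming convention can be sketched as a small shell function. The to_env_name helper is hypothetical and is not part of this project; it only illustrates the mapping described above.

```shell
#!/bin/sh
# Hypothetical helper: map a command line flag to its MM_ variable name.
to_env_name() {
    # Strip the leading dashes, upper-case the rest, then swap '-' for '_'.
    name=$(printf '%s' "$1" | sed 's/^--*//' | tr '[:lower:]' '[:upper:]' | tr '-' '_')
    printf 'MM_%s\n' "$name"
}

to_env_name --argument-name   # prints MM_ARGUMENT_NAME
to_env_name --output-file     # prints MM_OUTPUT_FILE
```

The resulting name is what you would pass to docker run with -e, for example -e MM_OUTPUT_FILE=/some/path.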

Working with the Container

Logs

Within the container the logs are simply printed to stdout via /dev/stdout. To override this, set MM_OUTPUT_FILE to something else. If you wish to persist logs, you will want to pass in a volume mount for the container to write the file to with -v.
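A minimal sketch of persisting logs to the host follows. The image name, the host and container directories, and the maid.log file name are all assumptions; substitute your own values. The script only prints the docker run command so you can review it before running it.

```shell
#!/bin/sh
# Hypothetical values: adjust the image name and paths to your setup.
IMAGE="redshift-mega-maid:latest"
HOST_LOG_DIR="$PWD/logs"
CONTAINER_LOG_DIR="/var/log/mega-maid"

# Print a docker run command that mounts a host directory with -v and
# points MM_OUTPUT_FILE at a file inside that mount.
log_run_command() {
    printf 'docker run -v %s:%s -e MM_OUTPUT_FILE=%s/maid.log %s\n' \
        "$HOST_LOG_DIR" "$CONTAINER_LOG_DIR" "$CONTAINER_LOG_DIR" "$IMAGE"
}

log_run_command
```

Any other MM_ variables the script needs would be added to the same command with further -e flags.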

Since we're using an ENTRYPOINT rather than a CMD in Dockerfile, anything you pass to docker run after the image name is appended to the ENTRYPOINT as arguments. This means that if you want to run other commands inside the container (e.g. spawn a shell to interactively use other tools in awslabs/amazon-redshift-utils), you'll have to override the ENTRYPOINT with:

docker run --entrypoint='/path/to/your-entrypoint-here'

Getting into the Container

If for some reason you need to run a container and get a shell, override the entrypoint: docker run -it --entrypoint=/bin/bash <IMAGE>

License

This project is mostly packaging and wrapper scripts for the amazon-redshift-utils tools. As such, nothing in this repository is "novel", or "non-obvious". This repo is therefore released under the Apache 2.0 License.