Logging for CoreOS and Kubernetes: How Containerization Saved the Day!

It seemed a simple request: store the system logs from our EC2 instances living in AWS. Seems pretty fundamental. After all, Amazon already provides an agent for shipping logs to their API. This should be a no-brainer. A slam-dunk!

Not so fast! Our instances run CoreOS as part of our Kubernetes infrastructure, and no logs were showing up. We needed to get creative…

Things get interesting

On a more traditional Linux-based system, it would be a simple matter of running the CloudWatch agent, which tails one or more log files and sends their contents to AWS via the CloudWatch APIs.

CoreOS is a minimal OS designed for running containers, and it comes with Docker pre-installed as a systemd service. This, along with its focus on security and reliability, is a large part of why we chose CoreOS for running our Kubernetes infrastructure - more on this in a future blog post. As part of this minimal design, CoreOS does not provide a package management system, like yum or apt, and does not come with a traditional syslog daemon.

Accessing the logs

If you’re familiar with CoreOS, you will know that it uses journald, systemd’s logging system, for managing logs. The journald daemon stores log information in a binary format. A quick look at the docs for journald and journald.conf shows some configuration options, none of which gave us access to the logs in the format we required. CoreOS does, however, provide the journalctl utility for accessing the machine’s journal.

With this information, the obvious solution is to use journalctl, which can follow the systemd journal with the -f flag: journalctl -f. Using this approach we can pipe the logs somewhere. But where? One option would be to pipe them to a file and have the CloudWatch agent read that file. However, without a package manager, installing the CloudWatch agent is not simple. Plus, we want something that is easy to update and works with our existing container infrastructure. We need something else…

Sending the logs

We are big fans of Docker at InVision, so it’s only natural to look for a solution running in a container. A while back, Amazon announced its container service, ECS. They posted a blog article about sending container logs to CloudWatch, “Send ECS Container Logs to CloudWatch Logs for Centralized Monitoring”. They were also kind enough to provide a working example, which someone had already ported to CoreOS: cloudwatchlogs.

This implementation runs rsyslog and the CloudWatch agent in a single container. rsyslog listens on a port (in this case 514) and writes the syslog messages it receives to the container’s filesystem. The CloudWatch agent reads these files and sends the logs to the CloudWatch API. Both processes are managed by supervisord. We modified this implementation a bit internally, mostly to allow setting the Log Group name via an environment variable and to change the log file location.

Now we have somewhere to send the output of journalctl -f, but how do we get it to the container? There just so happens to be a utility already installed on CoreOS that suits this situation well: ncat. Ncat is a small utility for reading and writing data across a network. Using ncat allows us to do the following: /usr/bin/journalctl -o short -f | /usr/bin/ncat 127.0.0.1 514.

Putting it all together

The CloudWatch Agent requires permission to write to the CloudWatch Service. You will need to grant the AWS User Account the following permissions:
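
A minimal sketch of such a policy, assuming the agent only needs to create log groups and streams and push events. The file name cloudwatchlogs-policy.json is a placeholder, and the wildcard resource is an assumption you will probably want to narrow to your own log group.

```shell
# Write a minimal IAM policy document for the CloudWatch Logs agent.
# The file name is illustrative and the Resource is deliberately broad.
cat > cloudwatchlogs-policy.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "logs:CreateLogGroup",
        "logs:CreateLogStream",
        "logs:PutLogEvents",
        "logs:DescribeLogStreams"
      ],
      "Resource": "arn:aws:logs:*:*:*"
    }
  ]
}
EOF
```

One way to attach it is aws iam put-user-policy with --policy-document file://cloudwatchlogs-policy.json, supplying your own user and policy names.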

CoreOS uses cloud-config for its instance setup, and we make extensive use of it for the initial configuration of all our EC2 instances. Getting the logs flowing is a simple matter of adding the two services described below as systemd units in the cloud-config script.

First we need to configure the CloudWatch container to listen for logs. Since this is a Docker container we want it to run after the Docker service has started.
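
As a rough sketch, a unit for the container might look like the following. The image name example/cloudwatchlogs:latest, the AWS_LOG_GROUP variable, and the my-log-group value are hypothetical stand-ins rather than the names we actually use, and passing AWS credentials to the container is left out.

```shell
# Write a sketch of the unit file to the current directory.
# The image name and AWS_LOG_GROUP variable are hypothetical.
cat > cloudwatchlogs.service <<'EOF'
[Unit]
Description=CloudWatch Logs container (rsyslog + CloudWatch agent)
After=docker.service
Requires=docker.service

[Service]
Restart=always
RestartSec=10
TimeoutStartSec=0
# A leading "-" tells systemd to ignore a failure of that command,
# for example when no old container exists yet.
ExecStartPre=-/usr/bin/docker kill cloudwatchlogs
ExecStartPre=-/usr/bin/docker rm cloudwatchlogs
ExecStartPre=/usr/bin/docker pull example/cloudwatchlogs:latest
# Publish rsyslog's port 514 on the host loopback interface only.
ExecStart=/usr/bin/docker run --name cloudwatchlogs -p 127.0.0.1:514:514 -e AWS_LOG_GROUP=my-log-group example/cloudwatchlogs:latest
ExecStop=/usr/bin/docker stop cloudwatchlogs

[Install]
WantedBy=multi-user.target
EOF
```

Binding the published port to 127.0.0.1 keeps the syslog listener off the network, which is enough here because the journal is forwarded from the same host.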

For the worker nodes that run as part of our Kubernetes infrastructure, the cloudwatchlogs.service can be deployed as a DaemonSet instead. This lets Kubernetes, rather than systemd, manage the lifecycle of the CloudWatch log service, and it fits in well with the rest of our application infrastructure.
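
A DaemonSet for this could be sketched roughly as below. The image name, namespace, labels, and AWS_LOG_GROUP variable are all hypothetical. The manifest assumes a current apps/v1 API and uses hostNetwork so that rsyslog's port 514 is reachable on the node's own 127.0.0.1.

```shell
# Write a sketch of a DaemonSet manifest to the current directory.
# Image, namespace, labels and the env variable are hypothetical.
cat > cloudwatchlogs-daemonset.yaml <<'EOF'
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: cloudwatchlogs
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app: cloudwatchlogs
  template:
    metadata:
      labels:
        app: cloudwatchlogs
    spec:
      # Share the node's network namespace so the listener binds on the host.
      hostNetwork: true
      containers:
        - name: cloudwatchlogs
          image: example/cloudwatchlogs:latest
          env:
            - name: AWS_LOG_GROUP
              value: my-log-group
          ports:
            - containerPort: 514
EOF
```

It can then be applied with kubectl apply -f cloudwatchlogs-daemonset.yaml.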

Second, we need to configure a service that sends the logs to the container. This is just a matter of wrapping our journalctl command in a systemd service unit. We want it to start after the cloudwatchlogs.service has started.
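
A minimal sketch of that unit, assuming the container is managed by a systemd unit named cloudwatchlogs.service on the same host. The Requires= line and the restart settings are assumptions. On nodes where the container runs as a DaemonSet, the dependency lines would need to be dropped.

```shell
# Write a sketch of the forwarding unit to the current directory.
cat > journalctl-output.service <<'EOF'
[Unit]
Description=Forward the systemd journal to the cloudwatchlogs container
After=cloudwatchlogs.service
Requires=cloudwatchlogs.service

[Service]
# Restart the pipeline if either side exits, e.g. when the container restarts.
Restart=always
RestartSec=10
# ExecStart does not interpret pipes itself, so wrap the pipeline in a shell.
ExecStart=/bin/sh -c '/usr/bin/journalctl -o short -f | /usr/bin/ncat 127.0.0.1 514'

[Install]
WantedBy=multi-user.target
EOF
```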

If you prefer to manually install these to test it out, you can save the content of each of these services to individual files (in this example journalctl-output.service and cloudwatchlogs.service) and place them in the /etc/systemd/system/ directory. Make sure permissions are set correctly and then enable them like so:
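
A hedged sketch of those steps, assuming both unit files sit in the current directory. It only prints the commands by default. The DRY_RUN switch and the run and install_units helpers are illustrative names, not part of systemd.

```shell
# Print the commands by default; set DRY_RUN=0 on a systemd host to run them.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

install_units() {
  # Copy the unit files into place, world-readable and owned by root.
  run sudo install -m 0644 cloudwatchlogs.service journalctl-output.service /etc/systemd/system/
  # Make systemd re-read its unit files, then enable at boot and start now.
  run sudo systemctl daemon-reload
  run sudo systemctl enable cloudwatchlogs.service journalctl-output.service
  run sudo systemctl start cloudwatchlogs.service journalctl-output.service
}

install_units
```

Afterwards, systemctl status journalctl-output.service should show the forwarding pipeline running.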

And That’s It!

You should start to see your logs showing up in the AWS CloudWatch Logs console. Note: logs are grouped under the log group defined in the awslogs configuration and then logged under each instance name. You can change this by editing the awslogs.conf file.
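
For orientation, an awslogs.conf along these lines would produce that grouping. The log file path and the my-log-group name are assumptions, not our actual settings. The {instance_id} placeholder is one of the stream-name variables the agent supports.

```shell
# Write a sketch of an awslogs.conf; path and group name are hypothetical.
cat > awslogs.conf <<'EOF'
[general]
state_file = /var/awslogs/state/agent-state

[/var/log/syslog]
file = /var/log/syslog
log_group_name = my-log-group
log_stream_name = {instance_id}
datetime_format = %b %d %H:%M:%S
EOF
```

Changing log_stream_name to {hostname} would name each stream after the host instead of the instance ID.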

It’s important to note that this implementation does not guarantee that 100% of the journald logs will be sent to CloudWatch. The container will be restarted if it crashes (and supervisord manages the individual processes inside it), but due to timing issues some logs may be lost. We have been running this in production for a few months now and have not had any issues. The overall system impact is minimal and the combined services seem stable.