Integrating into Qualtrics: Docker Deployment

About a year ago, Qualtrics made their first acquisition, a small company called Statwing. I was a part of the Statwing team and have worked on integrating the Statwing product into the Qualtrics platform over the last year. This post will cover how our application deployment changed when we joined Qualtrics.

As a smaller startup, we made some trade-offs with our architecture and deployment to move quickly, but those trade-offs resulted in some harder-to-maintain hacks that required deep tribal knowledge. When we joined Qualtrics, we expected to migrate to a more mature deployment pipeline, and that there might be more friction in some processes. Thankfully, we got the mature pipeline without much pain, because Qualtrics uses Docker, a tool for building and managing containers.

Deployment Model Before Qualtrics

The basics of what we were deploying: a standard web application server plus background job worker servers. A CDN served our assets, a queue triggered background jobs, and a cache stored the calculations those jobs produced.

As a small startup we wanted to quickly set up deployment and not have to worry about it once it was “working”. With that in mind, we deployed our application to AWS using a set of custom scripts.

First, we had a bootstrap script that saved an Amazon Machine Image (AMI) from which we would start instances. Whenever we needed to make package changes, we would update them manually and create a new AMI.

Second, for general code deployment, we used the Fabric library in Python to run remote commands on our servers from our continuous integration server. The basic deployment used a hardcoded list of servers, pulled the latest version of our code onto those servers, and then restarted the web server and background worker processes.
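In essence, that deployment task boiled down to a few remote commands per host. A simplified sketch of the idea (the host names, repository path, and service names here are illustrative, not our actual setup):

```shell
# Illustrative sketch of the commands our deploy task ran on each host.
# Hosts were hardcoded, which worked at our scale but would not have scaled further.
HOSTS="web-1.example.com web-2.example.com worker-1.example.com"

for host in $HOSTS; do
  ssh "$host" '
    cd /srv/app &&
    git pull origin master &&
    sudo systemctl restart app-web app-worker
  '
done
```

Fabric wraps this pattern in Python, handling the SSH connections and letting the host list and commands live in ordinary code.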

For application configuration, like a database URL, we injected environment variables, similar to how Heroku does. We saved these configurations in encrypted files and uploaded them to our machines whenever they changed.
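The pattern is simple: the application reads all of its configuration from the environment, and deployment just has to get the right variables set before the process starts. A minimal sketch (variable names and values are illustrative; in production the file lived encrypted and was decrypted before being sourced):

```shell
# Hypothetical config file of KEY=VALUE pairs.
cat > app.env <<'EOF'
DATABASE_URL=postgres://db.internal:5432/statwing
CACHE_URL=redis://cache.internal:6379
EOF

# Source the file with auto-export on, so every assignment becomes
# an environment variable visible to child processes.
set -a
. ./app.env
set +a

# The application then reads its configuration from the environment.
echo "connecting to $DATABASE_URL"
```

Because the application only ever looks at the environment, the same code runs unchanged in development, staging, and production with different values injected.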

As a small startup, our deployment and configuration was quick to set up and easy to automate for most cases. We had some manual steps for creating AMIs, but we didn't update them often. Given the number of servers we were working with, we could handle hardcoding references to them in code. While this worked at our size, if we had updated packages more often or maintained more machines in a more complex configuration, we would have hit bottlenecks, particularly when trying to scale to Qualtrics' size.

Inside Qualtrics

Thankfully, Qualtrics uses Docker to create containers that can be deployed on the Qualtrics infrastructure. Transitioning our custom bootstrap scripts was straightforward: we turned them into Dockerfiles by converting each line of a script into a RUN instruction.
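For example, a bootstrap script that installed system packages and Python dependencies translates almost line for line. A hedged sketch (the base image and package names are illustrative, not our actual Dockerfile):

```dockerfile
FROM ubuntu:16.04

# Each line of the old bootstrap script becomes a RUN instruction.
RUN apt-get update && apt-get install -y \
    python \
    python-pip \
    libpq-dev \
 && rm -rf /var/lib/apt/lists/*

COPY requirements.txt /app/requirements.txt
RUN pip install -r /app/requirements.txt

COPY . /app
WORKDIR /app
CMD ["python", "server.py"]
```

The resulting image plays the role the AMI used to: a frozen environment with all packages installed, but rebuilt automatically instead of saved by hand.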

Moving to Docker builds helped us keep a consistent environment for packages and removed our hacked-together bootstrap scripts. We also no longer needed to manually save AMIs. Configuration and code deployment were simplified because we no longer needed a custom script to update code and restart processes; Docker now handles these steps with a pull and a restart. Furthermore, any other team can quickly understand our process because it is standardized around Docker, with no custom scripts to learn.
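Concretely, a deploy now reduces to something like the following (the registry, image name, and tag are illustrative):

```shell
# Pull the new image, then replace the running container with it.
docker pull registry.example.com/statwing/web:1.2.3
docker stop web && docker rm web
docker run -d --name web registry.example.com/statwing/web:1.2.3
```

Every step here is a standard Docker command rather than something team-specific, which is what makes the process legible to other teams.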

Another benefit came in our development environment setup. Using Docker allowed us to use Docker Compose, a tool for defining and running multi-container applications. Now our environment can be brought up in a way that better matches production. On-boarding onto our team went from installing multiple dependencies to get the code running locally to just installing Docker and running docker-compose -f compose.yaml up.
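A trimmed-down compose file for a stack like ours (web server, background worker, queue-and-cache-style backing services) might look like this; the service names, images, and values are illustrative, not our actual configuration:

```yaml
# Hypothetical compose.yaml approximating a web + worker development stack.
version: "2"
services:
  web:
    build: .
    ports:
      - "8000:8000"
    environment:
      - DATABASE_URL=postgres://db:5432/app
    depends_on:
      - db
      - cache
  worker:
    build: .
    command: python worker.py
    depends_on:
      - db
      - cache
  db:
    image: postgres:9.5
  cache:
    image: redis:3.2
```

One `docker-compose -f compose.yaml up` then builds the application image and starts every service with the right links between them.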

Lastly, our previous deployment required a hardcoded list of servers; we gladly no longer have to keep track of that. Qualtrics uses Consul, a service discovery and configuration tool, and we were able to leverage it when we moved our service to Qualtrics.
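With Consul, each instance registers itself, and other services look it up by name rather than by a hardcoded address. A hypothetical service definition (the name, port, and health-check endpoint are illustrative):

```json
{
  "service": {
    "name": "statwing-web",
    "port": 8000,
    "checks": [
      {
        "http": "http://localhost:8000/health",
        "interval": "10s"
      }
    ]
  }
}
```

Clients can then resolve the service through Consul's DNS interface (e.g. `statwing-web.service.consul`), and instances that fail their health check drop out of the results automatically.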

Any pitfalls?

Our Docker containers were initially very large, about 1.2 GB. There are plenty of posts on slimming down large images (on Stack Overflow, on Red Hat's blog, and elsewhere); two tips made the biggest difference for us:

- Avoid unnecessary image layers: commands that build and then remove artifacts should run in a single RUN command, so the intermediate files never end up baked into a layer.

- Build dependencies in a separate container and copy only the resulting artifacts into your final container.
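The second tip is what Docker now supports natively as a multi-stage build (Docker 17.05+). A hedged sketch, with illustrative image names:

```dockerfile
# Stage 1: a fat image with compilers, used only to build dependencies.
FROM python:2.7 AS builder
COPY requirements.txt .
RUN pip install --prefix=/install -r requirements.txt

# Stage 2: a slim final image. Only the installed packages are copied
# over; compilers, headers, and pip caches stay behind in the builder.
FROM python:2.7-slim
COPY --from=builder /install /usr/local
COPY . /app
WORKDIR /app
CMD ["python", "server.py"]
```

Only the final stage is shipped, so none of the build tooling contributes to the image layers you pull at deploy time.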

These tips decreased docker pull times: the worst-case amount of data pulled dropped from more than 200 MB to about 10 MB. Less data means quicker pulls, and quicker pulls mean quicker deploys.

Another lesson was to ensure that any data that needs to persist beyond the life of a container is written to a volume mounted from an appropriate disk; otherwise, that data is lost when the container is restarted. We solved this by adding the appropriate mounts to our deploy configuration.
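In Docker terms, that means bind-mounting a host directory into the container. An illustrative example (the paths and image name are hypothetical):

```shell
# Persist worker output on the host so it survives container restarts.
# -v mounts the host path /mnt/data/results at /app/results inside the container.
docker run -d --name worker \
  -v /mnt/data/results:/app/results \
  registry.example.com/statwing/worker:1.2.3
```

Anything the process writes under /app/results lands on the host disk, so replacing the container during a deploy no longer discards it.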

We also learned that running multiple processes in a container must be managed carefully. If you want to run multiple processes, set up a process supervisor that forwards kill signals to your child processes so they can clean themselves up. Otherwise, it's possible to create many zombie processes and cause unexpected behavior; the zombie-reaping problem in Docker is well documented.
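One common fix is to run a minimal init such as tini as PID 1, so that signals are forwarded to children and orphaned zombies get reaped. A hedged Dockerfile fragment following the pattern from tini's documentation (the version and command are illustrative):

```dockerfile
# Run tini as PID 1: it forwards signals to the child process
# and reaps any zombie processes left behind.
ENV TINI_VERSION v0.19.0
ADD https://github.com/krallin/tini/releases/download/${TINI_VERSION}/tini /tini
RUN chmod +x /tini
ENTRYPOINT ["/tini", "--"]
CMD ["python", "server.py"]
```

On newer Docker versions, `docker run --init` achieves the same effect without changing the image.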

Conclusion

Docker helped us move to a more mature build and deployment process with fewer hacks, and the move was relatively painless. With it, we were able to move quickly on building the Statwing product within Qualtrics. The goal of Statwing is to make statistical analysis efficient and delightful, and at Qualtrics there is much to do to make that happen. If this is interesting to you, send me a note at johnl@qualtrics.com and come join us!

John works at Qualtrics building data analysis products. Previously, John was a co-founder of Statwing, and is a graduate of Stanford University with a bachelor's degree in Mathematics and a master's in Computer Science.

At Qualtrics, we build solutions that enable our customers to quickly gather, analyze, collaborate, and take meaningful actions with data that’s relevant to their employees, customers and markets. Here, you will get the backstory on how the people and processes within the engineering and product teams build solutions that make Qualtrics happen.