"innovative technology I think people should read about"

Docker-based FIO I/O benchmarking

What is FIO?

fio is a tool that will spawn a number of threads or processes doing a particular type of I/O action as specified by the user. The typical use of fio is to write a job file matching the I/O load one wants to simulate. – (https://linux.die.net/man/1/fio)

fio can be a great tool for helping to measure workload I/O of a specific application workload on a particular device or file. Fio proves to be a detailed benchmarking tool used for workloads today with many options. I personally came across the tool while working at EMC when needing to benchmark Disk I/O of application running in different Linux container runtimes. This leads me to my next topic.

Why Docker based fio-tools

One of the projects I was working on was using Docker on AWS and various private cloud deployments and we wanted to see how workloads performed on these different cloud environments inside Docker container with various CPU, Memory, Disk I/O limits with various block, flash, or DAS based storage devices.

One way to wanted to do this was to containerize fio and allow users to pass the workload configuration and disk to the container that was doing the testing.

The first part of this was to containerize fio with the option to pass in JOB files by pathname or by a URL such as a raw Github Gist.

The Dockerfile (below) is based on Ubuntu 14 which admittedly can be smaller but we can easily install fio and pass a CMD script called run.sh.

What does run.sh do? This script does a few things, is checked that you are passing a JOBFILE name (fio job) which without REMOTEFILES will expect it to exist in `/tmp/fio-data` it also cleans up the fio-data directory by copying the contents which may be jobs files out and then back in while removing any old graphs or output. If the user passes in REMOTEFILES it will be downloaded from the internet with wget before being used.

There are two other Dockerfiles that are aimed at doing two other operations. 1. Producing graphs of the output data with fio2gnuplot and serving the graphs and output from a python SimpleHTTPServer on port 8000.

All Dockerfiles and examples can be found here (https://github.com/wallnerryan/fio-tools) and it also includes an All-In-One image that will run the job, generate the graphs and serve them all in one which is called fiotools-aio.

Other Examples

Important

Your fio job file should reference a mount or disk that you would like to run the job file against. In the job fil it will look something like: directory=/my/mounted/volume to test against docker volumes

If you want to run more than one all-in-one job, just use -v /tmp/fio-data instead of -v /tmp/fio-data:/tmp/fio-data This is only needed when you run the individual tool images separately

Notes

The fio-tools container will clean up the /tmp/fio-data volume by default when you re-run it.

If you want to save any data, copy this data out or save the files locally.

How to get graphs

When you serve on port 8000, you will have a list of all logs created and plots created, click on the .png files to see graph (see below for example screen)

Testing and building with codefresh

As a side note, I recently added this repository to build on Codefresh. Right now, it builds the fiotools-aio Dockerfile which I find most useful and moves on but it was an easy experience that I wanted to add to the end of this post.

Navigate to https://g.codefresh.io/repositories? or create a free account by logging into codefresh with your Github account. By logging in with Github it will have access to your repositories you gave access to and this is where the fio-tools images are.

I added the repository as a build and configured it like so.

This will automatically build my Dockerfile and run any integration tests and unit tests I may have configured in codefresh, thought right now I have none but will soon add some simple job to run against a file as an integration test with a codefresh composition.

Conclusion

I found over my time using both native linux tools and docker-based or containerized tools that there is need for both sometimes and in fact when testing container-native application workloads sometimes it is best to get metrics or benchmarks from the point of view of the application which is why we chose to run fio as a microservice itself.

2 thoughts on “Docker-based FIO I/O benchmarking”

Thanks. Sure the OCI is a great foundation, the past two companies I worked for are members and the basic objective is to help standards in the container technology industry. Some good efforts already than can be seen are RunC and Containerd from Docker, rkt from CoreOS etc where container APIs are standard driven by OCI.