How I Reduced My Virtual Machine Disk Images By Over 75% With QEMU

Lately I've been wanting to replace the hard drive in my laptop with a nice, fast second-generation Intel X25-M. Unfortunately, my disk footprint is a little too large to fit comfortably on the drive I want to buy, so I have been looking to free up space wherever I can.

What I Started With

I had three test VirtualBox images totaling around 8.5 GB. Since two of the images were running the same version of Ubuntu, I figured that QEMU's qcow2 copy-on-write disk images would save me a bit of room. My original three machines looked like this:

The first two were Ubuntu 8.04 servers, the last was running Ubuntu 9.04. I decided that it would be easiest to rebuild these machines from scratch and migrate over whatever data was needed.

Creating The Base Disk Images

To stay organized, I created a pair of directories for disk images, one for the read-only base images and another for the writable images:

~/qemu/qcow-ro
~/qemu/qcow-rw
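
A minimal sketch of setting up that layout. The directory names come from the listing above; the helper name make_image_dirs is hypothetical, and the chmod step is only a suggestion.

```shell
# make_image_dirs PARENT: create the two image directories under PARENT.
# (make_image_dirs is a hypothetical helper, not part of the original setup.)
make_image_dirs() {
    mkdir -p "$1/qcow-ro" "$1/qcow-rw"
}

# Usage: make_image_dirs "$HOME/qemu"
# Once the base images are finished, removing write permission guards against
# accidental changes:  chmod a-w "$HOME"/qemu/qcow-ro/*.qcow2
```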

I created two temporary disk images to hold fairly stripped-down installations of Ubuntu Hardy and Jaunty server, and I set up the initial machines without swap space. I installed all the latest updates, removed any outdated kernel packages (they are surprisingly large!), and installed any software that I knew I would want available in all my future machines.

Once everything was installed, I needed to clean things up, and I think I made a small error here. I ran apt-get clean to remove any downloaded packages. Deleting a file does not erase its data from the blocks it occupied in the disk image, so I probably should have zeroed out the files first with a command like shred -n 1 -z /var/cache/apt/archives/*.deb and then deleted them.
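
A rough sketch of the zero-then-delete cleanup, to be run inside the guest. It uses the shred flags mentioned above plus -u to remove each file afterwards; the function name zero_and_remove is hypothetical, and it assumes a filesystem that overwrites file data in place.

```shell
# zero_and_remove DIR: overwrite every .deb in DIR, finish with zeros, delete it.
# (zero_and_remove is a hypothetical helper name.)
zero_and_remove() {
    dir="$1"
    for f in "$dir"/*.deb; do
        [ -e "$f" ] || continue   # the glob matched nothing
        # -n 1: one overwrite pass; -z: final pass of zeros; -u: remove the file
        shred -n 1 -z -u "$f"
    done
}

# Usage (as root, inside the guest): zero_and_remove /var/cache/apt/archives
```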

I also made sure to delete the ssh host keys (rm /etc/ssh/ssh_host_*). When I boot a new image, I make sure to run dpkg-reconfigure openssh-server to generate a new set of keys for the new server.
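
The key cleanup can be wrapped in a small function, sketched here. The optional ROOT argument is an assumption added so the function can also be pointed at a mounted image or a scratch directory; remove_host_keys is a hypothetical name.

```shell
# remove_host_keys [ROOT]: delete the SSH host keys under ROOT/etc/ssh.
# With no argument it acts on the running system's /etc/ssh.
remove_host_keys() {
    root="${1:-}"
    rm -f "$root"/etc/ssh/ssh_host_*
}

# Before shutting down the base image:   remove_host_keys
# On the first boot of a new machine:    dpkg-reconfigure openssh-server
```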

At this point I was left with two images that needed to be shrunk down. The compression of qcow2 images is only applied when the image is created; any later writes to the same disk image will be uncompressed. I used qemu-img convert to recompress the images.
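
A minimal sketch of the recompression step, assuming a qemu-img on the PATH. The file names in the usage line and the QEMU_IMG override variable are hypothetical.

```shell
# recompress_image SRC DST: write a compressed qcow2 copy of SRC to DST.
# QEMU_IMG is a hypothetical override for the qemu-img binary to run.
recompress_image() {
    src="$1"
    dst="$2"
    # -c compresses the data written to the target; -O selects the output format.
    "${QEMU_IMG:-qemu-img}" convert -c -O qcow2 "$src" "$dst"
}

# Usage: recompress_image hardy-temp.qcow2 ~/qemu/qcow-ro/hardy-root.qcow2
```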

This left me with two smaller base images:

total 834M
438M hardy-root.qcow2
397M jaunty-root.qcow2

From here I just had to create more qcow2 images using one of these two as the base. Specifying the full path for the base_image seemed to be important. For example:
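
A minimal sketch of that step. The helper name create_overlay, the file names in the usage line, and the QEMU_IMG override are hypothetical; readlink -f is used to turn the base image path into a full path.

```shell
# create_overlay BASE OVERLAY: create a qcow2 image backed by BASE.
# QEMU_IMG is a hypothetical override for the qemu-img binary to run.
create_overlay() {
    base="$(readlink -f "$1")"   # resolve to an absolute path
    overlay="$2"
    # -b names the backing file; -F states its format (recent QEMU releases
    # require it, very old ones may not recognize it).
    "${QEMU_IMG:-qemu-img}" create -f qcow2 -b "$base" -F qcow2 "$overlay"
}

# Usage: create_overlay ~/qemu/qcow-ro/hardy-root.qcow2 ~/qemu/qcow-rw/web-test.qcow2
```

Only blocks that differ from the base image are stored in the new file, so the base image must stay unchanged once other images depend on it.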

To save a bit more space, I created an empty swap disk image and ran mkswap on it. The scripts I wrote for starting my QEMU machines make a copy of the empty swap image as needed, and the copy is deleted when the machine shuts down.
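
The start scripts themselves are not shown here, so the following is only a sketch of the idea under some assumptions: the swap image is a raw sparse file, and the function names, the temporary file location, and the command line in the usage note are all hypothetical.

```shell
# make_swap_template FILE SIZE: create a sparse raw file and format it as swap.
make_swap_template() {
    truncate -s "$2" "$1"
    mkswap "$1" >/dev/null
}

# run_with_swap TEMPLATE COMMAND...: copy the template, run COMMAND with the
# copy's path appended as its last argument, then delete the copy.
run_with_swap() {
    template="$1"
    shift
    swap_copy="$(mktemp "${TMPDIR:-/tmp}/swap.XXXXXX")"
    cp --sparse=always "$template" "$swap_copy"
    "$@" "$swap_copy"
    status=$?
    rm -f "$swap_copy"   # the copy only lives as long as the machine runs
    return "$status"
}
```

For instance, make_swap_template swap-template.img 512M followed by run_with_swap swap-template.img qemu -hda web-test.qcow2 -hdb would hand the temporary copy to the guest as its second disk. The name of the QEMU executable varies between releases.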

Squeezing Out a Bit More Space

After I finished loading all the data into my test images I decided to try to recompress the images using qemu-img convert. In one case, I saved about 200 MB, which was about 40%. In most cases, the images got bigger! Zeroing out files before deleting them probably would have helped quite a bit in this case as well.
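
Since recompressing can make an image larger, one cautious approach is to convert to a new file and keep it only if it is actually smaller. This is a sketch with a hypothetical helper name, not the procedure used above.

```shell
# keep_smaller ORIGINAL CANDIDATE: replace ORIGINAL with CANDIDATE only when
# CANDIDATE is smaller; otherwise discard CANDIDATE.
keep_smaller() {
    original="$1"
    candidate="$2"
    old_size=$(wc -c < "$original")
    new_size=$(wc -c < "$candidate")
    if [ "$new_size" -lt "$old_size" ]; then
        mv "$candidate" "$original"
    else
        rm -f "$candidate"
    fi
}

# Usage:
#   qemu-img convert -c -O qcow2 vm1.qcow2 vm1.new.qcow2 \
#       && keep_smaller vm1.qcow2 vm1.new.qcow2
# Caveat: for an image that has a backing file, convert without -B writes the
# base image's data into the output as well.
```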

What I Ended Up With

After setting up equivalents of my original three machines, my disk footprint was just under 2 GB. I was very happy with that result. I have added another server and I am now right around 2.5 GB. Here are the images I am currently using:

Overall, I am very happy with the savings in disk space. It is also less costly in both time and disk space to spin up a temporary test machine if I need one.
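
As a quick check of the headline figure, the saving can be computed from the two totals given above. The sizes are the rounded numbers from the text (8.5 GB before, just under 2 GB after), so the result is approximate; percent_saved is a hypothetical helper.

```shell
# percent_saved BEFORE AFTER: percentage reduction from BEFORE to AFTER.
percent_saved() {
    awk -v before="$1" -v after="$2" \
        'BEGIN { printf "%.1f\n", (before - after) / before * 100 }'
}

percent_saved 8.5 2.0   # the three original machines; prints 76.5
```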

The biggest drawback is that QEMU, even with the KQEMU accelerator, is quite a bit slower than VirtualBox or VMware Server. Fortunately, the performance is more than acceptable for what I'm using it for. Your mileage may vary, of course.