
Recently I was asked to help out when we needed to reduce the size of a directory of rolling backups. And when I say size, I am talking about roughly 12GB of backups a day. This was already a compressed directory, but I found that if we zip up the folders, we get even better compression (to the tune of 60+% more room).

First note: Windows' built-in zipping utility is not so good. It won't even try to compress anything that size; instead it throws an error. Nice.

Basically I started working with two options, #ZipLib (SharpZipLib) or 7-Zip. #ZipLib is written entirely in C#, which lends itself nicely to logging and reporting, but in my experience it may not be the answer for large file sets (NOTE: I benchmarked with #ZipLib's ZipFile, not with FastZip, which might be faster with the same compression results). 7-Zip is a fast utility, but it is an external application, so you have to call it from within your code and wait for the exit code. I realize this already gives 7-Zip an unfair advantage, but I needed the best tool for the job, so the comparison really is about which one you should use when you are working with large file sets.

Note: 7-Zip has a standalone version (7za.exe), and that is what I mean from here on out when I mention 7-Zip. It is on version 4.57.

Both tools compressed a 2GB directory (500MB on a compressed drive) down to a size of 185MB in ZIP format. #ZipLib took nearly an hour, so I didn't let it finish with the other directories. 7-Zip took a little over an hour and a half to compress all 12GB. I am presenting the code for both, so if anyone sees a faster way to use #ZipLib, I am definitely open to suggestions because I can get better logging about what is going on while it is running than I can with 7-Zip.
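For reference, the #ZipLib side of the comparison can be sketched roughly as below. This is a minimal sketch rather than the exact code that was benchmarked: the class and method names (`SharpZipSketch`, `ZipDirectory`, `ZipDirectoryFast`) are made up for illustration, and `Console.WriteLine` stands in for whatever logger you use. It relies only on `ZipFile.Create`, `BeginUpdate`, `Add`, and `CommitUpdate` from `ICSharpCode.SharpZipLib.Zip`, plus `FastZip.CreateZip` as the untested alternative mentioned earlier.

```csharp
using System;
using System.IO;
using ICSharpCode.SharpZipLib.Zip;

// Hypothetical helper class; the names here are illustrative only.
public static class SharpZipSketch
{
    // Zips every file under sourceDirectory into zipPath using ZipFile.
    public static void ZipDirectory(string sourceDirectory, string zipPath)
    {
        string root = Path.GetFullPath(sourceDirectory);

        using (ZipFile zip = ZipFile.Create(zipPath))
        {
            zip.BeginUpdate();

            foreach (string file in Directory.GetFiles(root, "*", SearchOption.AllDirectories))
            {
                // Store each entry under its path relative to the source directory.
                string entryName = file.Substring(root.Length)
                                       .TrimStart(Path.DirectorySeparatorChar);
                zip.Add(file, entryName);

                // Per-file logging is the main attraction of doing this in C#.
                Console.WriteLine("Queued " + entryName);
            }

            // The actual compression work happens when the update is committed.
            zip.CommitUpdate();
        }
    }

    // The FastZip alternative, which was not part of the benchmark.
    public static void ZipDirectoryFast(string sourceDirectory, string zipPath)
    {
        // Arguments: zip file name, source directory, recurse, file filter (null = all files).
        new FastZip().CreateZip(zipPath, sourceDirectory, true, null);
    }
}
```

Swapping the `Console.WriteLine` call for a log4net logger gives a per-file record of what was queued, which is the kind of visibility an external process cannot easily provide.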

With 7-Zip we are calling an external application, so we need a static class that makes the call out to the application and gives us back an integer exit code, which we then convert to one of the 7-Zip exit codes.
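The call described above can be sketched along these lines. Treat it as an illustration under stated assumptions: the names `SevenZipExitCode`, `SevenZipRunner`, and `ZipDirectory` are hypothetical, the path to 7za.exe is passed in by the caller, and the enum values follow the exit codes listed in the 7-Zip command-line documentation (0, 1, 2, 7, 8, 255).

```csharp
using System;
using System.Diagnostics;

// Exit codes as listed in the 7-Zip command-line documentation.
public enum SevenZipExitCode
{
    NoError = 0,
    Warning = 1,          // e.g. some files were locked and skipped
    FatalError = 2,
    CommandLineError = 7,
    NotEnoughMemory = 8,
    UserStopped = 255
}

// Hypothetical wrapper; names are illustrative only.
public static class SevenZipRunner
{
    public static SevenZipExitCode ZipDirectory(
        string sevenZipPath, string sourceDirectory, string zipPath)
    {
        // A trailing backslash would escape the closing quote on the command line.
        string source = sourceDirectory.TrimEnd('\\', '/');

        ProcessStartInfo startInfo = new ProcessStartInfo();
        startInfo.FileName = sevenZipPath;
        // "a" adds files to an archive; -tzip selects the ZIP format.
        startInfo.Arguments = string.Format("a -tzip \"{0}\" \"{1}\"", zipPath, source);
        startInfo.UseShellExecute = false;
        startInfo.CreateNoWindow = true;

        using (Process process = Process.Start(startInfo))
        {
            // Block until 7-Zip finishes, then translate its exit code.
            process.WaitForExit();
            return (SevenZipExitCode)process.ExitCode;
        }
    }
}
```

An exit code outside the documented set still casts without error, so callers that care can check `Enum.IsDefined(typeof(SevenZipExitCode), result)` before logging the result.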

One thing you may have noticed is an interface, IFileSystemAccess. This is my layer between the code and the file system, which keeps the two orthogonal.
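To give a feel for that kind of seam, here is a small sketch. Only the interface name comes from this post; the members and the `DiskFileSystemAccess` implementation are assumptions made up for illustration, not the real definition.

```csharp
using System.IO;

// Hypothetical members: the real interface is not shown here.
public interface IFileSystemAccess
{
    bool DirectoryExists(string path);
    string[] GetFiles(string directory);
    void DeleteDirectory(string path);
}

// Assumed production implementation that simply forwards to System.IO.
public sealed class DiskFileSystemAccess : IFileSystemAccess
{
    public bool DirectoryExists(string path)
    {
        return Directory.Exists(path);
    }

    public string[] GetFiles(string directory)
    {
        // Recurse so nested backup folders are included.
        return Directory.GetFiles(directory, "*", SearchOption.AllDirectories);
    }

    public void DeleteDirectory(string path)
    {
        // true = delete contents recursively.
        Directory.Delete(path, true);
    }
}
```

With the zipping code depending on the interface instead of `System.IO` directly, a test can hand it an in-memory fake and never touch the disk.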

This will quickly introduce zipping into your application with logging (using one of my biggest loves: log4net). I am serious about feedback for #ZipLib. If you have found FastZip to be extremely fast, I would love to hear about it (and may explore it then).