Tuesday, April 5, 2011

(Note : a new page is dedicated to benchmark results, with updated graph)
In the previous note regarding CompressionRatings results, a new method to compare compression programs has been suggested.
The main idea is that the relative "ranking" of a compressor may depend on the "speed" we want it to perform at. This representation is more complex, since a compressor is no longer purely "better" or "worse" than another one, as a simple ladder system.

With the ranking now being also dependent on the target speed, we need something better : a 2D graph has been proposed to visualize the relative performance of several compression programs. Although the proposed scenario and the figures it produces are successful in producing a winner for all speeds, the unfortunate side effect is that the differences tend to become visually small with slower speeds, since the timings to be represented become extremely large.

Therefore, i've been trying to produce a new ranking system, visually clearer, based on the same input and the same methodology.
We keep the initial scenario (compress a file, send it over a bandwidth limited I/O, and then decode the file). Total timings are calculated and compared.
The main new idea is that, whatever the selected I/O speed, the ranking always ends up with a value in a fixed range (say between 0 and 100), "fastest" alternative constantly being represented at the top of the chart.
All other compressors follow, at a relative distance which depends on time difference, effectively magnifying differences between compressors for better clarity, but keeping the ranking intact.

Below 6 MB/s, stronger compression alternatives win the competition, with 4x4 (-3) briefly leading at the 3-5 MB/s range, then losing it to Nanozip (-cd) and FreeArc (-m2) which are roughly equivalent.

A new contender also gets into the fray, with QuickLZ effectively re-benched using a more favorable strategy. With these new figures, it takes the lead in term of raw speed. However, remember that at such fast speeds, RAM Drive bandwidth is saturated. Therefore it tells more about the I/O code than raw compression performance, which is supposed to be the objective of this comparison.

In order to partly compensate I/O saturation effect, we can use another metric provided by the same benchmark, which is the CPU occupation ratio. When I/O is saturated, this value can be used to extrapolate what it could have been if there was no saturation. In order to not give an unfair advantage to this strategy, the extrapolation only ramps up to 350% (instead of 400% for a perfect occupation ratio).

It produces the following output :

Click on graph for enlarged display

Well, this is mostly the same graph as the previous one, with just LZ4 and QuickLZ swapped.
This is expected, since only 3 programs do qualify as saturating the I/O Ramdrive (QuickLZ, Zhuff and LZ4). The new lead from LZ4 comes from its extremely fast decoding speed which is between 2 to 3 times faster than QuickLZ. This can be effectively represented thanks to CPU occupation extrapolation, whereas I/O saturation does virtually erase this performance difference.

That being said, it is also worth observing that Qpress, the I/O code for QuickLZ, is making a very good job at extracting more bandwidth from the saturated RamDrive. Although arguably not strictly a compression related code, it is nonetheless a good inspiration for future improvements.

[Update] : a new benchmark page is now proposed on this blog, with an updated and improved graph.