Introduction

It seems there is no end to the porting of good C/C++ algorithms to .NET, and so here comes another one. This algorithm comes courtesy of Lasse Mikkel Reinhold, and is called QuickLZ.

You can read more about QuickLZ at its website; however, I will outline some of its aspects as I go.

What IS QuickLZ?

It comes as no surprise if you haven't heard of QuickLZ: it is a new compression algorithm particularly well suited to pseudo-streaming (packet compression). It was originally released in early November 2006 and updated to 1.10 later that month. This port is based on 1.10, which as of the writing of this article boasts an incredible 20-30% improvement over other well known algorithms such as LZO1X and LZF. Complete benchmark information on the C implementation can be found at the website linked above, but here is an overall assessment:

                       QuickLZ 1.10   LZO 1X 2.02   LZP-1   LZF 1.6   ZLIB-1 1.2.3
Compressed Size        61.0%          60.2%         59.9%   60.7%     46.4%
Compression MB/Sec     148            81.8          60.4    60.9      7.45
Decompression MB/Sec   380            307           89.3    198       120

If you are reading this article, there is a chance you also read my article containing a port of MiniLZO. Based on the numbers here, it's easy to see why LZO was my previous choice. However, QuickLZ now decompresses 19.2% faster on average than LZO ((380 - 307) / 380) and compresses an astonishing 44.7% faster ((148 - 81.8) / 148), while being only about 0.8 percentage points less efficient in final compressed size (61.0% vs. 60.2%). This large improvement made QuickLZ a prime candidate for porting to .NET, and while it doesn't totally replace my MiniLZO port, it does give a better alternative for those who don't need the LZO format.

Things To Remember

While I was porting the code, which was incredibly clear and easy to port, I noticed a number of areas for improvement. This specific iteration of the port is a first revision and is based on the optimizations for a better ratio when working with data larger than 100 KB. A more refined and "minimalistic" version of the algorithm is possible, and may come as a complementary article specifically regarding packet compression. Meanwhile, this class serves as a general-use class for both files and packets that should outperform other managed options. I have not profiled this C# port whatsoever, and I know that, for the sake of a simple public interface, there is some additional overhead. If you must make the unsafe versions public and use them, make sure you understand the code before you cry to me about it not working :)

A comment made during the port of MiniLZO noted that the original uncompressed size was not easy to obtain from the compressed data; prepending the size was requested, but doing so would ultimately have broken compatibility with the original LZO1X implementation. Because the standard QuickLZ format already carries this information, I have provided a public method to find the original uncompressed size quickly and painlessly.

Without Further Ado...

So let's look at how to use the safe(?) interface to the implementation. All source examples are pseudocode and assume knowledge of basic C# operations.
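
A basic compression call looks like the following (a pseudo example; the file name is only a placeholder):

byte[] data = File.ReadAllBytes("example.bin");
byte[] compressed = QuickLZ.Compress(data, 0, data.Length);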

This little snippet shows the basic idea of obtaining some data, in this case loading it from a file, and passing it into the compression method, which takes a Byte[] to read from, a starting offset within that buffer, and the length from the starting point to include in compression. This is useful if you need to work with a packet, where packet headers may not be included in the compression/decompression routines. The return value is an exact-size Byte[] stripped of the extra scratch space allocated during compression. Note that QuickLZ requires the destination buffer during compression to be the length of the original data plus an additional 36000 bytes of temporary workspace; this implementation copies the temporary destination buffer into an exact-size buffer for the final return.
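
The trim step at the end presumably amounts to something like this (a hypothetical helper for illustration, not the port's actual code):

static byte[] TrimToExactSize(byte[] scratch, int compressedSize)
{
    // Keep only the bytes compression actually wrote, discarding the
    // 36000-byte worst-case workspace at the tail of the scratch buffer.
    byte[] exact = new byte[compressedSize];
    Buffer.BlockCopy(scratch, 0, exact, 0, compressedSize);
    return exact;
}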

UInt32 size = QuickLZ.GetDecompressedSize(compressed, 0);

This snippet shows how you can manually obtain the original uncompressed size directly from the compressed Byte[]. The second argument, again, reflects where the actual compressed data headers begin. This method is never strictly needed, but it is provided for completeness.

byte[] decompressed = QuickLZ.Decompress(compressed, 0);

This last snippet is fairly self-explanatory by this point. It takes a Byte[] containing the compressed data, and once again a starting offset. Like Compress, it returns an exact-size buffer; however, it does not incur the added overhead of copying from a temporary destination, because the exact uncompressed size is known ahead of time. That is also why you never need to call GetDecompressedSize yourself.
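
Putting it all together, a round trip might look like this (again a pseudo example in the style of the snippets above):

byte[] original = File.ReadAllBytes("example.bin");
byte[] compressed = QuickLZ.Compress(original, 0, original.Length);
byte[] restored = QuickLZ.Decompress(compressed, 0);
// The restored buffer should match the original byte-for-byte.
System.Diagnostics.Debug.Assert(restored.Length == original.Length);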

Conclusion

QuickLZ is a new alternative on the compression scene, and it currently seems to be the fastest within this category of compression. I hope you will find this port useful, and as always, if you find any bugs or bottlenecks that degrade performance, please let me know.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.

Well, I'm not sure if the original author has changed his own license; otherwise I suppose it falls under the same license as the C code. Personally, I abandoned this code long ago, when the original C author opted to produce a .NET port of his code himself. I would recommend looking into his implementation.
However, if the original C author is fine with it, I'd be fine with this being public domain for the sake of benefiting the .NET community, if not directly then as a base for adjusting the algorithm to individual needs.

1) The table at the beginning of the article, showing QuickLZ to be twice as fast as LZF, is misleading.
My benchmarks of compressing/decompressing with QuickLZ and LZF prove the author wrong.
I'm using the C# versions (managed only) of these two algorithms.

This is in contrast to the author's claims that QuickLZ is twice as fast as LZF and better at compressing!
To me, LZF in its default C# port seems almost identical in speed and much better in compression ratio.

NOTE: After some optimizations, I have made LZF do it in 156 ms, compressing to ~2.00 MB.
This means it's faster than QuickLZ.

2) If you have licensing issues with the GPL, the LZF algorithm again proves to be a better alternative.
By the way, you can get it from here: http://dist.schmorp.de/liblzf/

It is dual-licensed under BSD and GPL (you decide), so if you choose BSD you can use it in binary form without open-sourcing your project.

First of all, the table in this article compares the C versions, not the C# versions. Secondly, your benchmark may be correct for the 1.10 port in this article (ported by a third party); however, QuickLZ 1.40 has also been ported (by the original author) and performs much better.

I've run a small benchmark on an Athlon64 6400+ using the files on the QuickLZ website. Get the source at www.quicklz.com/lzfqlz.zip and try for yourself with more files.

As I mentioned, I have modified the stock example and it produces better results.
Why do you keep posting results that use the original algorithm?
That is clearly slower and less effective than your algorithm.

Let me give you a hint: rewrite your algorithm in C++/CLI, because it can be much faster there.
Then use it from C#. You'll probably get better results too.

You just wrote "So I can't back your claim that QLZ outcompetes LZF, as it has a worse compression ratio.".

Assuming that the official LZF C# version and your CLI version produce the same output, let's agree that the claim was clearly wrong. Any compression algorithm has anomalies like your .doc file. Stop comparing compression ratios based on a single file.

As for speed, I won't argue against your CLI version performing better than the official LZF C# version (although it is still slower than QLZ C#). I even have an old assembler version of 1.00 that outperforms the C version of 1.40 in speed...

It all depends on what you're compressing. For example, a video file compresses poorly because it's already compressed. If you're only compressing .doc files, then sure, why not go with LZF? But if you're compressing text files (or anything text-encoded, such as .html and .xml files), which I believe is what the first test used, then you'd want to go with QuickLZ.

I tried this class on a stream of random bytes. I started with 3 KB and tested random sizes all the way up to 10 MB, and surprisingly, not on a single occasion did the size decrease.

It seems to work better at higher sizes: from 3 KB to 40 KB it was almost doubling the size after "compressing" the data, but somewhere close to 10 MB it would only add a hundred bytes or so. In all cases it never compressed.

In all fairness, high-entropy random bytes generally cannot be compressed. However, if you feel your data should be compressible, then please refer to www.quicklz.com and see whether the original C code does any better. If you get the same final results, it's safe to assume that QuickLZ is not the algorithm for you. Note that the algorithm QuickLZ uses can bloat high-entropy data beyond its normal size, because it has to keep dictionary and header information.
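
As a quick sanity check, you can compare a high-entropy buffer against a trivially compressible one (a pseudo example using this port's Compress method; the exact sizes printed will vary):

byte[] random = new byte[65536];
new Random().NextBytes(random);        // high entropy: expect slight expansion
byte[] zeros = new byte[65536];        // all zeros: should shrink dramatically
Console.WriteLine(QuickLZ.Compress(random, 0, random.Length).Length);
Console.WriteLine(QuickLZ.Compress(zeros, 0, zeros.Length).Length);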

If you have suggestions, please talk to the original author, who may be able to assist in improving the algorithm for your needs. Generally speaking, the algorithm works quite well and quickly, but as with most compression routines (ZIP included), high-entropy data (random, previously compressed, etc.) will always have a very low compression rate, if it doesn't grow outright due to overhead in the compression algorithm itself. On a side note, you might try my MiniLZO C# port to see if it suits your needs any better.

I'd also like to note that I'm not really supporting this project anymore, as someone else has been working on a port as well and was achieving results similar to my unsafe code with his fully managed solution.

I happened to run across the QuickLZ algorithm about a week ago. I have a bad habit of rewriting somewhat obscure code in C# (good practice?), so I spent a few hours writing up a static class. Today a friend pointed me to your article, so I thought I would share my code. My version is a straightforward rewrite of quick.c, but it does not use unsafe methods or pointers, so it could be used in a trusted environment.

The main thing holding back the decompression method is the memcpy_up method. I tried using Buffer.BlockCopy and Array.Copy, but they really bork things up. The actual method call to one of those may be more costly than a short byte-copy loop anyway. Most of the copies seemed to be 3 or 4 bytes long, so I wrote unrolled versions for those cases, which helped tremendously. I did the same for the FastRead and FastWrite methods, as the JIT wasn't really optimizing away the switch statements; that turned out to be a negligible performance gain.
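
For anyone wondering why the framework copy methods bork things up here: in LZ-style decompression the match source often overlaps the destination, so bytes must be copied one at a time from low to high. A minimal sketch of such a copy (the name and shape are my assumption, not the commenter's actual code):

static void MemcpyUp(byte[] buffer, int dst, int src, int len)
{
    // When src and dst overlap (e.g. repeating "ab" to build "ababab..."),
    // a byte written by one iteration may be read by a later one.
    // Buffer.BlockCopy and Array.Copy behave as if copying through a
    // temporary buffer, which breaks this run-extension behavior.
    for (int i = 0; i < len; i++)
        buffer[dst + i] = buffer[src + i];
}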

It is almost sad to think that I wasted 3 more hours of my life on code that I will never use. It is even sadder that I would never be able to use it because of the GPL license. Sad sad.

Anyways, take a look. I would love to hear your thoughts and suggestions. I haven't taken the time to do any benchmarks (just basic profiling), but I would love to see how it stacks up against the unsafe code.