LZOP

Note: LZO was removed from Hadoop in version 0.20 and later, because the LZO libraries are licensed under the GNU General Public License (GPL). The format is still supported, so if you need LZO you can download the codec separately and enable it manually in your Hadoop cluster.

Lzop is a file compression utility very similar to gzip. It uses the LZO
algorithm, which is optimized for speed and does not compress as much as
gzip or bzip2. This can be either an advantage or a disadvantage, depending
on whether you want speed or space.
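
The speed-versus-space trade-off can be measured directly. The following is a minimal sketch, not a benchmark of LZO itself: LZO is not part of the Python standard library, so zlib at its fastest level and bz2 at its highest level are used here as stand-ins for a "fast" and a "small" codec. The sample data and the function names are assumptions made for illustration.

```python
import bz2
import time
import zlib


def measure(name, compress, data):
    """Compress data once and report the size ratio and elapsed time."""
    start = time.perf_counter()
    packed = compress(data)
    elapsed = time.perf_counter() - start
    return {"codec": name, "ratio": len(packed) / len(data), "seconds": elapsed}


def compare(data):
    """Run a speed-oriented and a size-oriented codec on the same input."""
    return [
        # Stand-in for a fast codec: zlib at its lowest compression level.
        measure("zlib level 1 (fast)", lambda d: zlib.compress(d, 1), data),
        # Stand-in for a space-saving codec: bz2 at its highest level.
        measure("bz2 level 9 (small)", lambda d: bz2.compress(d, 9), data),
    ]


if __name__ == "__main__":
    # Hypothetical log-like sample; real data will behave differently.
    sample = b"2013-01-01 12:00:00 INFO request served in 12 ms\n" * 20000
    for row in compare(sample):
        print(f"{row['codec']:<22} ratio={row['ratio']:.4f} time={row['seconds']:.4f}s")
```

Running the comparison on a sample of your own data is the most reliable way to decide whether speed or space matters more for a given workload.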

LZOP is open source software. It is copyrighted and is distributed under
the terms of the GNU General Public License (GPL).

Supported platform

Apache Hadoop distribution only.

Depending on your version of Hadoop, you might need to download the LZOP
codec separately and enable it manually in your Hadoop cluster.
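
As a rough sketch of what enabling the codec manually can involve, the following script generates the property entries that are commonly added to core-site.xml. The codec class names and the io.compression.codec.lzo.class property are assumptions taken from the separately distributed hadoop-lzo project, not from this document, and the list of existing codecs is hypothetical. Check the documentation of the codec package you download, which typically also requires the native LZO library on every node.

```python
import xml.etree.ElementTree as ET

# Assumed class names from the separately distributed hadoop-lzo project.
LZO_CODECS = [
    "com.hadoop.compression.lzo.LzoCodec",
    "com.hadoop.compression.lzo.LzopCodec",
]


def add_property(parent, name, value):
    """Append a Hadoop-style <property> element with a name and a value."""
    prop = ET.SubElement(parent, "property")
    ET.SubElement(prop, "name").text = name
    ET.SubElement(prop, "value").text = value
    return prop


def build_lzo_properties(existing_codecs):
    """Build a <configuration> fragment that registers the LZO codecs."""
    config = ET.Element("configuration")
    codecs = list(existing_codecs) + LZO_CODECS
    add_property(config, "io.compression.codecs", ",".join(codecs))
    # Assumed property name used by hadoop-lzo to locate the LZO codec class.
    add_property(config, "io.compression.codec.lzo.class", LZO_CODECS[0])
    return config


if __name__ == "__main__":
    # Hypothetical list of codecs already configured on the cluster.
    root = build_lzo_properties([
        "org.apache.hadoop.io.compress.DefaultCodec",
        "org.apache.hadoop.io.compress.GzipCodec",
    ])
    if hasattr(ET, "indent"):  # ET.indent is available in Python 3.9 and later
        ET.indent(root)
    print(ET.tostring(root, encoding="unicode"))
```

The printed fragment is meant to be merged by hand into the cluster's existing configuration file rather than used to replace it.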

LZ4

LZ4 is a very fast lossless compression algorithm.

If you are using Hadoop on Windows, you can also use the LZ4 compression
algorithm through either a command-line utility or a small Windows
application. In our tests we used the command-line tool, and we recommend
it over the standalone application.
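
For scripted use, the command-line tool can be driven from Python. This is a minimal sketch that assumes the reference lz4 executable is installed and on the PATH; the file names are hypothetical. It relies only on the -d (decompress) and -f (overwrite the target) options of that tool.

```python
import shutil
import subprocess


def lz4_compress_command(source, target, exe="lz4"):
    """Return the argument list that compresses source into target."""
    # -f overwrites an existing target file without prompting.
    return [exe, "-f", source, target]


def lz4_decompress_command(source, target, exe="lz4"):
    """Return the argument list that decompresses source into target."""
    # -d selects decompression.
    return [exe, "-d", "-f", source, target]


def run(command):
    """Run a command built above, failing early if the tool is missing."""
    if shutil.which(command[0]) is None:
        raise FileNotFoundError(f"{command[0]} was not found on PATH")
    subprocess.run(command, check=True)


if __name__ == "__main__":
    # Hypothetical file names; the commands are printed, not executed.
    print(" ".join(lz4_compress_command("events.log", "events.log.lz4")))
    print(" ".join(lz4_decompress_command("events.log.lz4", "events.restored.log")))
```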

7-Zip

7-Zip is a powerful compression tool that we highly recommend for use as a
local compression processor. It can make full use of the processing
resources of the local computer, and you can configure the number of
threads that are used for file compression.
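
The thread count can be set from the command line. The sketch below builds a 7-Zip command that uses the a (add to archive) command and the -mmt switch to set the number of threads. It assumes the command-line executable is named 7z and is on the PATH; the archive and file names are hypothetical.

```python
import os
import shutil
import subprocess


def build_7z_command(archive, inputs, threads=None, exe="7z"):
    """Return the argument list that adds inputs to archive with N threads."""
    if threads is None:
        # Default to one thread per logical CPU reported by the OS.
        threads = os.cpu_count() or 1
    if threads < 1:
        raise ValueError("threads must be at least 1")
    # "a" adds files to an archive; -mmt=N sets the number of threads.
    return [exe, "a", f"-mmt={threads}", archive, *inputs]


def compress(archive, inputs, threads=None):
    """Run 7-Zip with the requested thread count."""
    exe = shutil.which("7z")
    if exe is None:
        raise FileNotFoundError("7z was not found on PATH")
    subprocess.run(build_7z_command(archive, inputs, threads, exe), check=True)


if __name__ == "__main__":
    # Hypothetical file names; the command is printed, not executed.
    print(" ".join(build_7z_command("logs.7z", ["app.log", "web.log"], threads=4)))
```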

7-Zip provides several interfaces for working with its compression tools:

· Command-line interface

· Graphical interface

· Microsoft Windows shell integration

7-Zip has its own documentation, and it is easy to learn how to use the
shell to process files. To view the 7-Zip documentation, open the directory
where you installed 7-Zip and look for the help documentation (typically
provided as a standalone document or text file).