Programmatically Compress and Decompress Files

This lesson is very easy. This lesson focuses on how to compress and decompress files programmatically using .NET Framework and C# -or any language that supports .NET of course.

Currently .NET Framework supports two types of compression algorithms:

Deflate
This is a very simple algorithm, used for simple compression and decompression operations. Don’t use this algorithm except if you intend to use compressed data only in your application because this algorithm is not compatible with any compression tool.

GZIP
This algorithm uses internally the Deflate algorithm. This algorithm includes a cyclic redundancy check value for detecting data corruption. Also it’s compatible with most compression tools because it writes headers to the compressed file, so compression tools -like WinZip and WinRAR- can easily access the compressed file and decompress it as well. This algorithm also can be extended to use other algorithms internally other than Deflate.

Fortunately, whether using Deflate or GZIP in .NET, code is the same; the only thing that needs to change is the class name.

Deflate and GZIP can be found in the namespace System.IO.Compression that resides on assembly System.dll. This namespace contains only three types, the two algorithms implementation classes DeflateStream and GZipStream -both inherit directly from System.IO.Stream class-, and an enumeration CompressionMode that defines the operation (compression or decompression).

Compressing a file

The code for compression is very simple:

Code snippets in this lesson rely on the fact that you added two using (Imports in VB) statements, System.IO and System.IO.Compression.

Code explanation:
If you are going to compress a file then you must specify the CompressionMode.Compress option, and also you must specify a Stream that will the data be written to.
After creating the class you can compress the data by using its Write() method.

If you intended to use the Deflate algorithm, just change the class name to DeflateStream.

' VB.NET Code
Dim compressedFile As String = "D:CompressedGreatDocument.zip"
Dim originalFileName As String = "D:My Great Word Document.doc"
Dim zipFile As New FileStream(compressedFile, FileMode.Open, FileAccess.Read)
Dim originalFile As New FileStream(originalFileName, FileMode.Create, FileAccess.Write)
Dim alg As New GZipStream(zipFile, CompressionMode.Decompress)
While (True)
' Reading 100bytes by 100bytes
Dim buffer(100) As Byte
' The Read() method returns the number of bytes read
Dim bytesRead As Integer = alg.Read(buffer, 0, buffer.Length)
originalFile.Write(buffer, 0, bytesRead)
If (bytesRead buffer.Length) Then
Exit While
End If
End While

Code explanation:

First, we create a file stream to read the ZIP file, and then we created our algorithm class that encapsulates the file stream for decompressing it.

Then we created a file stream for the original file, for writing the decompressed data.
After that, we read the data compressed 100bytes after each other -you may prefer more- and write this data to the original file stream.

By encapsulating disposable objects into using statements you become assured that every object will be disposed in a certain point -end of the using statement- and no object will retain in memory.

Try it yourself

Develop a simple application that can be used to compress and decompress files.

This application can compress several files into one file and decompress them as well.

unfortunately, RAR itself is not a comression algorithm, at least winrar uses gzip, bz2, tar, etc. to increase comression ratio. If you mean that, you are right, 7-Zip and bz2 can be so powerful as rar itself.

unfortunately, RAR itself is not a comression algorithm, at least winrar uses gzip, bz2, tar, etc. to increase comression ratio. If you mean that, you are right, 7-Zip and bz2 can be so powerful as rar itself.