Tuesday, June 3, 2014

The ZIP format is one of the most popular compression mechanisms in the computer world. A Zip file may contain multiple files or folders in compressed form. The Java API provides extensive support for reading Zip files; all classes related to zip file processing are located in the java.util.zip package. One of the most common tasks with a zip archive is to read it, display the entries it contains, and then extract them into a folder. In this tutorial we will learn how to do this in Java.

There are two ways to iterate over all items in a given zip archive: you can use either java.util.zip.ZipFile or java.util.zip.ZipInputStream. A Zip file contains several items, and each of them has a header that records, among other things, the size of the item in bytes, which means you can iterate over all entries without actually decompressing their data. The ZipFile class accepts a java.io.File or a String file name and opens the Zip file for reading; the UTF-8 charset is used to decode entry names and comments. The main benefit of ZipFile over ZipInputStream is that it uses random access to move between entries, while ZipInputStream is sequential: because it works on a stream, it cannot move position freely, and it has to read through the data of each entry to reach the end of that entry before it can read the header of the next one. That's why it's better to use ZipFile rather than ZipInputStream for iterating over all entries of an archive.

We will learn more about how to read a Zip file in Java by following an example. By the way, the code should work with zip files created by any utility, e.g. WinZip, WinRAR or any other tool. The .ZIP format permits multiple compression algorithms; java.util.zip handles the common DEFLATED and STORED methods. I have tested it with WinZip on Windows 8.
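As a rough illustration of iterating with ZipFile, here is a minimal sketch that lists entry names and sizes from the archive's directory without reading any entry data. The class and method names (ZipEntryLister, listEntries) are made up for this sketch, and the default path is only the example location used later in this post.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

class ZipEntryLister {

    // Returns one "name (size bytes)" line per entry. Only entry metadata is
    // read here; no entry data is decompressed.
    static List<String> listEntries(String zipPath) throws IOException {
        List<String> lines = new ArrayList<>();
        try (ZipFile zipFile = new ZipFile(zipPath)) {
            Enumeration<? extends ZipEntry> entries = zipFile.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                lines.add(entry.getName() + " (" + entry.getSize() + " bytes)");
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // Example path only; pass your own archive as the first argument.
        String path = args.length > 0 ? args[0] : "C:\\temp\\pics.zip";
        for (String line : listEntries(path)) {
            System.out.println(line);
        }
    }
}
```

The try-with-resources block closes the archive even if iteration fails, which assumes Java 7 or later.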

Reading Zip archive in Java

In this example, I have used the ZipFile class to iterate over each file in the Zip archive. The entries() method of ZipFile returns all entries (getEntry() returns a single entry by name), and each entry carries its metadata, including name, size and last-modified date and time. You can ask ZipFile for the InputStream corresponding to an entry in order to extract the real data, which means you only incur the cost of decompression when you really need to. By using java.util.zip.ZipFile, you can check each entry and extract only certain entries, depending upon your logic. ZipFile is good for both sequential and random access to individual file entries. On the other hand, if you are using ZipInputStream then, like any other InputStream, you will need to process all entries sequentially, as shown in the second example. A key point to remember, especially if you are processing large zip archives, is that Java 6 only supports zip files up to 2GB. Thankfully, Java 7 supports ZIP64 mode, which can be used to process zip files larger than 2GB.
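To make the two access styles concrete, here is a minimal sketch, not the original listing from this post. The class ZipAccessStyles and its method names are hypothetical: readEntry() decompresses a single named entry through ZipFile, while namesInStreamOrder() walks the archive sequentially with ZipInputStream.

```java
import java.io.ByteArrayOutputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipInputStream;

class ZipAccessStyles {

    // Random access: look up one entry by name and decompress only that entry.
    // Returns null when the archive has no entry with that name.
    static byte[] readEntry(String zipPath, String entryName) throws IOException {
        try (ZipFile zipFile = new ZipFile(zipPath)) {
            ZipEntry entry = zipFile.getEntry(entryName);
            if (entry == null) {
                return null;
            }
            try (InputStream in = zipFile.getInputStream(entry)) {
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                byte[] buffer = new byte[8192];
                int read;
                while ((read = in.read(buffer)) != -1) {
                    out.write(buffer, 0, read);
                }
                return out.toByteArray();
            }
        }
    }

    // Sequential access: visit entries one after another, in the order they
    // were written to the archive.
    static List<String> namesInStreamOrder(String zipPath) throws IOException {
        List<String> names = new ArrayList<>();
        try (ZipInputStream zis = new ZipInputStream(new FileInputStream(zipPath))) {
            ZipEntry entry;
            while ((entry = zis.getNextEntry()) != null) {
                names.add(entry.getName());
                zis.closeEntry();
            }
        }
        return names;
    }

    public static void main(String[] args) throws IOException {
        // Example path only; pass your own archive as the first argument.
        String path = args.length > 0 ? args[0] : "C:\\temp\\pics.zip";
        System.out.println(namesInStreamOrder(path));
    }
}
```

Note that readEntry() holds the whole entry in memory, which is fine for small files; for large entries you would copy the stream straight to its destination instead.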

In order to run this program, make sure you have a zip file named pics.zip in C:\temp and that the output directory C:\temp\Images exists; otherwise it will throw java.lang.NullPointerException. After a successful run, you can see the contents of the zip file extracted inside the output directory. By the way, as an exercise, you can enhance this program to get the name of the zip file from the user and create an output directory of the same name.
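For reference, an extraction step along these lines could look like the sketch below. It is an assumption-laden stand-in rather than the program described above: the class ZipExtractor and method extractAll are hypothetical names, it creates the output directory if it is missing instead of requiring it, and it adds a path check that the text does not mention.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.Enumeration;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

class ZipExtractor {

    // Extracts every entry of zipPath under outputDir and returns the number
    // of regular files written.
    static int extractAll(String zipPath, String outputDir) throws IOException {
        Path targetRoot = Paths.get(outputDir).toAbsolutePath().normalize();
        Files.createDirectories(targetRoot);
        int filesWritten = 0;
        try (ZipFile zipFile = new ZipFile(zipPath)) {
            Enumeration<? extends ZipEntry> entries = zipFile.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                Path target = targetRoot.resolve(entry.getName()).normalize();
                // Added safeguard (not part of the original text): refuse
                // entries such as "../x" that would land outside outputDir.
                if (!target.startsWith(targetRoot)) {
                    throw new IOException("Entry outside target directory: " + entry.getName());
                }
                if (entry.isDirectory()) {
                    Files.createDirectories(target);
                    continue;
                }
                Files.createDirectories(target.getParent());
                // Data is decompressed only here, while copying to disk.
                try (InputStream in = zipFile.getInputStream(entry)) {
                    Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
                }
                filesWritten++;
            }
        }
        return filesWritten;
    }

    public static void main(String[] args) throws IOException {
        // Paths taken from the text above; adjust them for your machine.
        int count = extractAll("C:\\temp\\pics.zip", "C:\\temp\\Images");
        System.out.println("Extracted " + count + " files");
    }
}
```

Because Files.copy streams each entry straight to disk, memory use stays small regardless of entry size.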

That's all about how to read a Zip file in Java. We have seen two different approaches to iterating over the file entries in a Zip file and retrieving them. You should prefer ZipFile over ZipInputStream for iterating over each file in an archive. It's also good to know that the java.util.zip package supports the GZIP file format as well, which means you can read .gz files generated by the gzip command in UNIX from your Java program.
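As a small illustration of the GZIP support, the sketch below reads a .gz file through java.util.zip.GZIPInputStream. The class name GzipTextReader and the method readGzipAsText are hypothetical, and the sketch assumes the compressed content is UTF-8 text.

```java
import java.io.ByteArrayOutputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;

class GzipTextReader {

    // Decompresses a .gz file and returns its content as a String.
    // Assumption: the compressed data is UTF-8 encoded text.
    static String readGzipAsText(String gzPath) throws IOException {
        try (InputStream in = new GZIPInputStream(new FileInputStream(gzPath))) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[8192];
            int read;
            while ((read = in.read(buffer)) != -1) {
                out.write(buffer, 0, read);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.err.println("usage: java GzipTextReader <file.gz>");
            return;
        }
        System.out.print(readGzipAsText(args[0]));
    }
}
```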

@Of al..., yes, you can compress more than one text file using this API. It allows you to create a ZIP file from a directory, which may contain sub-directories. You can even use WinZip, WinRAR and 7-Zip to open Zip files created in Java. In UNIX, you can use the zip command on its own, or the unzip command to extract them.
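A minimal sketch of zipping a directory, including its sub-directories, might look like this. The class ZipDirectoryWriter and method zipDirectory are hypothetical names, and the use of Files.walk assumes Java 8 or later.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

class ZipDirectoryWriter {

    // Adds every regular file under sourceDir to a new archive at zipPath and
    // returns the number of files added. Keep zipPath outside sourceDir,
    // otherwise the archive would try to include itself.
    static int zipDirectory(Path sourceDir, Path zipPath) throws IOException {
        int added = 0;
        try (ZipOutputStream zos = new ZipOutputStream(Files.newOutputStream(zipPath));
             Stream<Path> paths = Files.walk(sourceDir)) {
            for (Path path : (Iterable<Path>) paths::iterator) {
                if (!Files.isRegularFile(path)) {
                    continue;
                }
                // Zip entry names use '/' as separator on every platform.
                String name = sourceDir.relativize(path).toString().replace('\\', '/');
                zos.putNextEntry(new ZipEntry(name));
                Files.copy(path, zos); // streams the file, no full in-memory copy
                zos.closeEntry();
                added++;
            }
        }
        return added;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.err.println("usage: java ZipDirectoryWriter <sourceDir> <target.zip>");
            return;
        }
        int count = zipDirectory(Paths.get(args[0]), Paths.get(args[1]));
        System.out.println("Added " + count + " files");
    }
}
```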

I have an issue here. I was trying to create a big zip file to back up lots of image and video files using Java, but I keep getting java.lang.OutOfMemoryError. I tried increasing the heap space but it didn't help. Any ideas?