unzip

As much as we all love Linux, it is
nevertheless true that occasionally we must force ourselves to deal
with the DOS/MS-Windows world, however indirectly. For some of us
that involves having a dual-boot system (perhaps via LILO—the
LInux LOader—or OS/2's Boot Manager), but even those of us who
manage to avoid that fate will sooner or later come across files
that originated on some flavor of DOS or Windows system. More than
likely, a few of those files will end in
.zip—and that's where the
unzip command comes in.

unzip is a free utility to process
zipfiles, as these things are generally
called. Zipfiles are actually archives of one or more other files,
almost always compressed to save disk space and/or transmission
time. In this regard they are similar to compressed
tar archives, which are those
files usually ending in .tar.Z,
.tar.gz or .tgz that one
finds on most Linux FTP sites and many CD-ROM distributions. There is
one major difference between zipfiles and tar archives: compressed tar
archives bundle all of the files together and then compress the
result as a single entity, whereas zipfiles compress each file
individually and then store it in the archive. The zipfile method
doesn't achieve quite as much overall compression, but it does
allow you to list the archive's contents and to extract individual
files without decompressing the whole mess.

Listing

How does one actually use unzip to list an archive's
contents? The simplest way is with the -l option
(for “list”):
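
A minimal sketch of the invocation, assuming the archive is the
quake92p.zip used in the extraction examples. The existence check is
only there so the snippet does nothing when no such file is present,
and the output is left out because it depends on the archive:

```shell
# quake92p.zip is the archive name used in the extraction examples.
archive=quake92p.zip

# -l prints each member's uncompressed size, modification date and
# time, and name, without extracting anything.
if [ -f "$archive" ]; then
    unzip -l "$archive"
fi
```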

The listing shows each file's name (on the right), its uncompressed
size, and the date and time of its last modification. For many of
us, however, especially those long steeped in the terse intricacies
of ls, this is a little
too short and sweet. For fans of ls, or for
anyone wishing to know more about the details of the archive, unzip
has an entire mode devoted to listing both useful and obscure
zipfile information: zipinfo mode, triggered via the
-Z option. (On some systems the
zipinfo command exists as a link
to unzip and is synonymous with
unzip -Z, but this is not true of Slackware
distributions as of this writing.) We'll limit ourselves to a
description of the default zipinfo listing format:
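
As a sketch, and again assuming the quake92p.zip archive from the
extraction examples, the default zipinfo listing is requested like
this (the output is omitted, since it depends on the archive):

```shell
archive=quake92p.zip

# -Z switches unzip into zipinfo mode; with no further options it
# prints the default listing format, one line per member.
if [ -f "$archive" ]; then
    unzip -Z "$archive"
fi
```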

You will immediately recognize a certain resemblance to the output
of ls -l. The header line gives the archive
name, its total size, and the total number of files in it; the
trailer gives the number of files listed (in this case all of
them), the total uncompressed and compressed data size of the
listed files (not counting internal zipfile headers), and the
compression ratio. Here the ratio is quite poor, mostly because
the largest file (QUAKE92P.1) is stored without any
compression. In the leftmost column are the file permissions. The
next column indicates the version of the archiver, and the one
after that is what tells us the files came from the FAT (DOS) file
system. Next are the uncompressed file size and a column indicating
which files are most likely to be
binary and which are probably
text. The next three columns give
the compression method used on each file, the time stamp, and the
full file name.

Extracting

Now that we know what files we have, how do we actually get
the files out? File extraction is as simple as typing
unzip and the file name:
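
A minimal sketch of that command, using the quake92p archive name
from this article. The existence check is an addition for this
sketch, so that nothing happens when the archive is missing:

```shell
# Archive name without its .zip suffix.
name=quake92p

# With no member names given, every file in the archive is extracted
# into the current directory.
if [ -f "$name.zip" ]; then
    unzip "$name"
fi
```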

Here we've omitted the .zip suffix; unzip
first looks for the file quake92p and, not
finding it, checks for quake92p.zip instead.
What if we wanted only the README.TXT file? No problem. Anything
(well, almost anything) after the zipfile name is taken to be the
name of one of the enclosed files:
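
Sketched as a command, with the same assumed quake92p archive, the
single-file extraction described above looks like this:

```shell
name=quake92p
member=README.TXT

# Arguments after the zipfile name select which members to extract;
# here only README.TXT is written to the current directory.
if [ -f "$name.zip" ]; then
    unzip "$name" "$member"
fi
```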

Here you may notice a little snag. If you now edit this file
in Linux with an editor like vi,
you'll see what looks like ^M at the end of each
and every line. Or, if you view the file with a pager like
more, you'll discover that any
line uncovered by the --More-- prompt gets
erased immediately. These problems arise because DOS and
its successors store text files with two
end-of-line characters, CR and LF (a.k.a. carriage return and
linefeed, respectively, or ^M and ^J, or CTRL-M and CTRL-J), rather
than the more efficient single character (LF) used on all Unix
systems. So when a Unix utility—like an editor or a pager or a
compiler—looks at a DOS text file, it may behave a little oddly or
die altogether.
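
The difference is easy to see at the byte level. The sketch below
builds a small DOS-style file by hand (dos.txt and unix.txt are
hypothetical names chosen for this illustration) and uses tr as a
stand-in to strip the carriage returns; it has nothing to do with
unzip itself and only shows what the two conventions look like:

```shell
# Write two lines that end in CR LF, the way DOS stores text.
printf 'first line\r\nsecond line\r\n' > dos.txt

# od -c shows the otherwise invisible \r (CR) before each \n (LF).
od -c dos.txt

# Deleting every CR leaves the LF-only form that Unix tools expect.
tr -d '\r' < dos.txt > unix.txt

# The converted file is one byte shorter per line.
wc -c < dos.txt    # 25 bytes
wc -c < unix.txt   # 23 bytes
```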

Fortunately there's a simple solution: unzip's
-a option. Originally a mnemonic for
ASCII conversion, the option these days is
used for all sorts of text-file conversions. As a single-letter
option it does its best to automatically convert files that are
supposedly text, while leaving alone those that are marked binary.
Be careful! zip and PKZIP don't
always guess correctly when creating the archive, particularly for
certain classes of MS-Windows files, and unzip's “text”
conversions are almost always irreversible. In
other words, don't extract with auto-conversion and then delete the
original zipfile without first making sure everything is okay.
unzip does indicate which files it thinks are text when
auto-converting, however:

In this case everything worked as intended. If, for some
reason, zip marked a text file as binary and you want to force text
conversion, simply double the option: -aa.
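
Both forms can be sketched as follows, again assuming the quake92p
archive. README.TXT here merely stands in for whichever member was
wrongly marked as binary, and the output is omitted:

```shell
name=quake92p

if [ -f "$name.zip" ]; then
    # -a converts line endings only in members marked as text and
    # leaves members marked as binary untouched.
    unzip -a "$name"

    # -aa forces text conversion regardless of how the member is
    # marked. The file already exists after the first command, so
    # unzip asks before overwriting it.
    unzip -aa "$name" README.TXT
fi
```

Because the conversion is rarely reversible, keep the original
zipfile until the extracted files have been checked.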

But wait, there's more! The discriminating Linux user,
happily accustomed to a file system that not only preserves the
case of file names but also distinguishes between names differing
only in case, is not going to settle for a bunch of all-uppercase
DOS file names in his or her directories. Enter the
-L option. If (and only if) the file came from a
single-case file system like DOS FAT or VMS, unzip
-L will convert it to lowercase upon extraction,
thusly:
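
A minimal sketch with the same assumed quake92p archive, whose
members came from a FAT file system (output omitted):

```shell
name=quake92p

# -L lowercases names that came from a single-case file system, so
# README.TXT would be written as readme.txt.
if [ -f "$name.zip" ]; then
    unzip -L "$name"
fi
```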
