SAS 9.4 Maintenance 5 includes new support for reading and writing GZIP files directly. GZIP files, usually found with a .gz file extension, are a different format than ZIP files. Although both are forms of compressed files, a GZIP file is usually a compressed copy of a single file, whereas a ZIP file is an "archive" -- a collection of files in a compressed virtual folder. GZIP tools are built into Unix/Linux platforms and are commonly used to save space when storing large text-based files that you're not ready to part with: log files, csv files, and more. The algorithm used to compress GZIP files performs especially well with text files, although you can technically GZIP any file that you want.

You don't have to explicitly expand a compressed text file in order to read it with SAS. You can use the GZIP method to read and parse a .gz file directly, similar to the zcat command that you might be familiar with from the Unix shell:

If your file is in a binary format such as a SAS data set (sas7bdat) or Excel (XLS or XLSX), you probably will need to expand the file completely before reading it as data. These files are read using special drivers that don't process the bytes sequentially, so you need the entire file available on disk.

Note: Because each GZIP file represents just one compressed file, the MEMBER= option doesn't apply. When dealing with ZIP file archives that contain multiple files, you could use the MEMBER= option on FILENAME ZIP to address a specific file that you want. My recent example about FINFO and file details relies heavily on that approach. However, the GZIP option and MEMBER= options are mutually exclusive. In that way, it's much simpler...just like its Unix shell equivalent.