5.6.5 myisampack — Generate Compressed, Read-Only MyISAM Tables

The myisampack utility compresses
MyISAM tables. myisampack
works by compressing each column in the table separately.
Usually, myisampack packs the data file 40%
to 70%.

When the table is used later, the server reads into memory the
information needed to decompress columns. This results in much
better performance when accessing individual rows, because you
only have to uncompress exactly one row.

MySQL uses mmap() when possible to perform
memory mapping on compressed tables. If
mmap() does not work, MySQL falls back to
normal read/write file operations.

Please note the following:

If the mysqld server was invoked with
external locking disabled, it is not a good idea to invoke
myisampack if the table might be updated
by the server during the packing process. It is safest to
compress tables with the server stopped.

After packing a table, it becomes read only. This is
generally intended (such as when accessing packed tables on
a CD).

Each file name argument should be the name of an index
(.MYI) file. If you are not in the database
directory, you should specify the path name to the file. It is
permissible to omit the .MYI extension.

Produce a packed table even if it becomes larger than the
original or if the intermediate file from an earlier
invocation of myisampack exists.
(myisampack creates an intermediate file
named
tbl_name.TMD
in the database directory while it compresses the table. If
you kill myisampack, the
.TMD file might not be deleted.)
Normally, myisampack exits with an error
if it finds that
tbl_name.TMD
exists. With --force,
myisampack packs the table anyway.

Join all tables named on the command line into a single
packed table big_tbl_name. All
tables that are to be combined must
have identical structure (same column names and types, same
indexes, and so forth).

big_tbl_name must not exist prior
to the join operation. All source tables named on the
command line to be merged into
big_tbl_name must exist. The
source tables are read for the join operation but not
modified.

Wait and retry if the table is in use. If the
mysqld server was invoked with external
locking disabled, it is not a good idea to invoke
myisampack if the table might be updated
by the server during the packing process.

The following sequence of commands illustrates a typical table
compression session:

The number of columns containing values that are only
spaces. These occupy one bit.

empty-zero

The number of columns containing values that are only binary
zeros. These occupy one bit.

empty-fill

The number of integer columns that do not occupy the full
byte range of their type. These are changed to a smaller
type. For example, a BIGINT
column (eight bytes) can be stored as a
TINYINT column (one byte) if
all its values are in the range from -128
to 127.

pre-space

The number of decimal columns that are stored with leading
spaces. In this case, each value contains a count for the
number of leading spaces.

end-space

The number of columns that have a lot of trailing spaces. In
this case, each value contains a count for the number of
trailing spaces.

table-lookup

The column had only a small number of different values,
which were converted to an
ENUM before Huffman
compression.

zero

The number of columns for which all values are zero.

Original trees

The initial number of Huffman trees.

After join

The number of distinct Huffman trees left after joining
trees to save some header space.

After a table has been compressed, the Field
lines displayed by myisamchk -dvv include
additional information about each column:

Type

The data type. The value may contain any of the following
descriptors:

constant

All rows have the same value.

no endspace

Do not store endspace.

no endspace, not_always

Do not store endspace and do not do endspace compression
for all values.

If you do not see "Compressed" as the Row_format try issuing a "FLUSH TABLE tableName;" to force MySQL to reload the table and try again.

Posted by
Jim Grill
on
May 6, 2005

It seems there are some particular issues regarding "show table status" with compressed tables on a working server.

After further experimentation and some help from mysql support, it seems that the best method for compressing tables on a working database (even if you know the table will not be used during the process) is to first obtain a lock via a mysql client program and then flush the table. While leaving your client program connected and holding the lock, use the myisampack and myisamchk utilities per the above documentation.

When complete, release the lock and flush the table again.

Always do issue a "show table status..." and check the "Row_format" field. The Row_format should be "Compressed". If your table still shows something other than "Compressed" as the Row_format or if you are getting erroneous data from selects try issuing a "flush tables" statement.

Hope that helps. Have fun!

Posted by
J Jorgenson
on
January 23, 2008

Beware that packing a table while the database is running can lead to data corruption, and is unsupported by MySQL AB. However I've been able to successfully pack 100s of tables on a running database with only a minor warning in the end result. The minor warning is result from the table not being closed.

Ultimatly the results of table packing is to trade the bottleneck of Disk i/o for CPU cycles, by unpacking more records for the same sized block of data. We have reduced tables that take 1G down to 250M with myisampack.

Enjoy!

Posted by
J Jorgenson
on
January 23, 2008

Details to NOT miss about packed tables:1) Once you have Packed a MyISAM table is it **READ ONLY**. You can only Select From or Truncate the table. No updates or Inserts are allowed.2) The Archive Engine produces a smaller table, it won't have an Index, but new records can still be appended.3) The DATA is not sorted by myisampack.4) You must rebuild the index after packing: myisamchk -raqS

I've found the performance gains and disk space savings from packed tables is worth having to rebuilding the data periodically. I'm looking forward to combine partitioning of packed and unpacked tables for an archival system.

-- JJ --

Posted by
Mark Robson
on
January 28, 2008

Another important note:

The mmap() behaviour described here is not optional and may cause a 32-bit server to run out of address space sooner than it otherwise would.