gpreload

Synopsis

Description

The gpreload utility reloads table data with column data sorted. For tables that
were created with the table storage option APPENDONLY=TRUE and compression
enabled, reloading the data with sorted data can improve table compression. You specify a
list of tables to be reloaded and the table columns to be sorted in a text file.

Compression is improved by sorting data
when the data in the column has a relatively low number of distinct values when compared to
the total number of rows.

For a table being reloaded, the order of the columns to be sorted might affect compression. The
columns with the fewest distinct values should be listed first. For example, listing state
then city would generally result in better compression than listing city then state.

public.cust_table: state, city
public.cust_table: city, state

For
information about the format of the file used with gpreload, see the
--table-file option.

Notes

To improve reload performance, indexes on tables
being reloaded should be removed before reloading the data.

Running the
ANALYZE command after reloading table data might query performance
because of a change in the data distribution of the reloaded data.

For each table, the utility copies table data to a temporary table, truncates the existing
table data, and inserts data from the temporary table to the table in the specified sort
order. Each table reload is performed in a single transaction.

For a partitioned table, you can reload the data of a leaf child partition. However, data
is inserted from the root partition table, which acquires a ROW EXLCUSIVE
lock on the entire table.

Options

-a (do not prompt)

Optional. If specified, the gpreload utility does not prompt the user
for confirmation.

-d database

The database that contains the tables to be reloaded. The gpreload
utility connects to the database as the user running the utility.

-p port

The Greenplum Database master port. If not specified, the value of the
PGPORT environment variable is used. If the value is not available,
an error is returned.

{-t | --table-file } path_to_file

The location and name of file containing list of schema qualified table names to
reload and the column names to reorder from the Greenplum Database. Only user defined
tables are supported. Views or system catalog tables are not supported.

If indexes are defined on table listed in the file, gpreload prompts
to continue.

Each line specifies a table name and the list of columns to sort. This is the format
of each line in the file:

schema.table_name: column [desc] [, column2 [desc] ... ]

The table name is followed by a colon ( : ) and then at least one column name. If you
specify more than one column, separate the column names with a comma. The columns are
sorted in ascending order. Specify the keyword desc after the column
name to sort the column in descending order.

Wildcard characters are not supported.

If there are errors in the file, gpreload reports the first error and
exits. No data is reloaded.