file:// Protocol

file:// Protocol

The file:// protocol is used in a URI that specifies the location of an
operating system file. The URI includes the host name, port, and path to the file. Each file
must reside on a segment host in a location accessible by the Greenplum superuser (gpadmin). The host name used in
the URI must match a segment host name registered in the
gp_segment_configuration system catalog table.

The
number of URIs you specify in the LOCATION clause is the number of segment
instances that will work in parallel to access the external table. For each URI, Greenplum assigns a primary segment on the specified host to the file.
For maximum parallelism when loading data, divide the data into as many equally sized files as
you have primary segments. This ensures that all segments participate in the load. The number
of external files per segment host cannot exceed the number of primary segment instances on
that host. For example, if your array has four primary segment instances per segment host, you
can place four external files on each segment host. Tables based on the
file:// protocol can only be readable tables.

The system view pg_max_external_files shows how many external table files
are permitted per external table. This view lists the available file slots per segment host
when using the file:// protocol. The view is only applicable for the
file:// protocol. For example: