Write the gpfdist Configuration

The gpfdist configuration is a YAML 1.1 document. It specifies the rules that gpfdist uses to select a transformation to apply when loading or extracting data.

This example gpfdist configuration contains the following items:

the config.yaml file defining TRANSFORMATIONS

the input_transform.sh wrapper script, referenced in the config.yaml file

the input_transform.stx Joost STX transformation, called from input_transform.sh

Aside from the ordinary YAML rules, such as starting the document with three dashes (---), a gpfdist configuration must conform to the following restrictions:

a VERSION setting must be present with the value 1.0.0.1.

a TRANSFORMATIONS setting must be present and contain one or more mappings.

Each mapping in TRANSFORMATIONS must contain:

a TYPE with the value 'input' or 'output'

a COMMAND indicating how the transform is run.

Each mapping in TRANSFORMATIONS can also contain the optional CONTENT, SAFE, and STDERR settings.
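The restrictions above imply a layout along the following lines. This is a sketch rather than a definitive reference. The transformation names and script names are hypothetical, and the values shown for the optional CONTENT, SAFE, and STDERR settings are assumptions that should be checked against the gpfdist documentation for your release.

```yaml
---
VERSION: 1.0.0.1
TRANSFORMATIONS:
  # Hypothetical input transform with only the required settings.
  example_input:
    TYPE: input
    COMMAND: /bin/bash example_input.sh %filename%
  # Hypothetical output transform that also sets the optional settings.
  example_output:
    TYPE: output
    COMMAND: /bin/bash example_output.sh %filename%
    CONTENT: data            # assumed allowed values: data or paths
    SAFE: '^/tmp/.*\.xml$'   # assumed: a regular expression that file paths must match
    STDERR: server           # assumed allowed values: server or console
```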

The following gpfdist configuration, called config.yaml, applies to the prices example. The initial indentation on each line is significant and reflects the hierarchical nature of the specification. The name prices_input is referenced later when creating the table in SQL.
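A minimal sketch of that file, assembled from the settings this section describes. The two-space indentation is an assumption, and any consistent indentation that preserves the nesting should be equivalent.

```yaml
---
VERSION: 1.0.0.1
TRANSFORMATIONS:
  prices_input:
    TYPE: input
    COMMAND: /bin/bash input_transform.sh %filename%
```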

The COMMAND setting runs a wrapper script, input_transform.sh, with a %filename% placeholder. When gpfdist runs the prices_input transform, it invokes input_transform.sh with /bin/bash and replaces %filename% with the path to the input file to transform. The wrapper script contains the logic to invoke the STX transformation and return the output.