Getting started with the Greenplum database

This page contains basic information to help you start working with Greenplum.

Connect to your data

You can reverse-engineer the full database structure and store it in xDI Metadata. You can then use this metadata when designing mappings and processes that implement your business rules.

Refer to Connect to your data for more information.

Work with mappings

Refer to Work with mappings for more information.

Load file data from Amazon S3 to Greenplum

xDI Designer provides a LOAD S3 File to Greenplum template for loading data from Amazon S3 to Greenplum. This template is optimized for both delimited text files and positional text files with fixed-width columns.

To load data from files located inside an Amazon S3 bucket to Greenplum, set up your project as follows:

  • Create a mapping that uses file metadata as the input and a Greenplum table as the target.

  • Make sure your file metadata contains a link to an Amazon S3 bucket node:

    1. Open the file metadata in its own tab.

    2. Drag the Amazon S3 bucket node to the file icon in the file metadata tab to create the link.

    3. Change the link name to S3_CONTAINER.

  • Edit your mapping so it uses the LOAD S3 File to Greenplum template.

Designer uses the Amazon S3 metadata link to determine where the file is located in Amazon S3. If you are working with multiple files that are all in the same bucket, you can instead place the Amazon S3 link directly on the folder that contains these files in the file metadata.
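For orientation, the Python sketch below builds the kind of CREATE EXTERNAL TABLE statement that Greenplum's Platform Extension Framework accepts for a file in an S3 bucket. The exact statement generated by the template is not shown on this page, so this is an illustration only: the function name, table, columns, bucket, path, and server name are all hypothetical. Only the general pxf:// LOCATION form with PROFILE and SERVER options is taken from the Greenplum PXF documentation.

```python
def build_external_table_sql(table, columns, bucket, object_path,
                             profile="s3:csv", fmt="TEXT", server=None,
                             field_delimiter=","):
    """Return a PXF-style CREATE EXTERNAL TABLE statement (illustrative only).

    Values are interpolated without escaping, so do not pass untrusted input.
    """
    # Column definitions, for example "id int, name text".
    column_list = ", ".join(f"{name} {sql_type}" for name, sql_type in columns)

    # PXF options are passed as a query string on the LOCATION URI.
    options = [f"PROFILE={profile}"]
    if server:
        options.append(f"SERVER={server}")
    query = "&".join(options)

    # The bucket and object path stand in for what the S3 metadata link
    # and the file metadata provide (an assumption about the template).
    path = object_path.lstrip("/")
    location = f"pxf://{bucket}/{path}?{query}"

    return (
        f"CREATE EXTERNAL TABLE {table} ({column_list})\n"
        f"LOCATION ('{location}')\n"
        f"FORMAT '{fmt}' (DELIMITER '{field_delimiter}');"
    )


# Hypothetical names throughout.
sql = build_external_table_sql(
    table="ext_customers",
    columns=[("id", "int"), ("name", "text")],
    bucket="my-bucket",
    object_path="incoming/customers.csv",
    server="s3srvcfg",
)
print(sql)
```

The profile, format, and server arguments correspond loosely to the Profile, Format, and PFX Server Name template parameters documented on this page.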

Template parameters

The LOAD S3 File to Greenplum template exposes process-specific parameters.

Each parameter is listed with its default value, where one is defined, and its description.

  • Profile (default value: s3:csv)
    Platform Extension Framework connector to use in CREATE EXTERNAL TABLE statements.

  • Format (default value: TEXT)
    Whether the file format is set to TEXT or CSV in CREATE EXTERNAL TABLE statements.

  • PFX Server Name
    Server configuration directory from which the Platform Extension Framework obtains the configuration and credentials to access the external data store.

  • Check S3 File Existence (default value: false)
    Checks whether the file exists on S3, which lets the template handle success and failure conditions.

  • Region
    Amazon S3 region that the bucket is in.

  • S3 Base Url
    URL of the S3 provider, if it is different from the public Amazon S3 service.

  • Path-Style Access
    Use a path-style URL rather than a virtual-hosted-style URL.

  • Success If No File (default value: false)
    Used when Check S3 File Existence is set to true.
    false: the template raises an error if the source file is missing.
    true: the template finishes successfully with an empty load table.

  • String Delimiter
    Characters that replace quotation marks as string delimiters in Greenplum CREATE EXTERNAL TABLE statements.
    If this parameter is empty in the template, Designer looks for the same parameter in the metadata.
    If it is also empty in the metadata, CREATE EXTERNAL TABLE statements use the default double quotation mark as the string delimiter.
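
The rules described above for Success If No File and String Delimiter can be summarized in a short Python sketch. It is not the template's implementation: the function and enum names are hypothetical, and the behavior when Check S3 File Existence is false and the file is missing is not described on this page, so the sketch simply proceeds with the load in that case.

```python
from enum import Enum


class Outcome(Enum):
    PROCEED = "attempt the load"
    EMPTY_SUCCESS = "finish successfully with an empty load table"
    ERROR = "fail with an error"


def missing_file_outcome(check_existence, success_if_no_file, file_exists):
    """Decide what happens before the load, based on the two parameters."""
    # Success If No File only matters when the existence check is enabled.
    # Assumption: with the check disabled, the load is attempted as usual.
    if not check_existence or file_exists:
        return Outcome.PROCEED
    return Outcome.EMPTY_SUCCESS if success_if_no_file else Outcome.ERROR


def resolve_string_delimiter(template_value, metadata_value):
    """Template value first, then metadata value, then the default quote."""
    for candidate in (template_value, metadata_value):
        if candidate:  # None and "" both count as empty
            return candidate
    return '"'  # default double quotation mark


print(missing_file_outcome(True, False, False).value)
print(repr(resolve_string_delimiter("", "'")))
```

Modeling the outcome as an enum keeps the three cases explicit and makes the unchecked case easy to adjust if the actual behavior differs.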