Getting started with the Sampling Component

This page contains information to help you get started with Sampling in Semarchy xDI.

Overview

This Component produces sample data from an existing database schema.

The main idea is to specify a sampling percentage on some tables and to propagate the sampling to the other tables through their primary and foreign keys.

The following example illustrates this:

(Figure: sampling tool overview)

In this example, the T_CUSTOMER table originally has 100 records and is configured to keep only 10% of them in the target schema.

In the same way, the T_BEDROOM table originally has 20 records and is configured to keep only 20% of them.

The T_PHONE table has a foreign key that references the T_CUSTOMER.CUS_ID primary key. As a result, T_PHONE keeps only the records related to the 10% of records kept in T_CUSTOMER.

The T_BREAKFAST_PRICE table is isolated (it has no links to other tables) and has no sampling configured. All of its content is therefore kept in the target schema.
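The propagation idea described above can be sketched in a few lines of Python. This is only an illustration of the concept, not the Component's actual implementation: the in-memory rows, the PHO_ID and BRK_ID columns, the helper names, and the rounding rule are all assumptions made for the example.

```python
def sample_first(rows, percent):
    """Keep the first `percent`% of rows (rounded up; the rounding rule is an assumption)."""
    keep = (len(rows) * percent + 99) // 100
    return rows[:keep]


def propagate(child_rows, fk_column, kept_parent_ids):
    """Keep only the child rows whose foreign key points to a kept parent row."""
    return [row for row in child_rows if row[fk_column] in kept_parent_ids]


# Hypothetical data mirroring the example: 100 customers, 20 bedrooms,
# 150 phones linked to customers, and an isolated price table.
customers = [{"CUS_ID": i} for i in range(1, 101)]
bedrooms = [{"BED_ID": i} for i in range(1, 21)]
phones = [{"PHO_ID": i, "CUS_ID": (i % 100) + 1} for i in range(1, 151)]
breakfast_prices = [{"BRK_ID": i} for i in range(1, 6)]

kept_customers = sample_first(customers, 10)   # 10% configured on T_CUSTOMER
kept_bedrooms = sample_first(bedrooms, 20)     # 20% configured on T_BEDROOM

# T_PHONE has no percentage of its own: it follows T_CUSTOMER through the key.
kept_customer_ids = {row["CUS_ID"] for row in kept_customers}
kept_phones = propagate(phones, "CUS_ID", kept_customer_ids)

# T_BREAKFAST_PRICE is isolated and not configured: everything is kept.
kept_breakfast_prices = list(breakfast_prices)

print(len(kept_customers), len(kept_bedrooms), len(kept_phones), len(kept_breakfast_prices))
```

In this sketch, every phone record that survives refers to a customer that also survives, so the sampled data stays consistent with the foreign key.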

Metadata configuration

When installed, the Sampling Component adds new properties to database Metadata, under the "Sampling" tab.

These properties let you configure how the sample data is produced.

You can specify the percentage of data to keep, along with optional filters.

(Figure: sampling tool metadata overview)

The SQL Order By property applies a sort when the data to sample is read. Because only a percentage of records is kept, and the extracted records are the first ones returned, this property lets you control which records end up in the sample.
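To see why the sort order matters, here is a small sketch using Python's built-in sqlite3 module. It is not how the Component runs its queries; the CUS_CREATED column, the first_percent helper, and the use of LIMIT to take the first records are assumptions chosen only to show that a different ordering selects different records.

```python
import sqlite3

# In-memory table with 20 customers; CUS_CREATED is a hypothetical column.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE T_CUSTOMER (CUS_ID INTEGER PRIMARY KEY, CUS_CREATED TEXT)")
conn.executemany(
    "INSERT INTO T_CUSTOMER VALUES (?, ?)",
    [(i, f"2024-01-{i:02d}") for i in range(1, 21)],
)


def first_percent(conn, order_by, percent):
    """Return the CUS_ID of the first `percent`% of rows for a given sort order."""
    total = conn.execute("SELECT COUNT(*) FROM T_CUSTOMER").fetchone()[0]
    keep = (total * percent + 99) // 100  # round up; an assumed rounding rule
    # order_by is trusted here because it is hard-coded in this example.
    sql = f"SELECT CUS_ID FROM T_CUSTOMER ORDER BY {order_by} LIMIT ?"
    return [row[0] for row in conn.execute(sql, (keep,))]


oldest = first_percent(conn, "CUS_CREATED ASC", 10)
newest = first_percent(conn, "CUS_CREATED DESC", 10)
print(oldest, newest)
```

With the same 10% setting, sorting in ascending order keeps the two oldest customers, while sorting in descending order keeps the two most recent ones.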

Sampling data

Once you have defined the sampling rules in your Metadata, use the dedicated Process Tool to sample the corresponding data:

  1. Create a Process.

  2. From the Process Palette, add the Sampling RDBMS Process Tool.

  3. Drag and drop the Metadata Link of the source schema (the schema to sample) onto SOURCE.

  4. Drag and drop the Metadata Link of the target schema onto TARGET.

  5. Drag and drop a folder Metadata Link (the path of the log directory).

The following example shows the result:

(Figure: sampling tool process)

Then, set the parameters of the tool according to your needs:

(Figure: sampling tool properties)

Sample project

The Sampling component is distributed with sample projects that contain various examples and files. Use these projects to better understand how the component works, and to get a head start on implementing it in your projects.

Refer to Install components in Semarchy xDI Designer to learn about importing sample projects.