Create Your First Data Flow With Semarchy xDI

Welcome to Semarchy xDI!

This tutorial shows you how to create your first data flow with Semarchy xDI Designer. This integration flow will load data from a file to a database.

What you'll learn

Project creation
Metadata creation and reverse-engineering
Mapping design and execution
Process generation

Before you start

For this tutorial, we assume that you have already installed Semarchy xDI Designer.

If you have not installed the Designer yet, follow the Install Semarchy xDI Designer tutorial from the main tutorials page. Additional setup information can also be found in the installation guide.

GO TO TUTORIALS

Enjoy this tutorial!

Before designing your first data flow, you must start the tutorial environment, composed of

Two local HSQL databases.
A local Runtime. The runtime is a lightweight that runs the integration flows.

To start the tutorial environment:

Open the Runtime view and click the Environment button.

The Environment dialog opens. Click the Start local Runtime button. The runtime process starts in a shell window.

Click the Start demo Databases button. The tutorial databases are started in separate shell windows.
Click OK to close the dialog.
In the Runtime view, select Connect to connect to the local runtime you just started. The icon appears when the connection is established.

Make sure that the Refresh Diagram checkbox is selected so that xDI Designer refreshes its views and editors (statistics, variables, colors, statuses...) in real-time during the execution of your data flows.

In the next step, you will create your first project.

A Project is a directory inside your xDI workspace that contains your development resources. All the resources in a project (metadata, mappings, processes, etc) are stored in this directory and can be archived and manipulated just like standard files. You can also organize them into folders.

Create the project for this tutorial:

In the File menu, select New > Project... The New Project wizard opens.

Expand the General node, select Project, and then click Next.

In the Project name field, enter Tutorial - Fundamentals and then click Finish.

In the next step, you will add the first metadata to your project.

Semarchy xDI leverages Metadata to design, generate, and run data integration processes. Metadata typically represents the applications, databases, APIs, or file directories containing data that you want to integrate.

The basic element in a metadata is called a datastore. A datastore corresponds to a data structure, for example, a table, a file, etc. that you integrate. Datastores are used as sources and/or targets in mappings.

You define the structure of an xDI metadata by connecting to the database server, file system, etc., and retrieving the structure of its tables, files, etc. This operation is called reverse-engineering.

In this step, you will:

Create a metadata for the demo Datamart database you started in the Start the tutorial environment step.
Reverse-engineer the datastores in this database.

Create the metadata

First, create the metadata in xDI:

In the Project Explorer view, right-click the Tutorial – Fundamentals project.
Select New > Metadata... - The New Metadata wizard opens.

Select Hypersonic SQL in the list of technologies and then click Next.

Enter Datamart in the Metadata Model name and then click Next.

In the Module Selection/Creation step, select the HSQL [0] Type (it is selected by default), select <new HSQL Module> in the Module pane, and finally click Finish to create the module and the metadata.

The metadata is created and the Server Wizard opens.

In the next section, you will connect to the database and reverse-engineer its tables.

Define the connection and schemas properties

The Server Wizard allows connecting to the database server and entering the properties for the schemas xDI will use on this database.

On the Connection page of the wizard, enter the following values and then click Connect to test your connection.

Name: Datamart
Module: leave this field unchanged.
Driver: leave this field unchanged.
URL: jdbc:hsqldb:hsql://localhost:62211
User: sa
Password: leave this field empty.

Semarchy xDI Designer opens a JDBC connection to the database. Click Next to define the schema containing the tables.
In the properties page, click Refresh Values to fetch the list of schemas in this database.
Select the HOTEL_DATAMART Schema Name and then click Next to launch the reverse-engineering process.

Reverse-engineer the tables

On the Reverse page:

Click Refresh to retrieve the list of tables from the HOTEL_DATAMART schema.
Click Select All and then click Finish to start the reverse-engineering process and complete the wizard.

Review the tables

The wizard closes and the metadata editor opens.

The asterisk next to the metadata's file name indicates that the metadata is modified but not saved. Click the icon in the Designer toolbar, or press CTRL+S to save the metadata.

Explore the data model you reverse-engineered:

In the metadata editor, Expand the Datamart root node and then expand the HOTEL_DATAMART schema node.
The tables you reverse-engineered are now available as datastores in the metadata.
Expand each datastore to review its structure: columns, primary key, foreign keys, etc.

Congratulations

Great job! You have successfully created your first metadata and reverse-engineered its structure.

In the next step, you will create another metadata for a source file.

You will now create a metadata for a local file directory. Files in this directory will be available as datastores in the mappings.

Create the metadata

Start by creating the metadata:

In the Project Explorer view, right-click the Tutorial – Fundamentals project.
Select New > Metadata...
Select the File metadata type and then click Next.

Set the Metadata Model name to Reference Files and click Finish.
The File Wizard wizard opens.
Enter the following Name for the root directory: Reference Files Folder.

Click Browse to select the local folder where the tutorial files are stored. This directory is located in your xDI Designer installation:

Windows or Linux: Select the Runtime/samples/files directory under the root folder of your xDI Designer installation.
macOS: Double-click the Semarchy xDI Designer application in your Application folder, navigate to Contents/MacOS/runtime/samples/files, and then click Open.

Click Next.

You can now define the file format.

Define the file format

On the Properties page of the wizard:

Click Browse and select the DiscountRanges.txt file. This file contains comma-separated fields.
Make sure that the selected Type is DELIMITED.
Set the Field Separator to , (a single comma).
Click Refresh to preview the file contents.

The Preview shows that the file contains a header line with the name of each field.
Set the Header Line Position to 1 to use the first line of the file as the header.
Click Refresh again. The first line is no longer visible in the preview.
Click Next to define the column for this datastore.

Reverse-engineer the columns

On the Reverse page of the wizard:

Click Reverse in order to reverse-engineer the file columns.
Check that the columns are correctly defined as shown below

Click Finish to complete the definition of this datastore.

Review the file datastore

The wizard closes and the metadata editor opens, unsaved. Click the icon in the Designer toolbar or press CTRL+S to save the metadata.

To review the data in the file datastore:

In the metadata editor, expand the Server node and then the Reference Files Folder node.
Right-click the DiscountRanges datastore and select Actions > Consult Data (console).
The data in the file appears in the Console view.

Semarchy xDI Designer also includes a feature to query datastores, using SQL:

Right-click DiscountRanges in the metadata editor and select Actions > Consult Data.
The Query Editor for DiscountRanges opens and shows the query used to retrieve data from the file.
Click or press CTRL+Enter to run the query.

In the next step, you will design a first mapping that uses the two metadata you have created.

A mapping defines how to load data from source datastores to target datastores, possibly with transformation. This definition includes templates that generate an optimized data integration process.

In this step, you will create a simple mapping to load the DIM_DISCOUNT table with data from the DiscountRanges file.

Create the mapping

Create the mapping:

In the Project Explorer view, right-click the Tutorial – Fundamentals project
Choose New > Mapping...
Set the File name to Load DIM_DISCOUNT and then click Finish.
The mapping is created and the mapping editor opens.

Add the datastores

You must first add the datastores to the mapping:

In the Project Explorer view, expand Tutorial – Fundamentals > Datamart > Datamart > HOTEL_DATAMART.
Expand the HOTEL_DATAMART node and then drag and drop the DIM_DISCOUNT datastore from the Project Explorer view into the mapping editor.
The datasource is added to the mapping editor.

In the Project Explorer, expand Reference Files > Server > Reference Files Folder and then drag and drop the DiscountRanges datastore into the mapping editor.

In this mapping, DIM_DISCOUNT will be the "Target" datastore, and DiscountRanges the "Source" datastore.

Define the column mappings

You will now define in the mapping the expressions to load target columns from source data.

For this first mapping, the expressions are simple: each column of the DIM_DISCOUNT table is populated with the value of a corresponding column in the DiscountRanges file, as described below.

Target column	Expression
DIS_RANGE	DiscountRanges.range
DIS_MIN	DiscountRanges.min
DIS_MAX	DiscountRanges.max

To define the transformation expressions:

In the mapping diagram, drag and drop the range column from the DiscountRange datastore onto the DIS_RANGE column of the DIM_DISCOUNT datastore.
A dialog opens to select the type of link you want to define between the datastores. Select Map - current field.

Repeat the previous operation for the min and max columns. You will not have to select the type of link since it was already done.
Click the icon in the Designer toolbar or press CTRL+S to save the mapping.

Your mapping is now ready to be run.

Your mapping will run in the local runtime you started during the Start the tutorial environment step.

To run the mapping:

Make sure that you have saved the mapping.
Right-click on an empty area of the mapping editor and select Run in the context menu.

Semarchy xDI Designer runs the mapping and opens a new tab showing the generated process. The process is made up of steps. The color of each step indicates its execution status:

White: The step will not run.
Gray: The step is ready to run.
Green: The step is running.
Blue: The step finished successfully.
Yellow: The step finished with a non-blocking error.
Red: The step finished a blocking error
Orange: The step was stopped by a user or because a related parallel step failed.

When the execution is finished, check that all steps appear in blue (finished successfully).

Review the Results

The execution of this mapping produced a Session Log in the Runtime's Session Logs database. This session log contains detailed information on the executed steps.

You can review the session logs in the Designer's Sessions view. The mapping that you ran appears there.

Select the Statistic view, which shows the latest execution statistics. It shows 18 total rows and 6 rows inserted into the target datastore

Run the mapping a second time. The statistics now differ, with no row inserted or updated. The source data has not changed, so no change is applied to the target.

Check the data in the target table:

In the mapping editor, right-click the target datastore (DIM_DISCOUNT) and then select Action > Consult data
The Query Editor opens and is initialized with a select query. Run it by clicking the Run button in the editor toolbar or by pressing CTRL+Enter.

Congratulations! You have successfully created and tested your first mapping. In the next and last step of this tutorial, you will organize metadata and mappings in your workspace.

Resources in Semarchy xDI are organized in projects and folders.

All the metadata and the mapping you have created until now are located in the root folder of the Tutorial – Fundamentals project.

In this section, you will:

Move metadata to a distinct project
Create folders for mappings and processes in the Tutorial - Fundamentals project.

Create a distinct project for metadata

Grouping metadata in a dedicated project is a best practice since metadata can be shared and re-used by many projects.

Follow the same instructions as in the Create the tutorial project step to create a new project: Tutorial - Common Metadata
In the Project Explorer, select the Datamart metadata, hold the CTRL key (Command key under MacOS), select the Reference Files metadata, and then drag and drop the selection into the Tutorial - Common Metadata project.

The new Tutorial - Common Metadata project now contains all the metadata you have created:

Organize development resources with projects and folders

You will now create 2 folders under the Tutorial – Fundamentals project to store mappings and processes.

In the Project Explorer, right-click the Tutorial – Fundamentals project, select New > Folder.
Set the Folder name to Mappings and click Finish.
In the Project Explorer, drag and drop the Load DIM_DISCOUNT mapping into the Mappings folder.
Follow the same steps to create a Processes folder.

You workspace now shows the following structure:

Congratulations! You have successfully created your first data flow.

What we've covered

You created metadata for a database and a file server and reverse-engineered their structure.
You designed a simple mapping to load data from a file into a database table.
You ran this mapping and analyzed the results.
You organized the resources in your workspace using projects and folders.

What's next?

In the next tutorial unit, Work with mappings, you will learn about the Semarchy xDI's advanced mapping features such as filters, joins, and stages.