Semarchy xDM

This source extracts metadata from Semarchy xDM.

Overview

This source first connects to the Semarchy xDM server to retrieve the data location assets (entities, attributes, certification jobs, etc.). It then connects to the database hosting this data location to retrieve the underlying physical assets (tables, columns).

The underlying data location database is configured as an inner_source.

This source supports:

  • Metrics retrieval for data location assets. For example, the number of golden records for entities.

  • Stateful Ingestion for both data location and underlying physical assets.

  • Data Profiling to collect table, row, and column statistics for the underlying physical assets.

  • Setting the Domain for the underlying physical assets.

  • Filtering Assets for the underlying physical assets.

Sample recipe

Example 1. Semarchy xDM Source sample recipe.
source:
  type: semarchy-xdm
  config:
    xdm_base_url: 'http://localhost:8080'
    xdm_dataloc: CustomerB2CDemo
    xdm_api_key: <api-key>
    # xdm_api_username: <user>
    # xdm_api_password: <password>
    # disable_ssl_verification: true

    inner_source:
      # Configure the inner source depending on the underlying
      # data location database.
      type: postgres
      config:
        host_port: localhost:5432
        database: semarchyDemoDatabase
        username: username
        password: password
        include_tables: true
        include_views: true
        profiling:
          enabled: true
          profile_table_level_only: false
        schema_pattern:
          allow:
            - semarchy_customer_b2c_mdm

sink:
  # sink config
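
The recipe above authenticates with an API key. As a sketch of the alternative, the recipe below uses the xdm_api_username and xdm_api_password parameters instead and disables SSL verification. The HTTPS URL, the host name, and the placeholder values are assumptions for illustration; the inner_source is abbreviated.

```yaml
source:
  type: semarchy-xdm
  config:
    # Hypothetical HTTPS endpoint; replace with your xDM server URL.
    xdm_base_url: 'https://xdm.example.com:8443'
    xdm_dataloc: CustomerB2CDemo
    # User and password are used in place of xdm_api_key.
    xdm_api_username: <user>
    xdm_api_password: <password>
    # Assumed use case: a test server with a self-signed certificate.
    disable_ssl_verification: true

    inner_source:
      type: postgres
      config:
        host_port: localhost:5432
        database: semarchyDemoDatabase
        username: username
        password: password
        schema_pattern:
          allow:
            - semarchy_customer_b2c_mdm

sink:
  # sink config
```

Disabling SSL verification removes the check on the server certificate, so it is best kept to test environments.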

Parameters

The following table lists the source parameters.

Parameter

Description

xdm_base_url

Base URL of the Semarchy xDM application server.

xdm_dataloc

Name of the data location to harvest.

xdm_api_key

API key used to connect to the Semarchy xDM server. Use it instead of the xdm_api_username and xdm_api_password parameters.

API key authentication requires the semarchyConnect and semarchyAdmin roles in xDM.

xdm_api_username

Semarchy xDM user name. Alternatively, use the xdm_api_key parameter to connect to Semarchy xDM.

xdm_api_password

Password of the xdm_api_username user.

disable_ssl_verification

Set to true to disable SSL certificate verification.

inner_source

Database source configuration corresponding to the underlying database of the data location. This configuration is a regular PostgreSQL, Oracle or Microsoft SQL Server source configuration.

In this configuration, use schema_pattern to limit inner source harvesting to the tables located in the data location schema.
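
For a data location hosted on Microsoft SQL Server, the inner_source might look like the sketch below. The mssql type name, the reuse of the same keys as the PostgreSQL sample, and all host, database, and schema values are assumptions for illustration and are not confirmed by this page; check the configuration reference of the corresponding database source before using them.

```yaml
inner_source:
  # Assumed type name for the Microsoft SQL Server source.
  type: mssql
  config:
    # Keys assumed to mirror the PostgreSQL sample recipe.
    host_port: localhost:1433
    database: <database>
    username: <user>
    password: <password>
    schema_pattern:
      allow:
        # Only harvest the schema that holds the data location tables.
        - <data-location-schema>
```

This block is nested under the config of the semarchy-xdm source, in the same position as the PostgreSQL inner_source in the sample recipe.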

Supported version

The Semarchy xDM harvester is compatible with:

  • Semarchy xDM 2024.3.0 MS

  • Semarchy xDM 2024.1.0 LTS and above

  • Semarchy xDM 2023.1.8 LTS and above