Deserialize Source Data

This document describes how to use the deserializer in mappings.

The deserializer reads semi-structured data embedded within a source datastore field and exposes it as a manipulable structure that is used to load a target datastore.

Typically, this happens when having JSON data contained in a single database column such as a LONGVARCHAR column, and you want to process it like a JSON structure. The deserializer offers the ability to manipulate this embedded JSON data with the same ease as a physical JSON file.

The deserializer templates proposed in mappings are computed from the output structure format. For example, when the output is of JSON format, JSON deserializer templates are proposed. Refer to Templates for the full list of supported formats and templates.

Create a Deserializer

To create and configure a deserializer in a mapping:

  1. Create the deserializer:

    • Drag and drop a database schema in the mapping.

      deserializer creation from schema

    • Or drag and drop a field from a datastore in the mapping and then select a database schema.

      deserializer creation from field

      The database schema that is selected is used by the deserializer as a backend engine to process the semi-structured data with the runtime capabilities. Selecting the database schema used as the backend is mandatory.
  2. Add Input.

  3. Add Output.

  4. Add Fields (Optional).

  5. Map target datastores.

Configure a Deserializer

Add Input

The input is the source datastore field containing the embedded semi-structured data.

To define the input of the deserializer, drag and drop the field that contains the embedded semi-structured data from the source datastore on Input.
deserializer add input

Add Output

The output is a reverse-engineered metadata representing the structure of the input data, such as, for example, a JSON metadata.

To define the output of the deserializer, drag and drop a previously reverse-engineered metadata representing the expected data structure on Output.
deserializer output

The output structure must match the structure contained in the input data.

Finally, use the output structure as a source to load your target datastore.

When the mapping is executed, the deserializer parses the semi-structured data from the input field and loads it into the output structure. Finally, the data from the output is loaded into the mapped fields in the target datastore.

The deserializer templates proposed are computed from the output structure format. When the output is of JSON format, JSON deserializer templates are proposed. Refer to Templates for the full list of supported formats and templates.

Add Fields

Deserializer fields are optional containers to store source columns' data alongside the output structure data.

Create fields when you need to load and process data from other columns that the one containing the semi-structured data, or when you need to join two deserializers.

To add deserializer fields:

  1. In the top left corner of the deserializer, click the create field icon.

    deserializer field

  2. Drag and drop a field from the source datastore on the field that was created.

    deserializer field mapping

  3. Repeat the procedure to add other fields.

Finally, use the fields as a source to map a target datastore or use them to join the deserializer with other datastores.

Join Deserializers

You may have more than one deserializer in your mapping. In such a scenario, you can join them using fields.

To join two deserializers:

  1. Select a deserializer and add the fields containing the data that will be used for the join. See Add Fields.

  2. Repeat the procedure for the other deserializer.

  3. Drag and drop the field from one deserializer on the field in the other deserializer.

The two deserializers are joined. Use them as a source to load target datastores.

Sample mapping

In this sample, the source table contains single columns data and a column with embedded semi-structure data.

deserializer sample source data

Through the deserializer, both are manipulated and loaded into multiple target datastores.

deserializer mapping sample

Below, an example of target datastore content, with the source semi-structured data extracted.

deserializer sample target data

Templates

The following templates are available when using the deserializer:

  • JSON Deserializers:

    • DESERIALIZER Rdbms String as Json: Generic template to parse JSON data contained in a source RDBMS column.

    • DESERIALIZER BigQuery String as Json: Technology-specific template to parse JSON data contained in a source Google BigQuery database column.

    • DESERIALIZER Greenplum String as Json: Technology-specific template to parse JSON data contained in a source Greenplum database column.

    • DESERIALIZER Hsql String as Json: Technology-specific template to parse JSON data contained in a source HSQL database column.

    • DESERIALIZER MySQL String as Json: Technology-specific template to parse JSON data contained in a source MySQL database column.

    • DESERIALIZER PostgreSql String as Json: Technology-specific template to parse JSON data contained in a source PostgreSQL database column.

    • DESERIALIZER Redshift String as Json: Technology-specific template to parse JSON data contained in a source Redshift database column.

    • DESERIALIZER Snowflake String as Json: Technology-specific template to parse JSON data contained in a source Snowflake string and VARIANT database column.

  • XML Deserializers:

    • DESERIALIZER RDBMS String as XML: Generic template to parse XML data contained in a source RDBMS column.

    • DESERIALIZER Snowflake VARIANT as XML: Technology-specific template to parse XML data contained in a source Snowflake VARIANT database column.