Getting started with CDC for DB2/400

The CDC for DB2/400 (DB2 for i) captures data change in IBM DB2/400 database tables using triggers or a native mode using the database journals.

This CDC component provides two actions that appear in the palette of the process editor, in the Tools section:

Refer to Change Data Capture for more information about using captured changes in mappings.

Overview

The CDC for DB2/400 captures data change using a trigger mode or a native mode:

  • With the trigger mode (default configuration), the CDC DB2/400 action automatically creates a set of triggers on the source tables. These triggers detect changes in the source table and insert them into CDC tables.

  • With the native mode, changes are not automatically propagated to the CDC table. Users must explicitly design a process with the CDC DB2/400 Read Journal Data action. This action extracts changes from the database journals and inserts them into the CDC tables. This process should be scheduled with the frequency required for changes, or be part of the integration process that requires these changes.

CDC tables and views

CDC uses the following objects, created in the work schema or the same schema as the source table if no work schema is declared:

  • CDC_SUBSCRIBERS table: This technical table is created by the CDC DB2/400 action the first time it runs in the concerned schema. It contains the list of tables and subscribers for whom CDC has been activated, the type of CDC activated, and for the native mode the timestamp of the last extraction from the corresponding journals.

For each source table the CDC DB2/400 action also creates the following objects:

  • CDC_[Source table name]: Table loaded with the data changes by the triggers or the CDC DB2/400 Read Journal Data action.

  • VCDC_[Source table name]: View that contains the latest version of the changed records.
    For example, if a record is modified 5 times since the last execution of the process, this view will only contain one record with the latest version of the modified record as opposed to CDC_[Source table name] which will contain one row per change.

  • VCL_[Source table name]: View containing the rows being consumed by a mapping or process, for locking purposes. After consumption, the Semarchy xDI process removes the consumed changes. This view is used as a source when CDC is activated on the mapping.

  • HCDC_[Source table name]: View that contains all the data for historization. It is only available if Historize All Data is selected in the CDC configuration. This view contains the list of inserts, deletes, and updates since the last execution. It can be used when you want to load your target table with the complete history of source table changes and then rebuild the source table with a select query that is directly used on the target table.

In the trigger mode, three triggers are created on the source tables:

  • Insert trigger: Captures inserts made to the source table.

  • Update trigger: Captures updates made to the source table.

  • Delete trigger: Captures deletes made to the source table.

In the native mode, an additional technical table is created:

  • CDC_[Source table name]_all: Table that contains transactional information (committed and rollbacked transactions) from the journals.

Requirements

Common requirements

Both trigger and native CDC require primary or alternate keys on the source table. This is required for the creation of CDC views used as sources in mappings and processes.

Native mode Requirements

For the native mode, the following additional requirements must be met:

  • The DB2/400 database version must be at least 7.3.

  • The user used to connect to the database must have sufficient privileges to:

    • Run the QSYS2.DISPLAY_JOURNAL table function.

    • Read data from the QSYS.QADBXREF, QSYS.JOURNALED_OBJECTS, and QSYS.QADBIFLD system tables.

  • You must activate Journaling on the source tables before reverse-engineering them in xDI. When activating journaling on the tables, make sure to write both the before and after record images into the journals, using the IMAGES(*BOTH) of the STRJRNPF command.

Configure and use CDC

Trigger mode

To configure CDC in the trigger mode:

  1. Create a CDC management process with the CDC DB2/400 action.

  2. From the project explorer, drag and drop on the action’s metadata reference mappings cdc ref the datastore which data you want to capture.

  3. In the action properties

    • Set the CDC Operation to START

    • Set the Capture Method property to Trigger.

  4. Save the process.

Executing this process creates the CDC tables, views, and the triggers that start capturing the changes.

Native mode

To configure CDC in the native mode:

  1. Create a CDC management process with the CDC DB2/400 action.

  2. From the project explorer, drag and drop on the action’s metadata reference mappings cdc ref the datastore which data you want to capture.

  3. In the action properties

    • Set the CDC Operation to START

    • Set the Capture Method property to Native.

    • Select the Boundary Calculation Method to use.

  4. Save the process.

Executing this process creates the CDC tables and views but does not start capturing changes the journals. This capture is performed by the CDC DB2/400 Read Journal Data action.

To capture changes from the journals:

  1. Create a process with the CDC DB2/400 Read Journal Data action.

  2. From the project explorer, drag and drop on the action’s metadata reference mappings cdc ref the datastore which data you want to capture.

  3. Optionally, in the Exclude Job Names property of the action, enter the list of jobs for which you do not want to capture changes.

  4. Save the process.

Running this process captures the changes from the journals and writes them to the CDC Tables.

Journals are only available for a limited amount of time. Therefore, integration designers should be cautious when designing the processes that capture changes from the journals and follow best practices:

  • Ensure timely capture of changes from these journals by scheduling these processes.

  • Make sure these processes report immediately possible issues.

  • Consider pausing these processes when an error occurs to avoid stacking errors and having inconsistent data integrated. You can achieve this by scripting an action that executes the pause trigger [triggerName] runtime command.

With the native CDC, make sure to pause all processes running CDC DB2/400 Read Journal Data actions when adding new subscribers.