Advanced match rules

This page describes advanced patterns for detecting duplicate pairs in xDM matchers.

Match rules have access to various attributes:

  • Attributes of the entity being matched.

  • Attributes from related parent entities of the entity being matched.

  • Attributes from related child entities of the entity being matched.

Using these three types of attributes, you can configure both simple and complex match rules for advanced matching patterns.

Symmetric vs. asymmetric match rules
  • Match rules are considered symmetric when they treat both records equally, meaning the order in which the records are compared does not affect the match result.
    For instance, based on the definition of the Record1 and Record2 matching logic elements, the following rule would match records irrespective of which one is new and which is existing:

    Record1.Email = Record2.Email

    Symmetric rules are well suited for incremental data loads, as they ensure consistent matching results regardless of the load order.

  • Conversely, in asymmetric match rules, Record1 and Record2 are not interchangeable; the matching process depends on which record is considered the incoming record and which is the existing one. For example, the rule below would function correctly when Record1 is the incoming record, but could fail if the roles of Record1 and Record2 were reversed:

    Record1.CustomerID = Record2.ClientNumber AND Record1.ClientNumber IS NULL

    This asymmetry can lead to inconsistencies, such as missed matches or incorrect data consolidation, which compromise the reliability of the matching process.

Asymmetric rules generally produce consistent and expected results during initial data loads, where all records are considered new and comparable. However, for incremental loads, symmetric rules are recommended to ensure consistent matching outcomes, regardless of the record comparison order.
To avoid potential issues with asymmetric rules, designers can create a mirror match rule with the same confidence score. For example, the mirror rule for the earlier example would be:

Record2.CustomerID = Record1.ClientNumber AND Record2.ClientNumber IS NULL
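The effect of adding a mirror rule can be illustrated outside of xDM with a small Python simulation. This is only a sketch: the Rec type, the field names, and the sample values are hypothetical stand-ins for Record1 and Record2, and Python's None stands in for a null value. It is not how xDM evaluates SemQL.

```python
from typing import NamedTuple, Optional


class Rec(NamedTuple):
    # Hypothetical stand-in for a record with the two attributes used in the rule.
    customer_id: Optional[str]
    client_number: Optional[str]


def asymmetric_rule(r1: Rec, r2: Rec) -> bool:
    # Record1.CustomerID = Record2.ClientNumber AND Record1.ClientNumber IS NULL
    # A comparison involving a null value is treated as "no match".
    return (
        r1.customer_id is not None
        and r1.customer_id == r2.client_number
        and r1.client_number is None
    )


def mirror_rule(r1: Rec, r2: Rec) -> bool:
    # Record2.CustomerID = Record1.ClientNumber AND Record2.ClientNumber IS NULL
    return asymmetric_rule(r2, r1)


def rule_with_mirror(r1: Rec, r2: Rec) -> bool:
    # Having both rules in the matcher behaves like their logical OR.
    return asymmetric_rule(r1, r2) or mirror_rule(r1, r2)


incoming = Rec(customer_id="C-100", client_number=None)
existing = Rec(customer_id=None, client_number="C-100")

# The asymmetric rule alone depends on the comparison order.
print(asymmetric_rule(incoming, existing))   # True
print(asymmetric_rule(existing, incoming))   # False

# With the mirror rule added, the order no longer matters.
print(rule_with_mirror(incoming, existing))  # True
print(rule_with_mirror(existing, incoming))  # True
```

In this simulation, the pair is detected in both comparison orders only once the mirror rule is present, which is the behavior incremental loads rely on.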

Matching on parent records

This capability is available for all entities. You can access attributes from parent entities using regular SemQL expressions.
For example, if you have a Contact entity with a parent Customer entity and you want to match contacts when they work for customers with the same name, you would use the following syntax:

Record1.Customer.Name = Record2.Customer.Name

In this expression, Customer refers to the role name of the customer entity within the relationship.

Matching self-references

A common scenario in matching parent records involves self-referencing entities, where an entity has a reference pointing to itself. For example, a Folder entity might include a FolderName attribute and a ParentFolder foreign attribute pointing to the same Folder entity.

To match folders with the same FolderName attribute under the same ParentFolder attribute, you need two match rules:

  1. A rule for the root folders, matching based on FolderName for records whose parent publisher and parent source ID values (PublisherID_ParentFolder and SourceID_ParentFolder) are both null:

    	Record1.PublisherID_ParentFolder is null and Record1.SourceID_ParentFolder is null and
    	Record2.PublisherID_ParentFolder is null and Record2.SourceID_ParentFolder is null and
    	Record1.FolderName = Record2.FolderName

     This rule does not check whether FID_ParentFolder is null, because that would be true for all records during the initial load (the FID is the result of the consolidation). Checking PublisherID_ParentFolder and SourceID_ParentFolder is effective in all cases.

  2. A rule for child folders, matching based on FolderName for records under the same consolidated golden parent (with the same FID_ParentFolder):

    	Record1.FID_ParentFolder is not null and
    	Record2.FID_ParentFolder is not null and
    	Record1.FID_ParentFolder = Record2.FID_ParentFolder and
    	Record1.FolderName = Record2.FolderName

With this type of match rule, if the golden ID of a folder changes (e.g., due to a merge with another folder), a notification is stored for the child records of this folder. When the Folder entity is processed again, these child records will be automatically reprocessed and may be merged with their new siblings.

You can automate post-processing using the PARAM_CHILD_POSTPROCESSING_JOB job parameter. This parameter detects notifications and triggers a job to process the child records as needed.
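To make the logic of the two folder rules easier to follow, here is a minimal Python sketch of the same conditions. The Folder type, its field names, and the sample values are hypothetical; they only mirror the FolderName, PublisherID_ParentFolder, SourceID_ParentFolder, and FID_ParentFolder attributes used in the rules, with None standing in for null. The sketch does not model consolidation or notifications.

```python
from typing import NamedTuple, Optional


class Folder(NamedTuple):
    # Hypothetical stand-in for a Folder master record.
    folder_name: str
    publisher_id_parent: Optional[str]  # PublisherID_ParentFolder
    source_id_parent: Optional[str]     # SourceID_ParentFolder
    fid_parent: Optional[int]           # FID_ParentFolder (set by consolidation)


def is_root(f: Folder) -> bool:
    # A root folder has no parent reference at all in its source data.
    return f.publisher_id_parent is None and f.source_id_parent is None


def root_rule(r1: Folder, r2: Folder) -> bool:
    # Rule 1: both records are roots and share the same FolderName.
    return is_root(r1) and is_root(r2) and r1.folder_name == r2.folder_name


def child_rule(r1: Folder, r2: Folder) -> bool:
    # Rule 2: both records sit under the same golden parent and share the same FolderName.
    return (
        r1.fid_parent is not None
        and r2.fid_parent is not None
        and r1.fid_parent == r2.fid_parent
        and r1.folder_name == r2.folder_name
    )


root_a = Folder("Docs", None, None, None)
root_b = Folder("Docs", None, None, None)
child_a = Folder("Reports", "CRM", "10", 1)
child_b = Folder("Reports", "ERP", "77", 1)
child_c = Folder("Reports", "ERP", "78", 2)
unresolved = Folder("Reports", "CRM", "11", None)  # parent not consolidated yet

print(root_rule(root_a, root_b))        # True: same name, both roots
print(child_rule(child_a, child_b))     # True: same name, same golden parent
print(child_rule(child_a, child_c))     # False: different golden parents
print(child_rule(child_a, unresolved))  # False: FID_ParentFolder still null
print(root_rule(child_a, child_b))      # False: not root folders
```

The last two cases show why the rules are split: a child whose FID_ParentFolder is still null matches neither rule until its parent has been consolidated.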

Matching on child records

This capability is available only for fuzzy-matching entities. You must specify the child entity used for matching in the rule.
For example, if you have a Contact entity with a child EmailAddresses entity and you want to match contacts using their email addresses, then you must configure a match rule in the Contact matcher as follows:

  • Select Match on Child Records.

  • Set Child Records to EmailAddresses.

  • Define the Binning Expressions and Matching Condition properties using attributes of the EmailAddresses entity.

If any pair of child records matches according to the rule, the two parent records will be considered a match. For example, using the following expression in the match rule will match contacts if they have at least one matching email address:

Record1.Address = Record2.Address
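The "any pair of child records" semantics can be sketched in Python as follows. This is an illustration only: the dictionaries, the function names, and the sample addresses are hypothetical, and the sketch compares every pair of children directly, ignoring binning expressions.

```python
from itertools import product
from typing import Dict, List

EmailAddress = Dict[str, str]  # hypothetical child record, e.g. {"Address": "a@example.com"}


def child_condition(c1: EmailAddress, c2: EmailAddress) -> bool:
    # Record1.Address = Record2.Address, evaluated on a pair of child records.
    return c1["Address"] == c2["Address"]


def parents_match(children1: List[EmailAddress], children2: List[EmailAddress]) -> bool:
    # Two parents match as soon as at least one pair of their children matches.
    return any(child_condition(a, b) for a, b in product(children1, children2))


contact_1 = [{"Address": "jane@example.com"}, {"Address": "j.doe@example.org"}]
contact_2 = [{"Address": "jdoe@example.net"}, {"Address": "j.doe@example.org"}]
contact_3 = [{"Address": "someone.else@example.com"}]

print(parents_match(contact_1, contact_2))  # True: one shared address
print(parents_match(contact_1, contact_3))  # False: no shared address
print(parents_match(contact_1, []))         # False: no child records to compare
```

Note that a contact without any child records cannot match through such a rule, because there is no pair of children to evaluate.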

Match rules for applications

Match rules designed to match master records during the certification process are also used to identify existing matches when a user creates a new golden record from an application.

If a match rule includes an attribute that is available only for master records and not for golden records (such as PublisherID or SourceID), then the rule will be ignored when detecting duplicates during golden record creation in applications.