Semarchy Fuzzy lookup classification enricher
The Semarchy Fuzzy lookup classification enricher automatically classifies records using SemQL-based fuzzy lookup rules.
Plugin ID
Semarchy Fuzzy Lookup Classification Enricher - com.semarchy.engine.plugins.fuzzy.lookup.classification
Description
The Fuzzy lookup classification enricher automatically classifies records by applying SemQL-based fuzzy lookup rules. It compares child records with parent records, evaluates attribute similarity, and generates a lookup score to identify the most suitable reference.
This enricher serves as an alternative to AI-based classification methods, offering a rule-based approach for categorizing data.
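The comparison described above can be illustrated with a minimal sketch. It is not the enricher's implementation: the actual scoring is defined by the SemQL fuzzy lookup rule, whereas this example substitutes Python's `difflib.SequenceMatcher` as a stand-in similarity measure. The function names, the `id` and `label` attributes, and the sample records are all hypothetical.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a 0-100 similarity score between two strings, ignoring case."""
    return 100.0 * SequenceMatcher(None, a.lower(), b.lower()).ratio()


def best_match(child: dict, parents: list, attributes: list) -> tuple:
    """Return (parent_id, score) for the parent with the highest
    average similarity across the compared attributes."""
    best_id, best_score = None, -1.0
    for parent in parents:
        # Average the per-attribute similarity (an assumed scoring scheme).
        score = sum(similarity(child[a], parent[a]) for a in attributes) / len(attributes)
        if score > best_score:
            best_id, best_score = parent["id"], score
    return best_id, best_score


# Hypothetical parent (reference) records and one child record to classify.
categories = [
    {"id": "CAT1", "label": "Laptop computers"},
    {"id": "CAT2", "label": "Office chairs"},
]
product = {"label": "laptop computer"}

match_id, score = best_match(product, categories, ["label"])
```

In this sketch, `match_id` and `score` play the roles of the Best Match ID and Lookup Score outputs; the real rule may weight attributes or compute similarity differently.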
Plugin parameters
The following table lists the plugin parameters.
| Parameter name | Mandatory | Type | Description |
|---|---|---|---|
| Creation Threshold | Yes | Integer | The score threshold above which the reference ID is automatically assigned after a lookup (range: 0-100). |
| Entity Name | Yes | String | The name of the entity to which the fuzzy lookup rule applies. |
| Fuzzy Lookup Rule | Yes | String | The fuzzy lookup rule to apply. |
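As a rough illustration of how the Creation Threshold parameter might gate the assignment, the sketch below returns the reference ID only when the lookup score is above the threshold. The function name is hypothetical, and the strict comparison is an assumption: the description says "above which" but does not state whether a score equal to the threshold qualifies.

```python
def assign_reference(best_match_id, lookup_score, creation_threshold):
    """Return best_match_id when the score is above the threshold, else None."""
    if not 0 <= creation_threshold <= 100:
        raise ValueError("creation_threshold must be in the range 0-100")
    # Assumption: strictly greater than; the documentation does not say
    # whether a score equal to the threshold triggers the assignment.
    return best_match_id if lookup_score > creation_threshold else None
```

For example, with a threshold of 80, a best match scored at 95 would be assigned, while one scored at 60 would leave the reference empty.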
Plugin inputs
The following table lists the plugin inputs.
| Input name | Mandatory | Type | Description |
|---|---|---|---|
| Data Location (System-Managed)* | Yes | String | The name of the data location containing the child and parent records to compare. For internal use only; no configuration required. |
| Load ID (System-Managed)* | Yes | String | The identifier of a specific load of records to compare. For internal use only; no configuration required. |
| Username (System-Managed)* | Yes | String | The name of the connected user. For internal use only; no configuration required. |
| User roles (System-Managed)* | Yes | String | The list of roles for the connected user. For internal use only; no configuration required. |
| View Type (System-Managed)* | Yes | String | The type of view for the attributes to evaluate against the reference record. For internal use only; no configuration required. |
| Record ID | No | String | The unique identifier of the reference record, applicable to basic entities and fuzzy-matching entities with enrichment scopes set to pre- and post-consolidation, post-consolidation, or none. |
| Publisher ID** | No | String | The identifier of the source publisher system for the reference record, applicable to fuzzy-matching entities with enrichment scope set to pre-consolidation. Use in conjunction with Source ID. |
| Source ID** | No | String | The identifier of the reference record in the source publisher system, applicable to fuzzy-matching entities with enrichment scope set to pre-consolidation. Use in conjunction with Publisher ID. |
* These inputs are set automatically to specific SemQL variables when the enricher runs and do not require configuration. Any modifications made to them are ignored.
** Publisher ID and Source ID must be used together.
Note: Since the enricher currently supports only basic entities, the Source ID and Publisher ID inputs are not applicable.
Plugin outputs
The following table lists the plugin outputs.
| Output name | Type | Description | 
|---|---|---|
| Best Match ID (NUMBER) | Number | The ID of the referenced entity with the highest lookup score, represented as a number. | 
| Best Match ID (STRING) | String | The ID of the referenced entity with the highest lookup score, represented as a string. | 
| Best Match ID (UUID) | UUID | The ID of the referenced entity with the highest lookup score, represented as a UUID. | 
| Lookup Score | Number | A numerical value that represents the degree of similarity between a record and its reference, according to the specified fuzzy lookup rule. | 
| Fuzzy Lookup Rule Name | String | The name of the fuzzy lookup rule applied to identify the most suitable reference. | 
Examples and use cases
Relevant use cases for fuzzy lookup classification may include:
- Product categorization in e-commerce: classify products in a consolidated catalog when descriptions from multiple suppliers differ in formatting, terminology, and structure, which may make standardization challenging.
- Patient record matching in healthcare: link patient records from multiple clinics to a master index by evaluating similarities in names, birth dates, and addresses, even when information includes misspellings, address changes, or incomplete details.
- Employee record validation in HR systems: consolidate employee records from regional systems into a global directory while resolving discrepancies in names, IDs, or contact information that hinder accuracy.