Publish data using the REST API
The REST API facilitates the publication of data changes and deletions to a data hub.
The REST API lets you manage loads: query, create, submit, and cancel them. It also supports persisting records into existing loads, and offers a streamlined option to load and submit data in a single request.
Query existing loads
Method |
|
---|---|
Base URL |
|
URL |
|
Supported parameters |
|
Response format |
The response includes either a load count or a list of loads that meet the specified criteria. The details returned for each load vary based on its status. Query existing loads: sample response
|
Load type
loadType indicates the nature of a load:
- An external load (EXTERNAL_LOAD), which can be submitted (submittable=true).
- A continuous load (CONTINUOUS_LOAD), which cannot be manually submitted but is automatically submitted every submitInterval seconds.
- A load attached to application activities, which cannot be submitted:
  - WORKFLOW_SUBMIT corresponds to a submit action performed by a workflow.
  - LEGACY_WORKFLOW corresponds to a submit action performed by a legacy workflow. It replaces the WORKFLOW load type, which is deprecated.
  - DIRECT_AUTHORING corresponds to a submit action performed by a direct authoring action.
  - DIRECT_DELETE corresponds to a submit action performed by a direct delete action.
  - DIRECT_DUPS_CONFIRM corresponds to a submit action performed by a direct duplicate confirmation action.
  - DIRECT_DUPS_MANAGEMENT corresponds to a submit action performed by a direct duplicate manager action.
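The load type rules above can be sketched as a small client-side helper. This is a minimal sketch, not part of the REST API: the function name and the assumption that a load is available as a dictionary with a loadType key are illustrative. Only the load type names come from the list above.

```python
# Load types described above as attached to application activities.
ACTIVITY_LOAD_TYPES = {
    "WORKFLOW_SUBMIT",
    "LEGACY_WORKFLOW",
    "WORKFLOW",  # deprecated, replaced by LEGACY_WORKFLOW
    "DIRECT_AUTHORING",
    "DIRECT_DELETE",
    "DIRECT_DUPS_CONFIRM",
    "DIRECT_DUPS_MANAGEMENT",
}


def describe_submission(load: dict) -> str:
    """Return how a load gets submitted, based on its loadType.

    Hypothetical helper: assumes the load is a dict parsed from a response.
    """
    load_type = load.get("loadType")
    if load_type == "EXTERNAL_LOAD":
        return "manual"  # can be submitted through the API
    if load_type == "CONTINUOUS_LOAD":
        return "automatic"  # submitted every submitInterval seconds
    if load_type in ACTIVITY_LOAD_TYPES:
        return "application"  # submitted by an application activity
    return "unknown"
```

A client could use such a helper to skip loads it cannot submit itself before calling the submit endpoint.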
Load status
The table below enumerates the potential load statuses.
Load status | Description |
---|---|
CANCELED | The load has been canceled ( |
DONE | The job completed successfully with no validation errors. |
ERROR | The job did not complete successfully; it was canceled by an administrator. |
PENDING | The load has been submitted. A batch was created and is waiting for the batch poller to pick it up. |
PROCESSING | The batch’s job is currently being processed by the engine. |
RUNNING | The load is currently running. |
SCHEDULED | The batch has been taken into account by the batch poller. The job is queued by the engine. |
STOPPED | The job has been canceled. |
SUSPENDED | The job is suspended, either by an administrator or due to an error. Administrator intervention is required. |
WARNING | The job completed successfully, but some records have caused validation errors. |
Integration job
If a job is attached to a load that was submitted, then the integrationJob object provides details about this job, including its start date, current task, duration, and any error that may occur during its execution.
It also includes notificationStatus, which indicates whether notifications were successfully sent.
currentTask corresponds to the ongoing task for RUNNING jobs, and the last executed task for KILLED or SUSPENDED jobs.
|
Use the API to monitor jobs that are SUSPENDED or in ERROR status to report possible integration issues. Combine this endpoint with the capability to manage loads to automate job restarts.
|
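The monitoring tip above can be sketched as a filter over the list of loads returned by the query endpoint. This is a minimal sketch under stated assumptions: the response field names are not shown in this section, so the status key (loadStatus) and the loadId key used in the usage example are hypothetical and should be replaced with the names found in an actual response. No HTTP call is made here.

```python
# Statuses that the tip above suggests monitoring.
ATTENTION_STATUSES = {"SUSPENDED", "ERROR"}


def loads_needing_attention(loads, status_key="loadStatus"):
    """Return the loads whose status is SUSPENDED or ERROR.

    `loads` is a list of dicts parsed from a query response.
    `status_key` is an assumed field name, not confirmed by this section.
    """
    return [load for load in loads if load.get(status_key) in ATTENTION_STATUSES]


# Usage with hand-written sample data (field names are hypothetical).
sample_loads = [
    {"loadId": 101, "loadStatus": "DONE"},
    {"loadId": 102, "loadStatus": "SUSPENDED"},
    {"loadId": 103, "loadStatus": "ERROR"},
    {"loadId": 104, "loadStatus": "WARNING"},
]
flagged = loads_needing_attention(sample_loads)
flagged_ids = [load["loadId"] for load in flagged]
```

The flagged loads could then be reported to an administrator or passed to the endpoint that manages loads.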
Query a load
Method |
|
---|---|
Base URL |
|
URL |
|
Response format |
The response includes the load identified by Query one load: sample response
|
Initialize a load
Method |
|
---|---|
Base URL |
|
URL |
|
Request payload |
The request contains the Create a load: sample request
|
Response format |
The response contains the load information, including the load ID, load type, and an indication of the status. Create a load: sample response
|
Load data
To load data into a specific load, the URL must include the load ID that was returned during the creation of the load, or the ID or name of a continuous load.
Data is loaded into GD tables for basic entities, and into MD tables for ID- and fuzzy-matched entities.
|
Using the REST API for bulk data loads is not recommended due to its inherent limitations in handling large volumes of data. The REST API is designed for fast processing of a few records, making it ideal for web services and other tasks requiring quick, responsive data handling. |
Method |
|
||||
---|---|---|---|---|---|
Base URL |
|
||||
URL |
|
||||
Request payload |
The request includes the
Load data: sample request
|
||||
Response format |
The response includes, in the Submit a load: sample response
|
Configure data loads
Persist options
When loading one or more records, the following can be configured in the persistOptions element:
- For each entity:
  - enrichers: defines the enrichers that should be executed before persisting the records. By default, no enricher is executed. Possible values are:
    - JOB_PRE_CONSO: runs all the enrichers configured with a pre-consolidation only or pre- and post-consolidation scope.
    - ALL: runs all enrichers defined for the entity (even those whose enrichment scope is set to None).
    - A list of enricher names in the following format: [ "<enricher_name>", ... ]
    Similar to how step enrichers function in steppers, the enrichers specified in this section of the request payload are executed prior to the certification process. Consequently, enrichers with a pre- or post-consolidation scope will run twice. We recommend configuring them to produce consistent results after each execution.
  - validations: defines the validations that should be executed after the enrichers. By default, no validation is executed. Possible values are:
    - JOB_PRE_CONSO: runs all validations configured with a pre-consolidation only or pre- and post-consolidation scope.
    - ALL: runs all the validations defined for the entity (even those whose validation scope is set to None).
    - A list of validations with their name and type in the following format: [ { "validationType": "<validation_type>", "validationName": "<Validation_name>" }, ... ]
      Possible validationType values are CHECK, PLUGIN, MANDATORY, LOV, FOREIGN, or UNIQUE.
  - queryPotentialMatchesRules: defines the match rules to use to detect potential matches. By default, no match detection is performed. Possible values are:
    - ALL: runs all match rules defined for the entity.
    - A list of match rule names, in the following format: [ "<match_rule_name>", ... ]
  - queryPotentialMatchesHighestScoreOnly: when set to true, only the match found with the highest match score is returned in the response. Otherwise, all matches found are returned.
    The queryPotentialMatches parameter is deprecated and is replaced with queryPotentialMatchesRules and queryPotentialMatchesHighestScoreOnly.
  - queryPotentialMatchesBaseExpressions: defines the set of base attributes to include in the response for the potential matches found. Possible values are:
    - NONE: no attributes.
    - USER_ATTRS (default): all entity attributes, except the built-in attributes and references.
    - VIEW_ATTRS: all entity attributes, except references, but including the built-in attributes.
    - ID: only the identifier attributes. Depending on the entity type and the matched record’s location, returns the publisher ID, source ID, golden-record ID, and/or primary key.
  - queryPotentialMatchesExpressions: expressions to include in the response for the potential matches found, in addition to the base attributes (queryPotentialMatchesBaseExpressions). These expressions are in the following format: [alias]:[semql_expression].
  - responsePayloadRecordsBaseExpressions: defines the set of base attributes to include in the response for the persisted records. Possible values are:
    - NONE: no attributes.
    - USER_ATTRS (default): all entity attributes, except the built-in attributes and references.
    - VIEW_ATTRS: all entity attributes, except references, but including the built-in attributes.
    - ID: only the identifier attributes. Depending on the entity type, PublisherID, SourceID, and/or the primary key are returned.
  - responsePayloadRecordsExpressions: expressions to include in the response for the persisted records, in addition to the base attributes (responsePayloadRecordsBaseExpressions). These expressions are in the following format: [alias]:[semql_expression].
    The SemQL view within which queryPotentialMatchesExpressions and responsePayloadRecordsExpressions are executed determines the scope of data these expressions can interact with and retrieve.
    - When loading data through the REST API, queryPotentialMatchesExpressions searches both the MD and SD tables, while responsePayloadRecordsExpressions executes either within the SD or SA views, depending on the entity type.
    - When using the REST API to certify a single record, queryPotentialMatchesExpressions exclusively looks for potential matches within the MD table.
- For the entire load:
  - missingIdBehavior: option to define whether to generate IDs when they are not provided in the payload. Possible values are GENERATE to generate the ID or FAIL to halt loading if the ID is missing.
  - persistMode: defines whether the records should be persisted or not. Possible values are:
    - IF_NO_ERROR_OR_MATCH (default): persists a record if no validation error was raised and no potential match was found.
    - ALWAYS: always persists a record.
    - NEVER: never persists a record.
  - responsePayload: specifies the content of the response payload. Possible values are:
    - RECORDS (default): details of the persisted records.
    - SUMMARY: count of persisted records.
    - SUMMARY_AND_RECORDS: count and details of the persisted records.
Large response payloads can slow down response times and consume significant memory when the REST API is used for data integration. To avoid this, set responsePayload to SUMMARY, which limits the response to a count of persisted records. This is especially useful when record-level details are not immediately needed.
|
The REST API automatically normalizes string values, regardless of the PARAM_NORMALIZE_STRING setting in the integration job. This automatic normalization ensures that all string values are uniformly handled during data publication.
|
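The load-level options described above can be assembled and checked client-side before sending a request. The following is a minimal sketch, not an official client: the function name is illustrative, and placing these keys directly under persistOptions follows the list above and the sample request in this page. Default values are taken from the list; missingIdBehavior is left out unless provided, since its default is not stated here.

```python
PERSIST_MODES = {"IF_NO_ERROR_OR_MATCH", "ALWAYS", "NEVER"}
RESPONSE_PAYLOADS = {"RECORDS", "SUMMARY", "SUMMARY_AND_RECORDS"}
MISSING_ID_BEHAVIORS = {"GENERATE", "FAIL"}


def build_persist_options(options_per_entity,
                          persist_mode="IF_NO_ERROR_OR_MATCH",
                          response_payload="RECORDS",
                          missing_id_behavior=None):
    """Build a persistOptions dict, rejecting values not listed in this page."""
    if persist_mode not in PERSIST_MODES:
        raise ValueError(f"unsupported persistMode: {persist_mode}")
    if response_payload not in RESPONSE_PAYLOADS:
        raise ValueError(f"unsupported responsePayload: {response_payload}")
    options = {
        "optionsPerEntity": dict(options_per_entity),
        "persistMode": persist_mode,
        "responsePayload": response_payload,
    }
    if missing_id_behavior is not None:
        if missing_id_behavior not in MISSING_ID_BEHAVIORS:
            raise ValueError(f"unsupported missingIdBehavior: {missing_id_behavior}")
        options["missingIdBehavior"] = missing_id_behavior
    return options


# Usage: request only a summary to keep the response small.
summary_options = build_persist_options(
    {"Person": {"enrichers": ["CleanseEmail"]}},
    persist_mode="ALWAYS",
    response_payload="SUMMARY",
    missing_id_behavior="FAIL",
)
```

Validating the values locally gives a clear error before the request is sent, which can be easier to diagnose than a rejected payload.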
Update existing records
The REST API allows updating golden records for basic entities and master records for ID- and fuzzy-matched entities.
Single update
An existing record is updated by providing its ID during the data loading process:
-
For basic entities, provide the ID attribute of the record to be updated.
-
For ID- and fuzzy-matched entities, provide the source ID and publisher ID of the master record to be updated.
If a record is persisted in a load with an existing record ID, a copy of the record is checked out. Changes are then applied only to the fields specified in the request body, while other attributes retain their current values.
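As an illustration of a single update for an ID- or fuzzy-matched entity, the sketch below builds a payload that identifies the master record and carries only the changed attributes. It is a sketch under assumptions: the payload shape mirrors the PERSIST_DATA sample request shown later in this page (publisher passed through defaultPublisherId, source ID in the record), the function name is hypothetical, and which persistOptions are mandatory is not covered here.

```python
def build_single_update_payload(entity, source_id, publisher_id, changes):
    """Build a PERSIST_DATA payload updating one master record.

    Only the attributes in `changes` are sent; per the text above, other
    attributes of the existing record retain their current values.
    """
    if not changes:
        raise ValueError("changes must contain at least one attribute")
    if "SourceID" in changes:
        raise ValueError("SourceID identifies the record and cannot be changed here")
    record = {"SourceID": source_id}
    record.update(changes)
    return {
        "action": "PERSIST_DATA",
        "persistOptions": {"defaultPublisherId": publisher_id},
        "persistRecords": {entity: [record]},
    }


# Usage: change only the first name of an existing Person master record.
update_payload = build_single_update_payload(
    "Person", "1320830", "CRM", {"FirstName": "Jass"}
)
```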
Mass-update
With the MASS_UPDATE_DATA action, the endpoint can be used to update multiple records within the same payload.
Method |
|
||
---|---|---|---|
Base URL |
|
||
URL |
|
||
Request payload |
The request includes the
For each entity, the following options can be set:
Mass-update data: sample request
|
||
Response format |
The response includes the status of the request, the load information, and a summary or the list of all the records updated as part of this request. Mass-update data: sample response
|
Use the restApiMassUpdateFetchBatchSize system property to change the fetch batch size when mass updating records. The default value is 1,000.
|
Set the auditing fields
By default, the auditing fields (i.e., Creator, Updator, CreateDate, and UpdateDate) are automatically set to the current username and date when a user publishes data.
However, users with specific privileges may manually set these values, enabling actions like backdating or publishing data on behalf of other users.
Users whose role has the Allow publishing as user in API option enabled in a model privilege grant can set the auditing fields when publishing data via the REST API.
Enrich, validate, and detect matches
When loading or updating data, enrichers, validations, and matchers can be executed for each entity.
Enrich and validate
For each entity to configure, one element, named after the entity, can be defined under the optionsPerEntity element.
For each entity, it is possible to:
- Specify a list of enrichers to run, with their enricher names.
- Specify a list of validations to execute, with their validationType and validationName.
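The per-entity element described above can be built with a small helper that checks validation types against the values listed under Persist options. This is a minimal sketch: the function name is illustrative, passing JOB_PRE_CONSO or ALL as a plain string is an assumption about how those values are serialized, and the validation name in the usage example is hypothetical.

```python
SCOPE_KEYWORDS = {"JOB_PRE_CONSO", "ALL"}
VALIDATION_TYPES = {"CHECK", "PLUGIN", "MANDATORY", "LOV", "FOREIGN", "UNIQUE"}


def build_entity_options(enrichers=None, validations=None):
    """Build the options object for one entity under optionsPerEntity.

    `enrichers` is a keyword string or a list of enricher names.
    `validations` is a keyword string or a list of (type, name) pairs.
    """
    options = {}
    if enrichers is not None:
        if isinstance(enrichers, str):
            if enrichers not in SCOPE_KEYWORDS:
                raise ValueError(f"unsupported enrichers keyword: {enrichers}")
            options["enrichers"] = enrichers
        else:
            options["enrichers"] = list(enrichers)
    if validations is not None:
        if isinstance(validations, str):
            if validations not in SCOPE_KEYWORDS:
                raise ValueError(f"unsupported validations keyword: {validations}")
            options["validations"] = validations
        else:
            checked = []
            for validation_type, validation_name in validations:
                if validation_type not in VALIDATION_TYPES:
                    raise ValueError(f"unsupported validationType: {validation_type}")
                checked.append({"validationType": validation_type,
                                "validationName": validation_name})
            options["validations"] = checked
    return options


# Usage: one enricher and one mandatory-attribute validation for Person.
options_per_entity = {
    "Person": build_entity_options(
        enrichers=["CleanseEmail"],
        validations=[("MANDATORY", "FirstName")],  # hypothetical validation name
    )
}
```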
Detect matches
When loading data, queryPotentialMatchesRules can be used to specify whether the platform should check for duplicates according to the matching rules defined for the entity.
When checking for duplicates, the response includes master records that potentially match an incoming record. This helps identify the reasons for the matches.
The attributes returned can be defined to only include those required for a specific use case, with two additional properties:
- queryPotentialMatchesBaseExpressions defines the set of base attributes to include in the master records detected as potential matches:
  - NONE: no attributes.
  - USER_ATTRS: all attributes, except built-in attributes and references.
  - VIEW_ATTRS: all attributes, except references, but including built-in attributes.
  - ID: only the identifier attributes. Depending on the entity type and the matched record’s location, returns the publisher ID, source ID, golden-record ID, and/or primary key.
- queryPotentialMatchesExpressions defines a list of expressions to return in addition to the set of base attributes. These expressions are in the following format: <alias>:<semql_expression>.
For example, the following request looks for potential matches.
{
  "action": "PERSIST_DATA",
  "persistOptions": {
    "defaultPublisherId": "CRM",
    "optionsPerEntity": {
      "Person": {
        "queryPotentialMatchesRules": ["SameExactEmailMatchName"], (1)
        "enrichers": ["CleanseEmail"],
        "validations": [],
        "queryPotentialMatchesBaseExpressions": "NONE", (2)
        "queryPotentialMatchesExpressions": { (2)
          "Name": "Concat(FirstName, ' ', LastName)",
          "Email": "CleansedEmail",
          "Golden ID": "Gold_ID",
          "Master ID": "ID"
        }
      }
    },
    "missingIdBehavior": "FAIL",
    "persistMode": "NEVER" (3)
  },
  "persistRecords": {
    "Person": [
      {
        "SourceID": "99998",
        "FirstName": "John",
        "LastName": "Doe",
        "DateOfBirth": "1974-01-25",
        "SourceEmail": "jass@ellerbusch.com"
      }
    ]
  }
}
1 | Trigger potential match detection for records using one match rule. Note that, since the matcher uses enriched values, the CleanseEmail enricher is also triggered. |
2 | Select the information returned for the potential matches. Since queryPotentialMatchesBaseExpressions is set to NONE, only the expressions defined in queryPotentialMatchesExpressions are returned. |
3 | This value for persistMode never persists records. This call only finds potential matches. |
The response to this request is as follows:
{
  "status": "PERSIST_CANCELLED",
  "load": {
    ...
  },
  "records": {
    "Person": [
      {
        "entityName": "Person",
        "recordValues": {
          "CleansedEmail": "jass@ellerbusch.com",
          "SourceEmail": "jass@ellerbusch.com",
          "DateOfBirth": "1974-01-25",
          "FirstName": "John",
          "SourceID": "99998",
          "PublisherID": "CRM",
          "LastName": "Doe",
          ...
        },
        "failedValidations": [],
        "potentialMatches": [
          {
            "matchRuleName": "ExactEmailMatch",
            "matchScore": 74,
            "matchedRecordLocation": "MD",
            "matchedRecordId": {
              "SourceID": "1320830",
              "PublisherID": "CRM"
            },
            "matchedRecordData": {
              "Name": "Jass Ellerbusch",
              "Email": "jass@ellerbusch.com",
              "Golden ID": 10002,
              "Master ID": "CRM.1320830"
            }
          }
        ]
      }
    ]
  }
}
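A client typically needs to pull the potential matches out of a response like the one above. The following is a minimal sketch of such a parser, assuming the response has already been decoded into a Python dictionary with the structure shown in the sample. The function name and the shape of the returned summary are illustrative choices, not part of the API.

```python
def summarize_potential_matches(response: dict) -> list:
    """Flatten the potential matches of a persist response, best score first."""
    summary = []
    for entity_name, records in response.get("records", {}).items():
        for record in records:
            for match in record.get("potentialMatches", []):
                summary.append({
                    "entity": entity_name,
                    "rule": match.get("matchRuleName"),
                    "score": match.get("matchScore"),
                    "location": match.get("matchedRecordLocation"),
                    "matchedRecordId": match.get("matchedRecordId"),
                })
    # Treat a missing score as 0 so that sorting never compares None.
    return sorted(summary, key=lambda m: m["score"] or 0, reverse=True)


# Usage with a trimmed copy of the sample response above.
sample_response = {
    "status": "PERSIST_CANCELLED",
    "records": {
        "Person": [
            {
                "entityName": "Person",
                "failedValidations": [],
                "potentialMatches": [
                    {
                        "matchRuleName": "ExactEmailMatch",
                        "matchScore": 74,
                        "matchedRecordLocation": "MD",
                        "matchedRecordId": {"SourceID": "1320830", "PublisherID": "CRM"},
                    }
                ],
            }
        ]
    },
}
matches = summarize_potential_matches(sample_response)
```

Sorting by score puts the strongest candidate first, which is convenient when a user interface shows only the top match.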
Data loading behavior
When invoked with a payload, the REST operation runs enrichers, validations, and matchers for each record, depending on the entity configuration.
It then returns:
-
The enriched data.
-
A list of validation errors (if any).
-
A list of potential matches detected by the matching rules.
Records may or may not be persisted at this stage, depending on the persistMode option:
- If set to ALWAYS, records are persisted even with errors or potential matches.
- If set to NEVER, records are not persisted. Use this option to perform a dry run to test your records.
- If set to IF_NO_ERROR_OR_MATCH (default), records are persisted only if no validation error or potential match occurs.
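The three persistMode behaviors above amount to a simple decision rule. The sketch below models that rule client-side, for example to predict the outcome of a request in tests. It is an illustration of the documented behavior, not the platform's implementation, and the function name is hypothetical.

```python
def would_persist(persist_mode: str,
                  has_validation_errors: bool,
                  has_potential_matches: bool) -> bool:
    """Predict whether a record is persisted for a given persistMode."""
    if persist_mode == "ALWAYS":
        return True
    if persist_mode == "NEVER":
        return False  # dry run: nothing is persisted
    if persist_mode == "IF_NO_ERROR_OR_MATCH":
        return not (has_validation_errors or has_potential_matches)
    raise ValueError(f"unsupported persistMode: {persist_mode}")
```

For instance, with the default mode, a record that raises no validation error but has one potential match is not persisted.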
Publish deletions
The endpoint can be used to load data with the DELETE_DATA action and publish record deletions.
Method |
|
||||||
---|---|---|---|---|---|---|---|
Base URL |
|
||||||
URL |
|
||||||
Request payload |
The request includes the
Submitting information other than these IDs in this property is considered an error. Delete data: sample request to delete a golden record
|
||||||
Response format |
The response includes the request status, load details, and a list of all the records deleted in this request (including child records deleted through cascading actions). Delete data: sample response
The record deletion status indicates whether the record can be deleted or not. Possible statuses are:
|
Submit a load
To submit a load, the URL must include the load ID that was returned when the load was created.
Method |
|
---|---|
Base URL |
|
URL |
|
Request payload |
The request includes the Submit a load: sample request
|
Response format |
The response includes the load’s details, including the load ID, batch ID, and an indication of its status. Submit a load: sample response
|
Cancel a load
To cancel a load, the URL must include the load ID that was returned when the load was created.
Method |
|
---|---|
Base URL |
|
URL |
|
Request payload |
The request includes only the Cancel a load: sample request
|
Response format |
The response includes the load ID as well as an indication of the status. Cancel a load: sample response
|
Load and submit data
Using the REST API, it is possible to create a load, load data (or request deletions), and submit the load in a single request.
Method |
|
||||||
---|---|---|---|---|---|---|---|
Base URL |
|
||||||
URL |
|
||||||
Request payload |
The request includes the Load and submit data: sample request
|
||||||
Response format |
The response includes the load ID, as well as an indication of the status. Load and submit data: sample response
|
Manage a load
When a load has been submitted, it is still possible to manage it using the REST API.
Method |
|
---|---|
Base URL |
|
URL |
|
Request payload |
The request includes the Manage a load: sample request
|
Response format |
The response includes the load’s details and its new state. If the requested operation is not possible, an error is returned. |
Combine this endpoint with the capability to query loads for automating production monitoring (e.g., to automatically resume suspended jobs). |