Sinks
Sinks are the destination of a harvesting process. This destination is typically Semarchy xDG, but you can also use a file or the console as the destination for harvesting.
xDG
This sink pushes metadata to Semarchy xDG using its harvesting API. You need a Personal Access Token to use the API. See Install and run the harvesting client to configure this token.
source:
  # source configuration
sink:
  type: "datahub-rest"
  config:
    server: "https://<your-tenant-name>.semarchy.net/api/xdg/v1/catalog"
    token: "<your-personal-access-token>"
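If you prefer not to store the token in the recipe file, one option is to render the recipe from a template at run time. The following is a minimal sketch of that idea, not part of the harvesting client: the function name `render_recipe` and the environment variables `XDG_TENANT` and `XDG_TOKEN` are hypothetical names chosen for illustration.

```python
import os
from string import Template

# Recipe text mirroring the xDG sink example, with placeholders
# for the tenant name and the personal access token.
RECIPE_TEMPLATE = Template("""\
source:
  # source configuration
sink:
  type: "datahub-rest"
  config:
    server: "https://${tenant}.semarchy.net/api/xdg/v1/catalog"
    token: "${token}"
""")


def render_recipe(tenant, token):
    """Return the recipe text with the tenant and token filled in."""
    if not tenant or not token:
        raise ValueError("Both tenant and token must be non-empty.")
    return RECIPE_TEMPLATE.substitute(tenant=tenant, token=token)


if __name__ == "__main__":
    # XDG_TENANT and XDG_TOKEN are hypothetical variable names.
    tenant = os.environ.get("XDG_TENANT", "")
    token = os.environ.get("XDG_TOKEN", "")
    if tenant and token:
        print(render_recipe(tenant, token))
    else:
        print("Set XDG_TENANT and XDG_TOKEN to render the recipe.")
```

You would then redirect the output to a recipe file and pass that file to the harvesting client as usual.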
Parameters
The following table lists the sink parameters.
Parameter | Mandatory | Description |
---|---|---|
server | Yes | URL of the Semarchy xDG site. |
token | Yes | Personal access token used for authentication. |
 | No | Timeout in seconds for the HTTP requests made to the API. Defaults to 30 seconds. |
 | No | Maximum number of retries for failed HTTP requests. The delay between requests increases exponentially. Defaults to 1. |
 | No | Also retry HTTP requests failing with these codes. Defaults to |
 | No | Experimental: Number of parallel threads for REST API calls. Defaults to 15. |
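To make the retry parameters more concrete, the sketch below shows how a retry limit, a set of retryable status codes, and an exponentially increasing delay can interact. It is an illustration only and does not reproduce the client's actual implementation: the function `call_with_retries`, the 1-second base delay, and the doubling factor are all assumptions.

```python
import time


def call_with_retries(send, max_retries=1, retry_codes=(),
                      base_delay=1.0, sleep=time.sleep):
    """Call send() until it returns a non-retryable status code
    or the retry budget is exhausted, then return the last status.

    send        -- callable returning an HTTP status code (int)
    max_retries -- retries allowed after the first attempt
    retry_codes -- status codes that trigger a retry
    base_delay  -- assumed delay before the first retry, in seconds
    sleep       -- injectable for testing; defaults to time.sleep
    """
    attempt = 0
    while True:
        status = send()
        if status not in retry_codes or attempt >= max_retries:
            return status
        # Assumed schedule: the delay doubles with each retry.
        sleep(base_delay * 2 ** attempt)
        attempt += 1
```

With these assumptions, a limit of 3 retries waits 1, 2, and then 4 seconds between attempts, while the default limit of 1 retries a failed request only once.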
File
This sink writes the metadata events generated by the harvesting process to a file. You can later load the generated file using the File source.
Using this sink, you can decouple metadata extraction from pushing this metadata to Semarchy xDG.
source:
  # source configuration
sink:
  type: "file"
  config:
    filename: ./path/file.json
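Before pushing a generated file to Semarchy xDG, it can be useful to check that it is well formed. The sketch below assumes that the file contains a single JSON array with one element per metadata event. This layout is an assumption, not something this page confirms, and `count_events` is a hypothetical helper name.

```python
import json
from pathlib import Path


def count_events(path):
    """Return the number of top-level entries in the generated file.

    Assumes the file holds one JSON array of metadata events.
    """
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    if not isinstance(data, list):
        raise ValueError("Expected a JSON array at the top level.")
    return len(data)


if __name__ == "__main__":
    target = Path("./path/file.json")
    if target.exists():
        print(f"{count_events(target)} events in {target}")
    else:
        print(f"{target} not found; run the harvesting process first.")
```

A file that fails to parse, or that contains no events, usually points to a problem in the source configuration rather than in the sink.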