GCS Copy Blobs

Description

Use this tool to copy blobs from one location to another within the same Google Cloud Storage.

This tool copies one or more blobs, within or across buckets, optionally renaming blobs as it copies. The tool can filter blobs with include rules, exclude rules, or metadata filters.

To use this tool, define a Google Cloud Storage source from which you want to copy blobs, and a target to which you want to copy blobs.

Usage

  1. Add the process action TOOL GCS Copy Blobs from the Process Palette, under the Tools section.

  2. Select a source and target:

    • Drag and drop one of the following Google Cloud Storage metadata nodes onto the <SOURCE> field of the tool:

      • Storage

      • Bucket

      • Folder

    • Drag and drop one of the following Google Cloud Storage metadata nodes onto the <TARGET> field of the tool:

      • Storage

      • Bucket

      • Folder

  3. Set other tool parameters as needed.

If you want to copy blobs within the same bucket, you do not have to define both a source and a target. Define one of the two, then use parameters to define which blobs to copy.

The tool inherits parameters from the metadata node you drag onto it.

Parameters

Name Default Description

XPath Expression For Source

$SOURCE

A valid XPath expression referencing a bucket to use as a source. The expression can return a storage, bucket, or folder node from a Google Cloud Storage metadata object.

You must set at least one of this parameter and XPath Expression For Target.

The tool takes the source location from the bucket or directory that this expression returns, unless you set one of these parameters instead:

  • Source Bucket Name

  • Source Directory Path

  • Source Blob Name

Source Bucket Name

Manual entry of the source bucket name.

You can omit this parameter if XPath Expression For Source returns a valid reference to a bucket or one of its children.

Source Directory Path

Manual entry of the source directory. You can omit this parameter if XPath Expression For Source returns a valid reference to a directory or one of its children, or if the bucket itself is the root directory.

For better performance, use a directory as the source, or set this parameter for any static subdirectories. For example, specify this:

  • Source Directory Path → tmp

  • Source Blob Includes → *.txt

instead of this:

  • Source Directory Path → <empty>

  • Source Blob Includes → tmp/*.txt

Source Blob Includes

A list of blobs to include in the operation, as a semicolon-separated list of blob masks. An empty value matches all blobs.

When the source is a directory, or if you set the Source Directory Path, the blob masks are evaluated relative to this directory.

The following wildcard characters are supported:

?

Matches one character in a segment of the blob’s path.

*

Matches zero or more characters in a segment of the blob’s path.

**

Matches zero or more segments of the blob’s path.

Examples:

  • to retrieve XML and JSON blobs in the current directory: *.xml;*.json

  • to retrieve XML blobs in any test subdirectory: **/test/*.xml

When this parameter is set, the tool ignores the Source Blob Name parameter.
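The wildcard rules above can be approximated in a few lines of Python. This is only an illustrative sketch: the helper names mask_to_regex and matches_any are hypothetical, and the tool's actual matching code is not described in this documentation. It assumes that path segments are separated by "/" and that a mask must match the whole path relative to the source directory.

```python
import re

def mask_to_regex(mask):
    """Translate one blob mask into a compiled regular expression.

    Assumed semantics (taken from the wildcard descriptions, not from
    the tool's source code):
      ?   one character inside a path segment
      *   zero or more characters inside a path segment
      **  zero or more whole path segments
    """
    parts = []
    i = 0
    while i < len(mask):
        if mask.startswith("**/", i):
            parts.append("(?:[^/]+/)*")   # zero or more "segment/" groups
            i += 3
        elif mask.startswith("**", i):
            parts.append(".*")            # trailing **: anything, including "/"
            i += 2
        elif mask[i] == "*":
            parts.append("[^/]*")         # stays inside one segment
            i += 1
        elif mask[i] == "?":
            parts.append("[^/]")          # exactly one non-separator character
            i += 1
        else:
            parts.append(re.escape(mask[i]))
            i += 1
    return re.compile("".join(parts))

def matches_any(path, masks):
    """Return True if path matches at least one mask of a semicolon-separated list."""
    if not masks.strip():
        return True                       # an empty include value matches all blobs
    return any(
        mask_to_regex(m.strip()).fullmatch(path)
        for m in masks.split(";")
        if m.strip()
    )
```

With these assumptions, matches_any("report.xml", "*.xml;*.json") is true, while matches_any("sub/report.xml", "*.xml;*.json") is false because * does not cross a segment boundary.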

Source Blob Excludes

A list of blobs to exclude from the operation, as a semicolon-separated list of blob masks. An empty value excludes no blobs.

When the source is a directory, or if you set the Source Directory Path, the blob masks are evaluated relative to this directory.

The following wildcard characters are supported:

?

Matches one character in a segment of the blob’s path.

*

Matches zero or more characters in a segment of the blob’s path.

**

Matches zero or more segments of the blob’s path.

Examples:

  • to ignore XML and JSON blobs in the current directory: *.xml;*.json

  • to ignore XML blobs in any test subdirectory: **/test/*.xml

When this parameter is set, the tool ignores the Source Blob Name parameter.

Source Metadata

One or more key-value pairs to filter blobs based on their metadata in Google Cloud Storage. The tool only processes source blobs that match these values. You can set this parameter in the form of Java properties.

For instance:

metadata1=value1
metadata2=value2
#comment
metadata3=value3

For information about metadata in Google Cloud Storage, see the official documentation.
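To make the metadata filter more concrete, the following Python sketch parses a value written in the properties form shown above and checks a blob's metadata against it. The function names are hypothetical, and the parser is deliberately simplified: it handles only key=value lines and # comments, not the full Java properties syntax. It also assumes that a blob must match every listed pair exactly, which this documentation does not state explicitly.

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props

def metadata_matches(blob_metadata, required):
    """Assumption: a blob qualifies only if every required pair matches exactly."""
    return all(blob_metadata.get(key) == value for key, value in required.items())

# Example filter value, in the same form as the parameter
FILTER_TEXT = """metadata1=value1
metadata2=value2
#comment
metadata3=value3
"""

required = parse_properties(FILTER_TEXT)
```

A blob whose metadata contains all three pairs, possibly along with other keys, would pass this check. A blob missing metadata3, or carrying a different value for it, would not.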

Source Blob Name

Full path of a blob to use as the source. Use this parameter when you want to perform an operation on a single blob.

When this parameter is set, the tool ignores the Source Directory Path, Source Blob Includes, Source Blob Excludes, and Source Metadata parameters.

XPath Expression For Target

$TARGET

A valid XPath expression referencing a bucket to use as a target. The expression can return a storage, bucket, or folder node from a Google Cloud Storage metadata object.

You must set at least one of this parameter and XPath Expression For Source.

The tool takes the target location from the bucket or directory that this expression returns, unless you set one of these parameters instead:

  • Target Bucket Name

  • Target Directory Path

  • Target Blob Name

Target Bucket Name

Manual entry of the target bucket name.

You can omit this parameter if XPath Expression For Target returns a valid reference to a bucket or one of its children.

Target Directory Path

Manual entry of the target directory.

You can omit this parameter if XPath Expression For Target returns a valid reference to a directory or one of its children, or if the bucket itself is the root directory.

Target Blob Name

Full path of a blob to use as the target. Use this parameter when you want to work with a single blob.

This parameter only takes effect if you also set the Source Blob Name parameter.
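Two of the constraints described in this table can be checked before running a process: at least one of the two XPath expressions must be set, and Target Blob Name depends on Source Blob Name. The Python sketch below illustrates such a check. The check_parameters helper and the dictionary representation of the parameters are hypothetical. The tool performs its own validation, and this documentation does not describe how it reports problems.

```python
def check_parameters(params):
    """Return a list of problems found in a dict of tool parameter values.

    Keys are the parameter names from this page; missing or empty
    values are treated as "not set".
    """
    problems = []

    source_xpath = params.get("XPath Expression For Source")
    target_xpath = params.get("XPath Expression For Target")
    if not source_xpath and not target_xpath:
        problems.append(
            "Set XPath Expression For Source, XPath Expression For Target, or both."
        )

    if params.get("Target Blob Name") and not params.get("Source Blob Name"):
        problems.append("Target Blob Name requires Source Blob Name.")

    return problems
```

For example, a configuration that sets only XPath Expression For Source and Target Blob Name would produce one problem, because the single-blob target has no matching single-blob source.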

Existing Blobs Behavior

overwrite

Controls the tool’s behavior if target blobs already exist. Possible options are:

overwrite

The tool overwrites existing target blobs.

ignore

The tool leaves existing target blobs unchanged and raises no error.

throwError

The tool leaves existing target blobs unchanged and throws an error.
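The three options can be illustrated with a small Python model in which a dictionary stands in for the target bucket. This is a sketch of the documented behavior only. The copy_blob function is hypothetical, and the use of FileExistsError is an assumption, since this documentation does not specify which error the tool raises.

```python
def copy_blob(target, name, data, behavior="overwrite"):
    """Copy one blob into target (a dict of blob name -> content).

    Returns True if the blob was written, False if it was skipped.
    """
    if behavior not in ("overwrite", "ignore", "throwError"):
        raise ValueError("Unknown Existing Blobs Behavior: " + behavior)

    if name in target:
        if behavior == "ignore":
            return False                  # leave the existing blob, no error
        if behavior == "throwError":
            # Assumed error type, for illustration only
            raise FileExistsError("Target blob already exists: " + name)
        # behavior == "overwrite": fall through and replace the content

    target[name] = data
    return True
```

In this model, copying onto an existing name replaces its content under overwrite, returns False without touching it under ignore, and raises an exception without touching it under throwError.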