Edge Delta Google Cloud Storage Output

Configure the Google Cloud Storage output node to archive data using GCS with options for compression and encoding.

Overview

The GCS destination node archives items to a Google Cloud Storage bucket. These items are raw archive bytes that have been buffered by the archive buffer processor.

incoming_data_types: log

This node requires Edge Delta agent version v0.1.59 or higher.

Configuring GCS

Before you can create an output, you must have a Google Cloud Storage HMAC access key for a service account that contains the Storage Admin HMAC role. See how to Prepare GCS.

Example Configuration

nodes:
  - name: my_gcs
    type: gcs_output
    bucket: <REDACTED>
    hmac_access_key: <REDACTED>
    hmac_secret: <REDACTED>
    compression: zstd
    encoding: parquet
    use_native_compression: true
    path_prefix:
      order:
      - Year
      - Month
      - Day
      - Hour
      - 2 Minute
      - tag
      - host
      format: ver=parquet/year=%s/month=%s/day=%s/hour=%s/min=%s/tag=%s/host=%s/

Required Parameters

name

A descriptive name for the node. This name appears in the pipeline builder, and you can use it to reference the node elsewhere in the YAML. It must be unique across all nodes. Because it is a YAML list element, it begins with a - and a space, followed by the string. It is a required parameter for all nodes.

nodes:
  - name: <node name>
    type: <node type>

type: gcs_output

The type parameter specifies the type of node being configured. It is specified as a string from a closed list of node types. It is a required parameter.

nodes:
  - name: <node name>
    type: <node type>

bucket

The bucket parameter defines the target bucket to use. It is specified as a string and is required.

nodes:
  - name: <node name>
    type: gcs_output
    bucket: <target bucket>

hmac_access_key

The hmac_access_key parameter is the GCS HMAC access key that has permissions to upload files to the bucket. It is used with hmac_secret. It is specified as a string and is required.

nodes:
  - name: <node name>
    type: gcs_output
    bucket: <target bucket>
    hmac_access_key: <access key>
    hmac_secret: <key secret>

hmac_secret

The hmac_secret parameter is the GCS HMAC secret associated with the access key. It is used with hmac_access_key. It is specified as a string and is required.

nodes:
  - name: <node name>
    type: gcs_output
    bucket: <target bucket>
    hmac_access_key: <access key>
    hmac_secret: <key secret>

Optional Parameters

buffer_max_bytesize

The buffer_max_bytesize parameter configures the maximum total byte size of unsuccessful items held in the buffer. If the limit is reached, further items are discarded until buffer space becomes available. It is specified as a datasize.Size, has a default of 0 indicating no size limit, and it is optional.

nodes:
  - name: <destination-name>
    type: <destination-type>
    buffer_max_bytesize: 2048

buffer_path

The buffer_path parameter configures the path where unsuccessful items are stored. Items are kept there so that delivery can be retried (exactly-once delivery). It is specified as a string and it is optional.

Note: Buffered data may be delivered in non-chronological order after a destination failure. Event ordering is not guaranteed during recovery. Applications requiring ordered event processing should handle reordering at the application level.

nodes:
  - name: <destination-name>
    type: <destination-type>
    buffer_path: <path to unsuccessful items folder>

buffer_ttl

The buffer_ttl parameter configures the time-to-live for unsuccessful items, which determines when they are discarded. It is specified as a duration, has a default of 10m, and it is optional.

nodes:
  - name: <destination-name>
    type: <destination-type>
    buffer_ttl: 20m
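
The three buffer parameters can be set together on one node. The following is a minimal sketch, assuming a gcs_output node; the node name and the buffer_path directory are hypothetical placeholders, and the size and TTL values are simply reused from the examples above.

```yaml
nodes:
  - name: my_gcs_buffered            # hypothetical node name
    type: gcs_output
    bucket: <target bucket>
    hmac_access_key: <access key>
    hmac_secret: <key secret>
    # Hypothetical directory, assumed to be writable by the agent.
    buffer_path: /var/lib/edgedelta/gcs_buffer
    # Once the buffer reaches this size, further unsuccessful items are discarded.
    buffer_max_bytesize: 2048
    # Unsuccessful items older than 20 minutes are discarded.
    buffer_ttl: 20m
```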

disable_compaction

YAML Only

This parameter configures whether to disable compaction by the Compactor Agent for data from this node before it is sent to the data destination. It is specified as a boolean, the default is false and it is optional.

nodes:
  - name: <node name>
    type: gcs_output
    bucket: <target bucket>
    disable_compaction: true

compression

The compression parameter specifies the compression format. It can be gzip, zstd, snappy or uncompressed. It is specified as a string, has a default of gzip, and it is optional.

nodes:
  - name: <node name>
    type: gcs_output
    bucket: <target bucket>
    compression: gzip | zstd | snappy | uncompressed

schema

The schema parameter specifies the schema format for archived data. It can be Archive or Raw. It is specified as a string, has a default of Archive, and is optional.

Archive - Uses the structured Edge Delta archive format (ArchiveLogPayload) for the archived data. This is the standard format that preserves the Edge Delta data structure. When used with deotelized data, it wraps the transformed data in the ArchiveLogPayload format.
Raw - Uploads incoming items (Log, Metric, Trace, Custom) directly as map[string]any to the archive destination, preserving the data in its raw format after any transformations. Use this option when:
- Working with deotelized data from a multiprocessor step and you want to preserve it in its transformed map[string]any format
- The development binary doesn’t handle parquet encoding properly
- You want to bypass Edge Delta’s standard archive structure

nodes:
  - name: <node name>
    type: <output type>
    schema: Archive | Raw
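
As an illustration of the Raw option, the sketch below sets schema: Raw on a gcs_output node with json encoding. The node name is hypothetical, and pairing Raw with json is an assumption made for this example rather than a documented requirement.

```yaml
nodes:
  - name: my_gcs_raw                 # hypothetical node name
    type: gcs_output
    bucket: <target bucket>
    hmac_access_key: <access key>
    hmac_secret: <key secret>
    schema: Raw                      # upload items directly instead of wrapping them in ArchiveLogPayload
    encoding: json                   # json is the default encoding
```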

encoding

The encoding parameter specifies the encoding format. It can be json or parquet. It is specified as a string, has a default of json, and it is optional.

nodes:
  - name: <node name>
    type: gcs_output
    bucket: <target bucket>
    encoding: json | parquet

flush_interval

YAML Only

The flush_interval parameter specifies the interval at which data, including buffered data, is flushed (forced) to the destination. It is specified as a duration and is optional.

nodes:
  - name: <node name>
    type: gcs_output
    bucket: <target bucket>
    flush_interval: 10m

max_byte_limit

YAML Only

The max_byte_limit parameter specifies the maximum number of bytes to accumulate before buffered raw data is flushed to the archive destination. It is specified as a data size and is optional. If it is not specified for this node, the value in the agent settings is used.

nodes:
  - name: <node name>
    type: gcs_output
    bucket: <target bucket>
    max_byte_limit: 32MB

path_prefix

The path_prefix parameter configures the path prefix using order and format child parameters. It is optional.

The order child parameter lists the formatting items that will define the path prefix:

You can refer to Year, Month, Day, <any number that can divide 60> Minute, Hour, tag, host, OtherTags.<item related tags> and LogFields.<log related tags>.
For ECS, ecs_cluster, ecs_container_name, ecs_task_family and ecs_task_version are available.
For K8s, k8s_namespace, k8s_controller_kind, k8s_controller_logical_name, k8s_pod_name, k8s_container_name and k8s_container_image are available.
For Docker, docker_container_name and docker_image_name are available.

The format child parameter specifies a format string that contains one %s placeholder for each order item.

nodes:
  - name: <node name>
    type: gcs_output
    bucket: <target bucket>
    path_prefix:
      order:
      - Year
      - Month
      - Day
      - Hour
      - 2 Minute
      - tag
      - host
      format: ver=parquet/year=%s/month=%s/day=%s/hour=%s/min=%s/tag=%s/host=%s/
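
The same mechanism can be used with the Kubernetes fields listed above. The sketch below is illustrative only: the node name and the key names in the format string (ns=, pod=) are arbitrary choices, and it assumes each %s is filled positionally from the order list, as the description of format suggests.

```yaml
nodes:
  - name: my_gcs_k8s                 # hypothetical node name
    type: gcs_output
    bucket: <target bucket>
    hmac_access_key: <access key>
    hmac_secret: <key secret>
    path_prefix:
      order:
      - Year
      - Month
      - Day
      - k8s_namespace
      - k8s_pod_name
      # Five order items, so the format string carries five %s placeholders.
      format: year=%s/month=%s/day=%s/ns=%s/pod=%s/
```

Keeping the number of %s placeholders equal to the number of order items avoids a mismatch between the two child parameters.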

use_native_compression

The use_native_compression parameter applies to parquet encoding. When it is enabled, only the data segments within each archive file are compressed, rather than the whole file. It is specified as a Boolean, has a default of false, and it is optional.

nodes:
  - name: <node name>
    type: gcs_output
    bucket: <target bucket>
    use_native_compression: true
