Edge Delta Google Cloud Storage Output
Overview
The GCS Output node sends archived telemetry to a Google Cloud Storage (GCS) bucket. It supports flexible configurations to optimize cost, access, and data lifecycle governance across observability pipelines.
This output is ideal for:
- Long-term log archiving
- Routing use cases (e.g., multi-tenant or customer-specific storage)
- Integration with downstream platforms or security analytics tools
Logs are written as raw archive bytes, buffered via the archive buffer processor, and can be compressed, encoded, and organized using customizable path structures.
Key features include:
- Dynamic bucket selection using OTTL expressions (bucket_expression)
- Workload Identity support via credentials_path or environment-based authentication
- Encoding options for parquet, json, or avro
- Scalable performance via parallel_worker_count and advanced pathing
- Data durability with buffering, retries, and optional compaction
- incoming_data_types: log
Configuring GCS
Before you can create an output, you must have a Google Cloud Storage HMAC access key for a service account that contains the Storage Admin HMAC role. See how to Prepare GCS.
Example Configuration

nodes:
  - name: my_gcs
    type: gcs_output
    bucket: <REDACTED>
    hmac_access_key: <REDACTED>
    hmac_secret: <REDACTED>
    compression: zstd
    encoding: parquet
    use_native_compression: true
    path_prefix:
      order:
        - Year
        - Month
        - Day
        - Hour
        - 2 Minute
        - tag
        - host
      format: ver=parquet/year=%s/month=%s/day=%s/hour=%s/min=%s/tag=%s/host=%s/
Required Parameters
name
A descriptive name for the node. This name appears in the pipeline builder, and you can use it to reference the node in the YAML. It must be unique across all nodes. Because it is a YAML list element, it begins with a - and a space, followed by the string. It is a required parameter for all nodes.
nodes:
  - name: <node name>
    type: <node type>
type: gcs_output
The type parameter specifies the type of node being configured. It is specified as a string from a closed list of node types. It is a required parameter.
nodes:
  - name: <node name>
    type: <node type>
bucket
The bucket parameter defines the target bucket to use. It is specified as a string and is required.
nodes:
  - name: <node name>
    type: gcs_output
    bucket: <target bucket>
hmac_access_key
The hmac_access_key parameter is the GCS HMAC access key that has permissions to upload files to the bucket. It is used with hmac_secret. It is specified as a string and is required.
nodes:
  - name: <node name>
    type: gcs_output
    bucket: <target bucket>
    hmac_access_key: <access key>
    hmac_secret: <key secret>
hmac_secret
The hmac_secret parameter is the GCS HMAC secret associated with the access key. It is used with hmac_access_key. It is specified as a string and is required.
nodes:
  - name: <node name>
    type: gcs_output
    bucket: <target bucket>
    hmac_access_key: <access key>
    hmac_secret: <key secret>
Optional Parameters
bucket_expression
Use bucket_expression to dynamically determine the target bucket for each log using an OTTL expression. When set, this overrides the static bucket field. This is especially useful in AI-native or multi-tenant environments where logs must be routed to different storage buckets based on metadata (e.g., customer ID or log source). If bucket_expression is not set, the bucket field is used as the static destination.
nodes:
  - name: my_gcs
    type: gcs_output
    bucket_expression: attributes["log_bucket"]
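As a minimal sketch of per-tenant routing, the node below sets both fields. The attribute name tenant_bucket is hypothetical and is assumed to be populated on each log by an upstream processor. Per the description above, bucket_expression takes precedence whenever it is set; the static bucket is kept here only because bucket is listed as a required parameter.

```yaml
nodes:
  - name: tenant_gcs
    type: gcs_output
    # Static bucket; listed as required, but overridden when bucket_expression is set.
    bucket: <target bucket>
    # Hypothetical attribute, assumed to hold the destination bucket name for each log.
    bucket_expression: attributes["tenant_bucket"]
    hmac_access_key: <access key>
    hmac_secret: <key secret>
```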
buffer_max_bytesize
The buffer_max_bytesize parameter configures the maximum total byte size of unsuccessful items held in the buffer. If the limit is reached, the remaining items are discarded until buffer space becomes available. It is specified as a datasize.Size, has a default of 0 (no size limit), and is optional.
nodes:
  - name: <node name>
    type: gcs_output
    bucket: <target bucket>
    buffer_max_bytesize: 2048
buffer_path
The buffer_path parameter configures the path where unsuccessful items are stored so that they can be retried (exactly-once delivery). It is specified as a string and is optional.
nodes:
  - name: <node name>
    type: gcs_output
    bucket: <target bucket>
    buffer_path: <path to unsuccessful items folder>
buffer_ttl
The buffer_ttl parameter configures the time-to-live for unsuccessful items, which determines when they are discarded. It is specified as a duration, has a default of 10m, and is optional.
nodes:
  - name: <node name>
    type: gcs_output
    bucket: <target bucket>
    buffer_ttl: 20m
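The three buffer parameters are typically set together. The following is a sketch only: the node name is hypothetical, and the values are taken from the individual examples above rather than being tuning recommendations.

```yaml
nodes:
  - name: buffered_gcs
    type: gcs_output
    bucket: <target bucket>
    hmac_access_key: <access key>
    hmac_secret: <key secret>
    # Folder for items that failed to send and are waiting to be retried.
    buffer_path: <path to unsuccessful items folder>
    # Cap on the total size of buffered items; 0 (the default) means no limit.
    buffer_max_bytesize: 2048
    # Discard buffered items that are still unsent after 20 minutes.
    buffer_ttl: 20m
```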
credentials_path
The credentials_path parameter specifies the absolute path to a GCP service account JSON file used for authentication with GCS. If this field is omitted, the agent attempts to use credentials from the environment. This enables support for GKE Workload Identity, allowing pods to authenticate securely without embedding secrets in the config. Authentication fallback logic:
- If credentials_path is defined, that file is used
- If not set, the environment is checked (e.g., for Workload Identity on GKE)
nodes:
  - name: <node name>
    type: gcs_output
    bucket: <target bucket>
    credentials_path: <absolute path to service account JSON file>
disable_compaction
The disable_compaction parameter configures whether to disable compaction by the Compactor Agent for data from this node before it is sent to the data destination. It is specified as a Boolean, has a default of false, and is optional.
nodes:
  - name: <node name>
    type: gcs_output
    bucket: <target bucket>
    disable_compaction: true
compression
The compression parameter specifies the compression format. It can be gzip, zstd, snappy or uncompressed. It is specified as a string, has a default of gzip, and is optional.
nodes:
  - name: <node name>
    type: gcs_output
    bucket: <target bucket>
    compression: gzip | zstd | snappy | uncompressed
encoding
The encoding parameter specifies the encoding format. It can be json or parquet. It is specified as a string, has a default of json, and is optional.
nodes:
  - name: <node name>
    type: gcs_output
    bucket: <target bucket>
    encoding: json | parquet
flush_interval
The flush_interval parameter specifies how often data, including buffered data, is flushed (forced) to the destination. It is specified as a duration and is optional.
nodes:
  - name: <node name>
    type: gcs_output
    bucket: <target bucket>
    flush_interval: 10m
max_byte_limit
The max_byte_limit parameter specifies the maximum number of bytes to accumulate before buffered raw data is flushed to the archive destination. It is specified as a data size and is optional. If it is not specified for this node, the value in the agent settings is used.
nodes:
  - name: <node name>
    type: gcs_output
    bucket: <target bucket>
    max_byte_limit: 32MB
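flush_interval and max_byte_limit can be combined to bound both the age and the size of pending data. The sketch below reuses the values from the examples above. It assumes that whichever limit is reached first triggers the flush, which the parameter descriptions do not state explicitly.

```yaml
nodes:
  - name: <node name>
    type: gcs_output
    bucket: <target bucket>
    # Time-based trigger: flush at least every 10 minutes.
    flush_interval: 10m
    # Size-based trigger: flush once 32MB of raw data has accumulated.
    # Assumption: the two triggers act independently of each other.
    max_byte_limit: 32MB
```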
parallel_worker_count
Controls concurrency by defining how many internal workers run in parallel to write data to Google Cloud Storage. Increasing this value can improve throughput and reduce flush latency, especially when handling large volumes of log data or compressing and encoding files (e.g., Parquet or Avro) before upload. This setting influences how the agent packages and transmits data to GCS, not how GCS processes or stores it.
Use this setting to:
- Optimize throughput in high-ingestion environments with sufficient CPU and I/O capacity.
- Reduce backlog during peak log volume or large batch processing.
- Enhance performance in pipelines performing heavy transformation or compression.
Avoid setting this value too high in resource-constrained environments, as increased concurrency may introduce CPU/memory contention or degrade overall pipeline efficiency.
nodes:
  - name: my_gcs
    type: gcs_output
    bucket: <target bucket>
    parallel_worker_count: 6
path_prefix
The path_prefix parameter configures the path prefix using order and format child parameters. It is optional.
The order child parameter lists the formatting items that will define the path prefix:
- You can refer to Year, Month, Day, <any number that can divide 60> Minute, Hour, tag, host, OtherTags.<item related tags> and LogFields.<log related tags>.
- For ECS, ecs_cluster, ecs_container_name, ecs_task_family and ecs_task_version are available.
- For K8s, k8s_namespace, k8s_controller_kind, k8s_controller_logical_name, k8s_pod_name, k8s_container_name and k8s_container_image are available.
- For Docker, docker_container_name and docker_image_name are available.
The format child parameter specifies a format string that has a %s placeholder for each order item.
nodes:
  - name: <node name>
    type: gcs_output
    bucket: <target bucket>
    path_prefix:
      order:
        - Year
        - Month
        - Day
        - Hour
        - 2 Minute
        - tag
        - host
      format: ver=parquet/year=%s/month=%s/day=%s/hour=%s/min=%s/tag=%s/host=%s/
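To illustrate how order items map onto the %s placeholders, here is a sketch that partitions by date and then by Kubernetes namespace and pod. The node name and the sample values in the comment are made up, and the exact rendering of date parts (for example, zero-padding) is an assumption.

```yaml
nodes:
  - name: k8s_partitioned_gcs
    type: gcs_output
    bucket: <target bucket>
    path_prefix:
      # Five order items, so the format string needs exactly five %s placeholders.
      order:
        - Year
        - Month
        - Day
        - k8s_namespace
        - k8s_pod_name
      # Placeholders are filled in the same sequence as the order list.
      format: year=%s/month=%s/day=%s/ns=%s/pod=%s/
      # Illustrative shape of a resulting prefix (sample values only):
      #   year=2024/month=05/day=17/ns=payments/pod=api-7d9f/
```

Keeping the number of placeholders equal to the number of order items avoids a mismatch between the list and the format string.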
use_native_compression
The use_native_compression parameter configures whether, for Parquet encoding, only the data segments within each archive file are compressed rather than the whole file. It is specified as a Boolean, has a default of false, and is optional.
nodes:
  - name: <node name>
    type: gcs_output
    bucket: <target bucket>
    use_native_compression: true
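As a rough comparison, the sketch below defines two hypothetical nodes side by side: one that compresses each JSON archive file as a whole, and one that applies compression inside the Parquet file. The node names are illustrative, and each node would still need the required credential fields described earlier.

```yaml
nodes:
  # Whole-file compression: the JSON archive is compressed as a single unit.
  - name: gcs_json_gzip
    type: gcs_output
    bucket: <target bucket>
    encoding: json
    compression: gzip
  # Native compression: only the data segments inside each Parquet file are compressed.
  - name: gcs_parquet_native
    type: gcs_output
    bucket: <target bucket>
    encoding: parquet
    compression: zstd
    use_native_compression: true
```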
tls
The tls parameter is a dictionary that configures TLS settings for secure connections to the destination. It is optional and typically used when connecting to endpoints that require encrypted transport (HTTPS) or mutual TLS.
nodes:
  - name: <node name>
    type: <destination type>
    tls:
      <tls options>
enabled
Specifies whether TLS is enabled. This is a Boolean value. Default is false.
nodes:
  - name: <node name>
    type: <destination type>
    tls:
      enabled: true
ignore_certificate_check
Disables certificate verification. Useful for test environments. Default is false.
nodes:
  - name: <node name>
    type: <destination type>
    tls:
      ignore_certificate_check: true
ca_file
Specifies the absolute path to a CA certificate file for verifying the remote server’s certificate.
nodes:
  - name: <node name>
    type: <destination type>
    tls:
      ca_file: /certs/ca.pem
ca_path
Specifies a directory containing one or more CA certificate files.
nodes:
  - name: <node name>
    type: <destination type>
    tls:
      ca_path: /certs/
crt_file
Path to the client certificate file for mutual TLS authentication.
nodes:
  - name: <node name>
    type: <destination type>
    tls:
      crt_file: /certs/client-cert.pem
key_file
Path to the private key file used for client TLS authentication.
nodes:
  - name: <node name>
    type: <destination type>
    tls:
      key_file: /certs/client-key.pem
key_password
Password for the TLS private key file, if required.
nodes:
  - name: <node name>
    type: <destination type>
    tls:
      key_password: <password>
client_auth_type
Controls how client certificates are requested and validated during the TLS handshake. Valid options:
- noclientcert
- requestclientcert
- requireanyclientcert
- verifyclientcertifgiven
- requireandverifyclientcert
nodes:
  - name: <node name>
    type: <destination type>
    tls:
      client_auth_type: requireandverifyclientcert
max_version
Maximum supported version of the TLS protocol. Valid options:
- TLSv1_0
- TLSv1_1
- TLSv1_2
- TLSv1_3
nodes:
  - name: <node name>
    type: <destination type>
    tls:
      max_version: TLSv1_3
min_version
Minimum supported version of the TLS protocol. Default is TLSv1_2.
nodes:
  - name: <node name>
    type: <destination type>
    tls:
      min_version: TLSv1_2
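The individual tls options above can be combined in a single block. The following sketch shows one plausible mutual TLS setup, reusing the file paths from the examples above. Those paths are placeholders, and the particular combination of options is an assumption rather than a recommended baseline.

```yaml
nodes:
  - name: <node name>
    type: <destination type>
    tls:
      enabled: true
      # CA certificate used to verify the remote server's certificate.
      ca_file: /certs/ca.pem
      # Client certificate and private key presented for mutual TLS.
      crt_file: /certs/client-cert.pem
      key_file: /certs/client-key.pem
      # Restrict negotiation to TLS 1.2 through TLS 1.3.
      min_version: TLSv1_2
      max_version: TLSv1_3
```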