Edge Delta Azure Blob Output
Overview
The Azure Blob Output Node sends items to an Azure Blob destination. The items are raw archive bytes buffered with the archive buffer processor.
- incoming_data_types: log
Example Configuration
nodes:
  - name: my_blob
    type: blob_output
    container: <REDACTED>
    account_name: <REDACTED>
    account_key: <REDACTED>
    auto_create_container: true
    compression: zstd
    encoding: parquet
    use_native_compression: true
    path_prefix:
      order:
        - Year
        - Month
        - Day
        - Hour
        - 2 Minute
        - tag
        - host
      format: ver=parquet/year=%s/month=%s/day=%s/hour=%s/min=%s/tag=%s/host=%s/
Required Parameters
name
A descriptive name for the node. This name appears in Visual Pipelines, and you can use it to reference the node elsewhere in the YAML. It must be unique across all nodes. Because it is a YAML list element, it begins with a - and a space, followed by the string. It is a required parameter for all nodes.
nodes:
  - name: <node name>
    type: <node type>
type: blob_output
The type parameter specifies the type of node being configured. It is specified as a string from a closed list of node types. It is a required parameter.
nodes:
  - name: <node name>
    type: <node type>
container
The container parameter specifies the target container. It is specified as a string and is required.
nodes:
  - name: my_blob
    type: blob_output
    container: <REDACTED>
    account_name: <REDACTED>
    account_key: <REDACTED>
account_name
The account_name parameter specifies the Azure account name. It is specified as a string and is required.
nodes:
  - name: my_blob
    type: blob_output
    container: <REDACTED>
    account_name: <REDACTED>
    account_key: <REDACTED>
account_key
The account_key parameter specifies the key for the Azure account. It is specified as a string and is required.
nodes:
  - name: my_blob
    type: blob_output
    container: <REDACTED>
    account_name: <REDACTED>
    account_key: <REDACTED>
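The required parameters above can be checked before deployment with a few lines of Python. This is a minimal sketch and not part of Edge Delta: the helper name missing_required and the dictionary layout are assumptions made for illustration. It only reports which of name, type, container, account_name and account_key are absent or empty in a parsed node definition.

```python
# Hypothetical helper (not an Edge Delta API): checks a parsed node
# definition for the required blob_output parameters listed above.
REQUIRED_KEYS = ("name", "type", "container", "account_name", "account_key")


def missing_required(node):
    """Return the required keys that are absent or empty in `node`."""
    return [key for key in REQUIRED_KEYS if not node.get(key)]


node = {
    "name": "my_blob",
    "type": "blob_output",
    "container": "example-container",   # placeholder value
    "account_name": "example-account",  # placeholder value
}
print(missing_required(node))  # account_key was left out on purpose
```

An empty list means every required parameter is present; the check says nothing about whether the values are valid for Azure.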
Optional Parameters
auto_create_container
The auto_create_container parameter configures whether to create the container if it does not exist. It is specified as a Boolean, has a default of false, and is optional.
nodes:
  - name: my_blob
    type: blob_output
    container: <REDACTED>
    account_name: <REDACTED>
    account_key: <REDACTED>
    auto_create_container: true
buffer_max_bytesize
The buffer_max_bytesize parameter configures the maximum total byte size of buffered unsuccessful items. When the limit is reached, further items are discarded until buffer space becomes available. It is specified as a datasize.Size, has a default of 0 (no size limit), and is optional.
nodes:
  - name: my_blob
    type: blob_output
    container: <REDACTED>
    account_name: <REDACTED>
    account_key: <REDACTED>
    buffer_max_bytesize: 2048
buffer_path
The buffer_path parameter configures the path where unsuccessful items are stored. Items are kept there so they can be retried (exactly-once delivery). It is specified as a string and is optional.
nodes:
  - name: my_blob
    type: blob_output
    container: <REDACTED>
    account_name: <REDACTED>
    account_key: <REDACTED>
    buffer_path: <path to unsuccessful items folder>
buffer_ttl
The buffer_ttl parameter configures the time-to-live for unsuccessful items, which determines when they are discarded. It is specified as a duration, has a default of 10m, and is optional.
nodes:
  - name: my_blob
    type: blob_output
    container: <REDACTED>
    account_name: <REDACTED>
    account_key: <REDACTED>
    buffer_ttl: 20m
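To make the interaction between buffer_max_bytesize and buffer_ttl concrete, the following Python sketch models a retry buffer with a size cap and a time-to-live. It is an in-memory illustration under stated assumptions, not the agent's implementation: the class name RetryBuffer, its methods, and the use of plain seconds and bytes are all hypothetical, and the real buffer is stored on disk at buffer_path.

```python
from collections import deque


class RetryBuffer:
    """In-memory illustration only; the agent's real buffer lives at buffer_path."""

    def __init__(self, max_bytes=0, ttl_seconds=600.0):
        self.max_bytes = max_bytes      # 0 means no size limit (documented default)
        self.ttl_seconds = ttl_seconds  # 600 s mirrors the documented 10m default
        self.items = deque()            # (stored_at, payload) pairs, oldest first
        self.size = 0

    def add(self, payload, now):
        """Store a failed item; return False if it is discarded for lack of space."""
        if self.max_bytes and self.size + len(payload) > self.max_bytes:
            return False
        self.items.append((now, payload))
        self.size += len(payload)
        return True

    def expire(self, now):
        """Drop items older than the TTL and return how many were dropped."""
        dropped = 0
        while self.items and now - self.items[0][0] > self.ttl_seconds:
            _, payload = self.items.popleft()
            self.size -= len(payload)
            dropped += 1
        return dropped


buf = RetryBuffer(max_bytes=2048, ttl_seconds=20 * 60)
print(buf.add(b"x" * 1500, now=0))   # fits within the 2048-byte cap
print(buf.add(b"x" * 1000, now=10))  # would exceed the cap, so it is discarded
print(buf.expire(now=20 * 60 + 1))   # the first item has outlived the 20m TTL
```

In this model a discarded item is simply lost, which matches the documented behavior that items are dropped while the buffer is full.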
compression
The compression parameter specifies the compression format. It can be gzip, zstd, snappy or uncompressed. It is specified as a string, has a default of gzip, and is optional.
nodes:
  - name: my_blob
    type: blob_output
    container: <REDACTED>
    account_name: <REDACTED>
    account_key: <REDACTED>
    compression: gzip | zstd | snappy | uncompressed
disable_compaction
The disable_compaction parameter configures whether to disable compaction by the compactor agent for data from this node before it is sent to the data destination. It is specified as a Boolean, has a default of false, and is optional.
nodes:
  - name: my_blob
    type: blob_output
    container: <REDACTED>
    account_name: <REDACTED>
    account_key: <REDACTED>
    disable_compaction: true
encoding
The encoding parameter specifies the encoding format. It can be json or parquet. It is specified as a string, has a default of json, and is optional.
nodes:
  - name: my_blob
    type: blob_output
    container: <REDACTED>
    account_name: <REDACTED>
    account_key: <REDACTED>
    encoding: json | parquet
flush_interval
The flush_interval parameter specifies the interval at which data, including buffered data, is flushed (forced) to the destination. It is specified as a duration and is optional.
nodes:
  - name: my_blob
    type: blob_output
    container: <REDACTED>
    account_name: <REDACTED>
    account_key: <REDACTED>
    flush_interval: 10m
max_byte_limit
The max_byte_limit parameter specifies the maximum number of bytes to buffer before raw data is flushed to the archive destination. It is specified as a data size and is optional. If it is not specified for this node, the value in the agent settings is used.
nodes:
  - name: my_blob
    type: blob_output
    container: <REDACTED>
    account_name: <REDACTED>
    account_key: <REDACTED>
    max_byte_limit: 32MB
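The flush_interval and max_byte_limit parameters both describe when buffered data is pushed to the destination. The Python sketch below shows one plausible reading, in which a flush happens as soon as either threshold is reached. That either-or rule is an assumption, as are the function name should_flush and the conversion of 32MB to 32,000,000 bytes; the agent's actual logic and unit handling are not described here.

```python
def should_flush(buffered_bytes, seconds_since_flush, max_byte_limit, flush_interval):
    """Return True when either threshold is reached (assumed either-or behavior)."""
    size_reached = buffered_bytes >= max_byte_limit
    interval_elapsed = seconds_since_flush >= flush_interval
    return size_reached or interval_elapsed


LIMIT = 32 * 1000 * 1000  # stand-in for 32MB; the agent's exact unit is an assumption
INTERVAL = 10 * 60        # 10m expressed in seconds

print(should_flush(1_000_000, 30, LIMIT, INTERVAL))   # neither threshold reached
print(should_flush(40_000_000, 30, LIMIT, INTERVAL))  # byte limit reached
print(should_flush(1_000_000, 600, LIMIT, INTERVAL))  # interval elapsed
```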
path_prefix
The path_prefix parameter configures the path prefix using the order and format child parameters. It is optional.
The order child parameter lists the formatting items that define the path prefix:
- You can refer to Year, Month, Day, <any number that can divide 60> Minute, Hour, tag, host, OtherTags.<item related tags> and LogFields.<log related tags>.
- For ECS, ecs_cluster, ecs_container_name, ecs_task_family and ecs_task_version are available.
- For K8s, k8s_namespace, k8s_controller_kind, k8s_controller_logical_name, k8s_pod_name, k8s_container_name and k8s_container_image are available.
- For Docker, docker_container_name and docker_image_name are available.
The format child parameter specifies a format string with one %s placeholder for each order item.
nodes:
  - name: my_blob
    type: blob_output
    container: <REDACTED>
    account_name: <REDACTED>
    account_key: <REDACTED>
    path_prefix:
      order:
        - Year
        - Month
        - Day
        - Hour
        - 2 Minute
        - tag
        - host
      format: ver=parquet/year=%s/month=%s/day=%s/hour=%s/min=%s/tag=%s/host=%s/
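The way order and format combine can be sketched in Python: each order item yields one value, and the values fill the %s placeholders from left to right. This is an illustrative sketch, not the agent's code. The function name build_path_prefix, the zero-padding of date parts, and the rounding of minutes down to the start of their bucket are assumptions; the fields dictionary stands in for whatever tag, host or other metadata the agent resolves at runtime.

```python
from datetime import datetime


def build_path_prefix(order, fmt, when, fields):
    """Fill each %s in `fmt` with the value of the matching `order` item."""
    values = []
    for item in order:
        if item == "Year":
            values.append(f"{when.year:04d}")
        elif item == "Month":
            values.append(f"{when.month:02d}")
        elif item == "Day":
            values.append(f"{when.day:02d}")
        elif item == "Hour":
            values.append(f"{when.hour:02d}")
        elif item.endswith(" Minute"):
            step = int(item.split()[0])
            if step <= 0 or 60 % step != 0:
                raise ValueError(f"{step} does not divide 60")
            # Assumption: minutes are rounded down to the start of their bucket.
            values.append(f"{when.minute - when.minute % step:02d}")
        else:
            values.append(str(fields[item]))  # tag, host, k8s_namespace, ...
    if fmt.count("%s") != len(values):
        raise ValueError("format needs one %s per order item")
    return fmt % tuple(values)


order = ["Year", "Month", "Day", "Hour", "2 Minute", "tag", "host"]
fmt = "ver=parquet/year=%s/month=%s/day=%s/hour=%s/min=%s/tag=%s/host=%s/"
example = build_path_prefix(
    order, fmt, datetime(2024, 3, 5, 14, 37), {"tag": "prod", "host": "web-1"}
)
print(example)
# ver=parquet/year=2024/month=03/day=05/hour=14/min=36/tag=prod/host=web-1/
```

With a 2 Minute item, 14:37 falls into the bucket that starts at minute 36 under the rounding assumption above. The sketch also rejects a format whose number of %s placeholders differs from the number of order items.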
use_native_compression
The use_native_compression parameter configures whether, for parquet encoding, only the data segments of each archive file are compressed rather than the whole file. It is specified as a Boolean, has a default of false, and is optional.
nodes:
  - name: my_blob
    type: blob_output
    container: <REDACTED>
    account_name: <REDACTED>
    account_key: <REDACTED>
    use_native_compression: true