Edge Delta S3 Source

The Edge Delta S3 Source node allows reading and processing log data from an S3 bucket within the Edge Delta ecosystem.

Overview

The S3 source node lets the Edge Delta agent read log data stored in an S3 bucket and process it within the Edge Delta ecosystem. The agent subscribes to an SQS queue that receives the bucket's S3 event notifications.

  • outgoing_data_types: log

Configure S3

See Prepare for an S3 Source for information on setting up your environment.

Example Edge Delta Pipeline configuration

Simple version

nodes:
- name: my_s3_input
  type: s3_input
  sqs_url: https://sqs.example-queue-123.amazonaws.com
  region: us-example-1

Advanced version

nodes:
- name: my_s3_input
  type: s3_input
  sqs_url: https://sqs.example-queue-123.amazonaws.com
  region: us-example-1
  aws_key_id: EXAMPLEAWSKEYID1234
  aws_sec_key: exampleAwsSecKey9876
  role_arn: arn:aws:iam::example-account-123:role/example-role
  external_id: example-external-id-5678

Cross-region configuration

When your S3 bucket and SQS queue are in different AWS regions, use the s3_config and sqs_config parameters to specify region-specific settings:

nodes:
- name: my_s3_input
  type: s3_input
  sqs_url: https://sqs.us-east-1.amazonaws.com/123456789/my-queue
  region: us-east-1
  s3_config:
    region: us-west-2
  sqs_config:
    region: us-east-1

Required Parameters

name

A descriptive name for the node. This name appears in the pipeline builder, and you use it to reference the node elsewhere in the YAML. It must be unique across all nodes. Because it is a YAML list element, it begins with a - and a space, followed by the string. It is a required parameter for all nodes.

nodes:
  - name: <node name>
    type: <node type>
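
A minimal sketch of referencing the node by name is shown below. It assumes the pipeline connects nodes through a top-level links list with from and to keys; that section is not described on this page, so treat the key names and the downstream placeholder as assumptions:

```yaml
nodes:
  - name: my_s3_input
    type: s3_input
    sqs_url: <sqs to subscribe>
    region: <aws region>

# Assumed syntax: each link routes data from one named node to another.
links:
  - from: my_s3_input
    to: <downstream node name>
```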

type: s3_input

The type parameter specifies the type of node being configured. It is specified as a string from a closed list of node types. It is a required parameter.

nodes:
  - name: <node name>
    type: <node type>

sqs_url

The sqs_url parameter specifies the URL of the SQS queue that receives the bucket's S3 event notifications. The node subscribes to this queue. It is specified as a string and is required.

nodes:
- name: <node name>
  type: s3_input
  sqs_url: <sqs to subscribe>
  region: <aws region>

region

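The region parameter specifies the AWS region where the S3 bucket and SQS queue are located. It is specified as a string and is required. If the bucket and queue are in different regions, override it per service with s3_config and sqs_config.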

nodes:
- name: <node name>
  type: s3_input
  sqs_url: <sqs to subscribe>
  region: <aws region>

Optional Parameters

aws_key_id

The aws_key_id parameter is the AWS access key ID of credentials that have all four IAM permissions needed to access the target bucket. It is used with aws_sec_key. It is specified as a string and is optional.

nodes:
- name: <node name>
  type: s3_input
  sqs_url: <sqs to subscribe>
  region: <aws region>
  aws_key_id: <key>
  aws_sec_key: <secure key>

aws_sec_key

The aws_sec_key parameter is the AWS secret access key of credentials that have all four IAM permissions needed to access the target bucket. It is used with aws_key_id. It is specified as a string and is optional.

nodes:
- name: <node name>
  type: s3_input
  sqs_url: <sqs to subscribe>
  region: <aws region>
  aws_key_id: <key>
  aws_sec_key: <secure key>

compression

The compression parameter is used to define the compression type for incoming logs. You can specify gzip, zstd, snappy, or uncompressed. It is specified as a string. It is optional and the default is uncompressed.

nodes:
- name: s3_input
  type: s3_input
  region: us-west-2
  sqs_url: <sqs to subscribe>
  compression: gzip

role_arn

The role_arn parameter is used when authentication and authorization are performed by assuming an AWS IAM role. It is the full ARN of the role, which includes the account ID and role name. A role_arn is optional for a data source, depending on the access configuration.

nodes:
- name: <node name>
  type: s3_input
  sqs_url: <sqs to subscribe>
  region: <aws region>
  role_arn: <role ARN>

external_id

The external_id parameter is a unique identifier used to guard against a confused deputy attack when a role is assumed. It is specified as a string and is optional. When configured, it must be used with role_arn.

nodes:
- name: <node name>
  type: s3_input
  sqs_url: <sqs to subscribe>
  region: <aws region>
  external_id: <ID>
  role_arn: <role ARN>

s3_config

The s3_config parameter allows you to specify AWS configuration specific to the S3 service. When provided, these settings override the base-level region, aws_key_id, aws_sec_key, role_arn, and external_id parameters for S3 operations only. This is useful for cross-region deployments where your S3 bucket is in a different region than your SQS queue, or when S3 requires different authentication credentials. It is specified as a nested configuration block and is optional.

The s3_config block supports the following fields:

  • region - AWS region for S3 access
  • aws_key_id - AWS access key ID for S3 (optional if using role-based authentication)
  • aws_sec_key - AWS secret access key for S3 (optional if using role-based authentication)
  • role_arn - IAM role ARN for S3 access (alternative to access keys)
  • external_id - External ID for role assumption (optional; valid only together with role_arn)

nodes:
- name: <node name>
  type: s3_input
  sqs_url: <sqs to subscribe>
  region: <base aws region>
  s3_config:
    region: <s3 specific region>
    aws_key_id: <s3 key>
    aws_sec_key: <s3 secret>
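
The same block can carry role-based settings instead of access keys. The following sketch uses only the fields listed for s3_config; all values are illustrative placeholders:

```yaml
nodes:
- name: <node name>
  type: s3_input
  sqs_url: <sqs to subscribe>
  region: <base aws region>
  # S3 operations assume this role; SQS operations keep using the base-level settings.
  s3_config:
    region: <s3 specific region>
    role_arn: <s3 role ARN>
    external_id: <s3 external ID>
```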

sqs_config

The sqs_config parameter allows you to specify AWS configuration specific to the SQS service. When provided, these settings override the base-level region, aws_key_id, aws_sec_key, role_arn, and external_id parameters for SQS operations only. This is useful for cross-region deployments where your SQS queue is in a different region than your S3 bucket, or when SQS requires different authentication credentials. It is specified as a nested configuration block and is optional.

The sqs_config block supports the following fields:

  • region - AWS region for SQS access
  • aws_key_id - AWS access key ID for SQS (optional if using role-based authentication)
  • aws_sec_key - AWS secret access key for SQS (optional if using role-based authentication)
  • role_arn - IAM role ARN for SQS access (alternative to access keys)
  • external_id - External ID for role assumption (optional; valid only together with role_arn)

nodes:
- name: <node name>
  type: s3_input
  sqs_url: <sqs to subscribe>
  region: <base aws region>
  sqs_config:
    region: <sqs specific region>
    aws_key_id: <sqs key>
    aws_sec_key: <sqs secret>
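
As a sketch of how the two blocks might be combined, the following configuration gives each service its own region and credentials: an assumed role for S3 and access keys for SQS. It assumes both blocks can be set on the same node, as in the cross-region example, and all values are placeholders:

```yaml
nodes:
- name: <node name>
  type: s3_input
  sqs_url: <sqs to subscribe>
  region: <base aws region>
  # Applies to S3 operations only.
  s3_config:
    region: <s3 specific region>
    role_arn: <s3 role ARN>
    external_id: <s3 external ID>
  # Applies to SQS operations only.
  sqs_config:
    region: <sqs specific region>
    aws_key_id: <sqs key>
    aws_sec_key: <sqs secret>
```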

source_metadata

The source_metadata parameter is used to define which detected resources and attributes to add to each data item as it is ingested by the Edge Delta agent. In the GUI you can select:

  • Required Only: This option includes the minimum required resources and attributes for Edge Delta to operate.
  • Default: This option includes the required resources and attributes plus those selected by Edge Delta.
  • High: This option includes the required resources and attributes along with a larger selection of common optional fields.
  • Custom: With this option selected, you can choose which attributes and resources to include. The required fields are selected by default and can’t be unchecked.

Based on your selection in the GUI, the source_metadata YAML is populated as two dictionaries (resource_attributes and attributes) with Boolean values.
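
A hypothetical sketch of the resulting shape is shown below. The attribute names are placeholders rather than real field names, and placing source_metadata at the node level is an assumption; the actual keys depend on your selection in the GUI:

```yaml
nodes:
- name: <node name>
  type: s3_input
  sqs_url: <sqs to subscribe>
  region: <aws region>
  source_metadata:
    # Each dictionary maps a field name to a Boolean that includes or excludes it.
    resource_attributes:
      <resource attribute name>: true
      <another resource attribute name>: false
    attributes:
      <attribute name>: true
```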

See Choose Data Item Metadata for more information on selecting metadata.

For advanced authentication options, see AWS IAM Role Authentication.