Edge Delta S3 Source

Read data from an S3 source.

Overview

The S3 source node allows the Edge Delta agent to read data from an S3 bucket. Use it to ingest log data stored in S3 and process it in an Edge Delta pipeline.

  • outgoing_data_types: log

Configure S3

See Prepare for an S3 Source for information on setting up your environment.

Example Edge Delta Pipeline configuration

Simple version

nodes:
- name: my_s3_input
  type: s3_input
  sqs_url: https://sqs.example-queue-123.amazonaws.com
  region: us-example-1

Advanced version

nodes:
- name: my_s3_input
  type: s3_input
  sqs_url: https://sqs.example-queue-123.amazonaws.com
  region: us-example-1
  aws_key_id: EXAMPLEAWSKEYID1234
  aws_sec_key: exampleAwsSecKey9876
  role_arn: arn:aws:iam::example-account-123:role/example-role
  external_id: example-external-id-5678

Required Parameters

name

A descriptive name for the node. This name appears in Visual Pipelines, and you use it to reference the node elsewhere in the YAML. It must be unique across all nodes. Because it is a YAML list element, it begins with a - and a space, followed by the string. It is a required parameter for all nodes.

nodes:
  - name: <node name>
    type: <node type>
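As a rough sketch of how the name is referenced elsewhere in the YAML, the fragment below connects the S3 source to a downstream node by name. The links section with its from and to keys, and the node name my_destination, are assumptions made for illustration and are not taken from this page. Check them against your own pipeline configuration.

```yaml
nodes:
  - name: my_s3_input
    type: s3_input
    sqs_url: <sqs to subscribe>
    region: <aws region>
  # Hypothetical downstream node; substitute a real processor or destination.
  - name: my_destination
    type: <node type>

# Assumed link syntax: data flows from the node named in "from"
# to the node named in "to".
links:
  - from: my_s3_input
    to: my_destination
```

Because links refer to nodes by name, a duplicate name would make the reference ambiguous, which is why names must be unique.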

type: s3_input

The type parameter specifies the type of node being configured. It is specified as a string from a closed list of node types. It is a required parameter.

nodes:
  - name: <node name>
    type: <node type>

sqs_url

The sqs_url parameter specifies the URL of the SQS queue that receives the S3 event notifications. The node subscribes to this queue. It is specified as a string and is required.

nodes:
- name: <node name>
  type: s3_input
  sqs_url: <sqs to subscribe>
  region: <aws region>
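For illustration, standard AWS SQS queue URLs take the form https://sqs.<region>.amazonaws.com/<account ID>/<queue name>. The sketch below uses that form. The account ID 123456789012 and the queue name my-s3-events are hypothetical placeholders, not values from this page.

```yaml
nodes:
- name: my_s3_input
  type: s3_input
  # Hypothetical queue URL: region, 12-digit account ID, then queue name.
  sqs_url: https://sqs.us-west-2.amazonaws.com/123456789012/my-s3-events
  # Region matching the queue URL above.
  region: us-west-2
```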

region

The region parameter specifies the AWS region where the S3 bucket and the SQS queue are located. It is specified as a string and is required.

nodes:
- name: <node name>
  type: s3_input
  sqs_url: <sqs to subscribe>
  region: <aws region>

Optional Parameters

aws_key_id

The aws_key_id parameter is the AWS access key ID for credentials that have all four IAM permissions on the target bucket. It is used together with aws_sec_key. It is specified as a string and is optional.

nodes:
- name: <node name>
  type: s3_input
  sqs_url: <sqs to subscribe>
  region: <aws region>
  aws_key_id: <key>
  aws_sec_key: <secure key>

aws_sec_key

The aws_sec_key parameter is the AWS secret access key for credentials that have all four IAM permissions on the target bucket. It is used together with aws_key_id. It is specified as a string and is optional.

nodes:
- name: <node name>
  type: s3_input
  sqs_url: <sqs to subscribe>
  region: <aws region>
  aws_key_id: <key>
  aws_sec_key: <secure key>

compression

The compression parameter is used to define the compression type for incoming logs. You can specify gzip, zstd, snappy, or uncompressed. It is specified as a string. It is optional and the default is uncompressed.

nodes:
- name: s3_input
  type: s3_input
  region: us-west-2
  sqs_url: <sqs to subscribe>
  compression: gzip

role_arn

The role_arn parameter is used if authentication and authorization are performed using an assumed AWS IAM role. The ARN should include the account ID and the role name. It is specified as a string and is optional, depending on the access configuration.

nodes:
- name: <node name>
  type: s3_input
  sqs_url: <sqs to subscribe>
  region: <aws region>
  role_arn: <role ARN>
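IAM role ARNs follow the pattern arn:aws:iam::<account ID>:role/<role name>. A minimal sketch with hypothetical values follows. The account ID 123456789012 and the role name edgedelta-s3-reader are placeholders chosen for illustration only.

```yaml
nodes:
- name: my_s3_input
  type: s3_input
  sqs_url: <sqs to subscribe>
  region: <aws region>
  # Hypothetical ARN: 12-digit account ID followed by role/<role name>.
  role_arn: arn:aws:iam::123456789012:role/edgedelta-s3-reader
```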

external_id

The external_id parameter is a unique identifier that helps prevent a confused deputy attack. It is specified as a string and is optional. When configured, it must be used with role_arn.

nodes:
- name: <node name>
  type: s3_input
  sqs_url: <sqs to subscribe>
  region: <aws region>
  external_id: <ID>
  role_arn: <role ARN>

For advanced authentication options, see AWS IAM Role Authentication.