Edge Delta Amazon S3 Output
Overview
This output type sends logs to an AWS S3 endpoint.
Create an IAM User and Attach a Custom Policy
Before you configure your Edge Delta account to send logs to an AWS S3 endpoint, you must first access the AWS console to:
- Create an IAM user to access the AWS S3 bucket. To learn how to create an IAM user, review this document from AWS.
- Attach the custom policy to the newly created IAM user. To learn how to create and add a custom policy, review this document from AWS.
The custom policy lists 3 permissions:
- PutObject
- GetObject
- ListBucket
If you want to create an S3 archive for rehydration purposes only, then at a minimum, your custom policy must include GetObject.
All other permissions are only required for archiving purposes. As a result, if you prefer, you can create 2 different S3 archive integrations with different custom policies.
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "VisualEditor0",
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::<account-number>:role/<role-name>"
      },
      "Action": [
        "s3:PutObject",
        "s3:GetObject",
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::bucket-name",
        "arn:aws:s3:::bucket-name/*"
      ]
    }
  ]
}
Example
outputs:
  archives:
    - name: my-s3
      type: s3
      aws_key_id: '{{ Env "AWS_KEY_ID" }}'
      aws_sec_key: '{{ Env "AWS_SECRET_KEY" }}'
      bucket: testbucket
      region: us-east-2
    - name: my-s3-assumes-role
      type: s3
      role_arn: "arn:aws:iam::1234567890:role/ed-s3-archiver-role"
      external_id: "053cf606-8e80-47bf-b849-8cd1cc826cfc"
      bucket: testbucket
      region: us-east-2
    - name: my-s3-archiver
      type: s3
      aws_key_id: '{{ Env "AWS_KEY_ID" }}'
      aws_sec_key: '{{ Env "AWS_SECRET_KEY" }}'
      bucket: testbucket
      region: us-east-2
      disable_metadata_ingestion: true
      path_prefix:
        order:
          - Year
          - Month
          - Day
          - Hour
          - 5 Minute
          - OtherTags.role
        format: year=%s/month=%s/day=%s/hour=%s/minute=%s/role=%s/
Parameters
name
Required
Enter a descriptive name for the output or integration.
For outputs, this name will be used to map this destination to a workflow.
name: s3
integration_name
Optional
This parameter refers to the organization-level integration created in the Integrations page.
If you need to add multiple instances of the same integration into the config, then you can add a custom name to each instance via the name parameter. In this situation, the name should be used to refer to the specific instance of the destination in the workflows.
integration_name: orgs-aws-s3
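To illustrate the multiple-instance case described above, here is a minimal sketch. It assumes the integration is referenced under outputs.archives, as in the Example section; the two name values are hypothetical.

```yaml
outputs:
  archives:
    # Both entries point at the same organization-level integration.
    - integration_name: orgs-aws-s3
      name: s3-archive-primary      # hypothetical instance name
    - integration_name: orgs-aws-s3
      name: s3-archive-secondary    # hypothetical instance name
```

In this sketch, workflows would refer to s3-archive-primary or s3-archive-secondary rather than to orgs-aws-s3.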
type
Required
Enter s3.
type: s3
bucket
Required
Enter the target S3 bucket to which the archived logs will be sent.
bucket: "testbucket"
region
Required
Enter the specified S3 bucket’s region.
region: "us-east-2"
path_prefix
The path_prefix parameter is used to override the default path structure of <Year>/<Month>/<Day>/<Hour>/<Tag>/. The following tags can be used:
- “Year”
- “Month”
- “Day”
- “<any number that can divide 60> Minute”
- “Hour”
- “Tag”
- “Host”
- “OtherTags.<tag name>” (for example, “OtherTags.role”)
- “LogTags.<tag name>”
Amazon Elastic Container Service:
- “ECSCluster”
- “ECSContainerName”
- “ECSTaskFamily”
- “ECSTaskVersion”
Kubernetes:
- “K8sNamespace”
- “K8sControllerKind”
- “K8sControllerLogicalName”
- “K8sPodName”
- “K8sContainerName”
- “K8sContainerImage”
Docker:
- “DockerContainerName”
- “DockerImageName”
The order child parameter defines the path structure.
The format child parameter must contain exactly as many “%s” placeholders as there are entries in order. The placeholders are filled, in sequence, with the values of the order entries.
Curly braces are not allowed in format. A custom path_prefix format is not supported for rehydration, so an integration that uses a custom path_prefix format cannot be the source for rehydration.
For some big data applications, such as BigQuery and AWS Athena, use the following format:
format: year=%s/month=%s/day=%s/hour=%s/minute=%s/role=%s/
outputs:
  archives:
    - name: <archive name>
      type: s3
      aws_key_id: '{{ Env "AWS_KEY_ID" }}'
      aws_sec_key: '{{ Env "AWS_SECRET_KEY" }}'
      bucket: <bucket name>
      region: <region>
      disable_metadata_ingestion: true|false
      path_prefix:
        order:
          - Year
          - Month
          - Day
          - Hour
          - 5 Minute
          - OtherTags.role
        format: year=%s/month=%s/day=%s/hour=%s/minute=%s/role=%s/
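The rule that format needs one “%s” per order entry can be shown with a second variation. This is a minimal sketch that assumes a Kubernetes source. The archive name and the key names used in format (such as namespace=) are illustrative choices, not required values.

```yaml
outputs:
  archives:
    - name: my-s3-k8s                # hypothetical archive name
      type: s3
      aws_key_id: '{{ Env "AWS_KEY_ID" }}'
      aws_sec_key: '{{ Env "AWS_SECRET_KEY" }}'
      bucket: testbucket
      region: us-east-2
      path_prefix:
        # Six entries in order ...
        order:
          - Year
          - Month
          - Day
          - Hour
          - 15 Minute                # 15 divides 60, so it is an accepted minute tag
          - K8sNamespace
        # ... so format carries exactly six %s placeholders and no curly braces.
        format: year=%s/month=%s/day=%s/hour=%s/minute=%s/namespace=%s/
```

As with any custom path_prefix format, an archive configured this way cannot serve as a rehydration source.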
aws_key_id
Optional
Enter the AWS key ID that has the PutObject permission to target the bucket. If you use role-based AWS authentication where keys are not provided, then you should keep this field empty; however, you must still attach the custom policy.
aws_key_id: '{{ Env "TEST_AWS_KEY_ID" }}'
aws_sec_key
Optional
Enter the AWS secret key that belongs to the key ID with the PutObject permission to target the bucket. If you use role-based AWS authentication where keys are not provided, then you should keep this field empty; however, you must still attach the custom policy.
aws_sec_key: "awssecret123"
role_arn
Optional
Enter the ARN of the IAM role that has permission to access the bucket.
role_arn: "arn:aws:iam::1234567890:role/ed-s3-archiver-role"
external_id
Optional
Enter the external ID associated with the desired IAM role.
external_id: "053cf606-8e80-47bf-b849-8cd1cc826cfc"
compression
Optional
Enter a compression type for archiving purposes.
You can enter gzip, zstd, snappy, or uncompressed.
compression: gzip
encoding
Optional
Enter an encoding type for archiving purposes.
You can enter json or parquet.
encoding: parquet
use_native_compression
Optional
Enter true or false to compress parquet-encoded data.
This option will not compress metadata.
This option can be useful with big data cloud applications, such as AWS Athena and Google BigQuery.
To use this parameter, you must set the encoding parameter to parquet.
use_native_compression: true
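Because use_native_compression only takes effect together with parquet encoding, the two parameters are set in the same archive entry. A minimal sketch, with a hypothetical archive name and the placeholder bucket and region used elsewhere on this page:

```yaml
outputs:
  archives:
    - name: my-s3-parquet            # hypothetical archive name
      type: s3
      aws_key_id: '{{ Env "AWS_KEY_ID" }}'
      aws_sec_key: '{{ Env "AWS_SECRET_KEY" }}'
      bucket: testbucket
      region: us-east-2
      encoding: parquet              # required for use_native_compression
      use_native_compression: true   # compresses the parquet data, not the metadata
```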
buffer_ttl
Optional
Enter the length of time for which failed streaming data will be retried.
After this length of time is reached, the failed streaming data will no longer be retried.
buffer_ttl: 2h
buffer_path
Optional
Enter a folder path to temporarily store failed streaming data.
The failed streaming data will be retried until the data reaches its destination or until the buffer_ttl value is reached.
If you enter a path that does not exist, then the agent will create directories, as needed.
buffer_path: /var/log/edgedelta/pushbuffer/
buffer_max_bytesize
Optional
Enter the maximum size of failed streaming data that you want to retry.
If the failed streaming data is larger than this size, then the failed streaming data will not be retried.
buffer_max_bytesize: 100MB
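The three buffer parameters are typically set together in one archive entry. A minimal sketch using the example values from this page; the archive name is hypothetical:

```yaml
outputs:
  archives:
    - name: my-s3-buffered           # hypothetical archive name
      type: s3
      aws_key_id: '{{ Env "AWS_KEY_ID" }}'
      aws_sec_key: '{{ Env "AWS_SECRET_KEY" }}'
      bucket: testbucket
      region: us-east-2
      buffer_path: /var/log/edgedelta/pushbuffer/   # created by the agent if missing
      buffer_ttl: 2h                 # stop retrying failed data after two hours
      buffer_max_bytesize: 100MB     # failed data larger than this is not retried
```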
disable_metadata_ingestion
Optional
Enter true or false to disable metadata file ingestion.
Typically, metadata is used for rehydration analysis.
disable_metadata_ingestion: true