Deploying the Edge Delta Lambda Forwarder

Deploy the Edge Delta Lambda Forwarder.

Overview

The Lambda Forwarder is a Lambda function that collects AWS Lambda logs from CloudWatch log groups. See Serverless AWS Monitoring.

Note: The Edge Delta Lambda Extension is the preferred solution.

Create an Edge Delta Cloud Fleet

Create a new Edge Delta Cloud Fleet. Choose a new Pipeline configuration and ensure the HTTPS endpoint option is selected.

  1. Click Pipelines.
  2. Click New Fleet.
  3. Select Cloud Fleet.
  4. Optionally, expand Advanced fleet configuration and choose a pipeline configuration to duplicate for the cloud fleet. If you don’t select one, a default configuration will be used.
  5. Click Continue.
  6. Specify a name to identify the Fleet.
  7. Optionally, expand Advanced Settings and select Compute Units based on your estimated traffic volume. This is the maximum bandwidth the agent can handle before signalling an error. The number of compute units used per hour counts towards your plan usage.
  8. Optionally, specify an agent version. The interface lists the current stable version (the latest version number) and most recent candidate version (containing rc). Choose the current stable version. If this configuration doesn’t work you can contact Edge Delta support to experiment with the candidate.
  9. Optionally, select protocols for endpoints on the Cloud Fleet that your sources can push data to. The default is an HTTPS endpoint.
  10. Click Deploy Cloud Fleet.

Copy the HTTPS endpoint from the Cloud Fleets table.

Modify the Pipeline Configuration

Configure the pipeline for the Cloud Fleet as follows:

  1. Feed traffic from the HTTP source to a JSON Unroll node. This node will transform structured JSON logs by breaking down nested JSON array objects into individual log entries.
  • Set the new_field_name parameter to logEvents. This specifies the field under which the contents of each unrolled log will be placed in the resulting output. Each portion of the original array becomes a new log with a top-level field named logEvents.
  • Set the json_field_path also to logEvents. This parameter locates the exact JSON array field that needs to be processed - the incoming JSON object has a key logEvents whose array value will be unrolled.
  2. Connect the JSON Unroll node to a Parse JSON node. This node parses each structured JSON log’s item["body"] into attributes.
  3. Connect the Parse JSON node to a Log Transform node. This node uses the log’s original timestamp as the data item timestamp and deletes the logEvent attribute.
  • Create an upsert operation group with the field path item["timestamp"]. Set its value to json(item["body"]).logEvent.timestamp.
  • Create a delete operation group with a delete operation for the item["attributes"]["logEvent"] field path.
  4. Connect the Log Transform node to an Extract JSON Field node. This node extracts the content of the message field from within the logEvent JSON object and uses it as the log’s body field. Specify the field_path as logEvent.message.

The visual pipeline should be as follows:

An example YAML follows:

version: v3

settings:
  tag: test-forwarder
  log:
    level: info
  archive_flush_interval: 1m0s
  archive_max_byte_limit: 16MB

links:
- from: ed_component_health
  to: ed_health
- from: ed_node_health
  to: ed_health
- from: HTTP Source
  to: JSON Unroll Processor
- from: JSON Unroll Processor
  to: Parse JSON Attributes_6bd0
- from: Parse JSON Attributes_6bd0
  to: timestamp_and_remove_log_event_from_attributes
- from: timestamp_and_remove_log_event_from_attributes
  to: use_normal_message
- from: use_normal_message
  to: ed_archive

nodes:
- name: ed_component_health
  type: ed_component_health_input
- name: ed_node_health
  type: ed_node_health_input
- name: ed_agent_stats
  type: ed_agent_stats_input
- name: ed_pipeline_io_stats
  type: ed_pipeline_io_stats_input
- name: ed_archive
  type: ed_archive_output
- name: ed_health
  type: ed_health_output
- name: HTTP Source
  type: http_input
  port: 80
  read_timeout: 1m0s
- name: JSON Unroll Processor
  type: json_unroll
  new_field_name: logEvents
  json_field_path: logEvents
- name: Parse JSON Attributes_6bd0
  type: parse_json_attributes
  process_field: item["body"]
- name: timestamp_and_remove_log_event_from_attributes
  type: log_transform
  transformations:
  - field_path: item["timestamp"]
    operation: upsert
    value: json(item["body"]).logEvent.timestamp
  - field_path: item["attributes"]["logEvent"]
    operation: delete
- name: use_normal_message
  type: extract_json_field
  field_path: logEvent.message

Deploying the Lambda Function

Get the ARN from the AWS Serverless Application Repository

  1. Open AWS Serverless Application Repository and click Available Applications.
  2. Select Show apps that create custom IAM roles or resource policies.
  3. Search for EdgeDelta and select the forwarder (either ARM64 or AMD64).
  4. Confirm the function template settings, such as the application name.
  5. Enter the HTTPS endpoint for the Cloud Fleet that you copied earlier.
  6. Click I acknowledge that this app creates custom IAM roles and resource policies.
  7. Click Deploy.

The function can be deployed multiple times if necessary by providing different names. Deploying with an existing application name upgrades the existing deployment if an upgrade is available.

The following environment variables can be specified in the deployment form or in the Lambda console:

  • ED_ENDPOINT: Edge Delta Cloud Fleet endpoint. (Required)
  • ED_FORWARD_FORWARDER_TAGS: If set to true, the forwarder Lambda’s own tags are fetched. This requires tag:GetResources and lambda:GetFunctionConfiguration permissions.
  • ED_FORWARD_LOG_GROUP_TAGS: If set to true, log group tags are fetched. Requires tag:GetResources permission.
  • ED_FORWARD_SOURCE_TAGS: If set to true, the tags of the log group’s source are fetched. The Forwarder tries to build an ARN for the source from the log group’s name. This requires the tag:GetResources permission. If the source is a Lambda function, it also requires the lambda:GetFunctionConfiguration permission, and it only works if the log group name is in the correct format (that is, /aws/lambda/<lambda_name>).
  • ED_PUSH_TIMEOUT_SEC: The total time to wait for a batch of logs to be pushed, in seconds. Default is 10.
  • ED_RETRY_INTERVAL_MS: The initial interval to wait before the next retry, in milliseconds. It increases exponentially until the Edge Delta process is shut down. Default is 100.
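
The tag-fetching options above name two IAM permissions, tag:GetResources and lambda:GetFunctionConfiguration. As a minimal sketch, assuming you know the name of the forwarder's execution role, an inline policy granting both could be attached from the CLI. The function names, the policy name edgedelta-forwarder-tag-fetch, and the role name argument are hypothetical placeholders; aws iam put-role-policy is the standard AWS CLI command for inline role policies. The wildcard resource is used here for brevity.

```shell
# Print a minimal policy document covering the two tag-fetching permissions
# named in this section. "Resource": "*" is a simplification.
tag_fetch_policy() {
    cat <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["tag:GetResources", "lambda:GetFunctionConfiguration"],
      "Resource": "*"
    }
  ]
}
EOF
}

# Attach the policy inline to the forwarder's execution role.
# $1 = role name (placeholder); the policy name is an arbitrary label.
attach_tag_fetch_policy() {
    aws iam put-role-policy \
        --role-name "$1" \
        --policy-name "edgedelta-forwarder-tag-fetch" \
        --policy-document "$(tag_fetch_policy)"
}
```

To apply it, call attach_tag_fetch_policy with the role name shown on the function's Configuration > Permissions page in the Lambda console.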

To configure Environment Variables for the Lambda function:

  1. Open the Functions page of the Lambda console.
  2. Choose a function.
  3. Choose Configuration, then choose Environment variables.
  4. Under Environment variables, choose Edit.
  5. Choose Add environment variable.
  6. Enter a key and value.
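
The console steps above can also be approximated from the CLI. The following is a sketch rather than an official procedure: the helper names are hypothetical, and the values you pass are your own. Note that aws lambda update-function-configuration replaces the function's entire set of environment variables with the ones given in --environment, so include every variable you want to keep.

```shell
# Join KEY=VALUE arguments into the shorthand form expected by --environment,
# for example: Variables={ED_ENDPOINT=...,ED_PUSH_TIMEOUT_SEC=10}
build_env_arg() {
    local IFS=','
    printf 'Variables={%s}\n' "$*"
}

# Set environment variables on the forwarder function.
# $1 = function name (placeholder); remaining arguments are KEY=VALUE pairs.
set_forwarder_env() {
    local function_name="$1"
    shift
    aws lambda update-function-configuration \
        --function-name "$function_name" \
        --environment "$(build_env_arg "$@")"
}
```

For example, set_forwarder_env "<name_of_the_forwarder_lambda>" "ED_ENDPOINT=<https_endpoint>" "ED_PUSH_TIMEOUT_SEC=10" would set the required endpoint and keep the default push timeout explicit.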

Assign Permissions

Grant the CloudWatch Logs service permission to invoke the forwarder Lambda function using the CLI:

aws lambda add-permission \
    --function-name "<name_of_the_forwarder_lambda>" \
    --statement-id "<sid_for_policy>" \
    --principal "logs.amazonaws.com" \
    --action "lambda:InvokeFunction" \
    --source-arn "<arn_of_the_log_group_you_want_to_consume>" \
    --source-account "<aws_account_id>"
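
The --source-arn value in the command above is the ARN of the log group. As a small sketch, assuming the usual CloudWatch Logs form arn:aws:logs:<region>:<account_id>:log-group:<log_group_name>:*, a helper like the following can assemble it. The function name log_group_arn is hypothetical.

```shell
# Build a CloudWatch Logs log group ARN.
# $1 = region, $2 = AWS account ID, $3 = log group name
log_group_arn() {
    printf 'arn:aws:logs:%s:%s:log-group:%s:*\n' "$1" "$2" "$3"
}
```

Its output can be passed directly, for example --source-arn "$(log_group_arn us-west-2 <aws_account_id> /aws/lambda/<lambda_name>)".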

Subscribe the function to CloudWatch

Set up CloudWatch Logs subscription in the CLI:

aws logs put-subscription-filter \
    --log-group-name "<the_log_group_you_want_to_consume>" \
    --filter-name "<name_of_the_filter_just_for_display_purpose>" \
    --filter-pattern "<filter_pattern_for_logs_if_needed_to_send_logs_matching_with_pattern>" \
    --destination-arn "<arn_of_the_forwarder_lambda>"
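
When several log groups feed the same forwarder, the command above can be repeated in a loop. This is a sketch under stated assumptions: the function name subscribe_log_groups and the filter name edgedelta-forwarder are hypothetical, and the empty filter pattern is chosen so that every log event is forwarded. Each log group still needs the invoke permission described in the previous section.

```shell
# Subscribe one or more log groups to the forwarder Lambda.
# $1 = ARN of the forwarder Lambda; remaining arguments are log group names.
subscribe_log_groups() {
    local forwarder_arn="$1"
    shift
    local group
    for group in "$@"; do
        # An empty filter pattern matches all log events.
        aws logs put-subscription-filter \
            --log-group-name "$group" \
            --filter-name "edgedelta-forwarder" \
            --filter-pattern "" \
            --destination-arn "$forwarder_arn"
    done
}
```

You can then confirm a subscription with aws logs describe-subscription-filters --log-group-name "<the_log_group_you_want_to_consume>".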

Tag Fetching

Building a source ARN from log groups and log streams is not straightforward in AWS. Moreover, you can’t change log group and stream names. The Forwarder can fetch Lambda and SNS tags without any extra effort. However, SageMaker log group and stream names differ too much. In this instance, the Forwarder is only able to fetch tags of the source of the log group and stream.

Building ARNs

The Forwarder builds ARNs according to these conventions:

  • ECS: The log configuration is defined in the task definition. A typical task definition is as follows:
        "logConfiguration": {
            "logDriver": "awslogs",
            "options": {
                "awslogs-create-group": "true",
                "awslogs-group": "/ecs/test-cluster/test-service",
                "awslogs-region": "us-west-2",
                "awslogs-stream-prefix": "ecs"
            }
        }

For ECS, the Forwarder recognizes two log group naming conventions:

  • /ecs/{cluster_name}: Forwarder fetches ECS cluster tags

  • /ecs/{cluster_name}/{service_name}: Forwarder fetches ECS cluster and service tags.

  • EC2: Typically, you install the CloudWatch agent on the EC2 instance to send EC2 logs to CloudWatch. You can specify a log group name and a stream name in the CloudWatch agent configuration. The Forwarder expects the following log group name to fetch the tags of the EC2 instance: /ec2/instance/{instanceID}. Additionally, the Forwarder can fetch VPC logs, and VPC log groups can be specified. The Forwarder expects the following log group name to fetch the tags of the VPC: /ec2/vpc/{vpcID}.

  • Other Services: For other services, the Forwarder assumes the following log group name format and tries to build the ARNs:

/aws/<service>/<resource_name> or /aws/<service>/<resource_type>/<resource_name>/...

ARNs: arn:aws:{service}:{region}:{account}:{resource_name} or arn:aws:{service}:{region}:{account}:{resource_name}/{resource_type}.....
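
To check whether a log group name fits one of the conventions above before relying on tag fetching, a small classifier can help. This is an illustrative sketch only, not the Forwarder's implementation: the function name classify_log_group and its output format are hypothetical, and it only mirrors the name patterns listed in this section.

```shell
# Report which naming convention a log group name matches and the
# identifiers that would be read from it.
classify_log_group() {
    local name="$1" rest
    case "$name" in
        /ecs/*/*)
            # /ecs/{cluster_name}/{service_name}
            rest="${name#/ecs/}"
            printf 'ecs cluster=%s service=%s\n' "${rest%%/*}" "${rest#*/}" ;;
        /ecs/*)
            # /ecs/{cluster_name}
            printf 'ecs cluster=%s\n' "${name#/ecs/}" ;;
        /ec2/instance/*)
            printf 'ec2 instance=%s\n' "${name#/ec2/instance/}" ;;
        /ec2/vpc/*)
            printf 'ec2 vpc=%s\n' "${name#/ec2/vpc/}" ;;
        /aws/*/*)
            # /aws/<service>/<resource_name> and longer forms
            rest="${name#/aws/}"
            printf 'aws service=%s resource=%s\n' "${rest%%/*}" "${rest#*/}" ;;
        *)
            printf 'unknown\n' ;;
    esac
}
```

A name that prints unknown does not follow any of the listed conventions, so source tags are unlikely to be resolved for it.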

Benchmarks

The Edge Delta Forwarder can process 10 MB per minute continuously with one Cloud Fleet.
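
At a sustained 10 MB per minute, the benchmark corresponds to 10 x 1,440 = 14,400 MB per day. As a rough sizing sketch, the helpers below compare a daily log volume against that figure. The function names are hypothetical, and the check uses a daily average, so it says nothing about short bursts above the benchmark rate.

```shell
# Benchmark figure from this section: 10 MB per minute with one Cloud Fleet.
BENCHMARK_MB_PER_MIN=10

# Convert a daily log volume in MB to an average MB-per-minute rate, rounded up.
avg_mb_per_min() {
    echo $(( ($1 + 1439) / 1440 ))
}

# Succeed (exit status 0) when the average rate is within the benchmark.
within_benchmark() {
    [ "$(avg_mb_per_min "$1")" -le "$BENCHMARK_MB_PER_MIN" ]
}
```

For example, within_benchmark 9000 && echo "fits" reports that an average of 9,000 MB per day stays under the benchmarked rate.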