Deploying the Edge Delta Lambda Forwarder

Deploy the Edge Delta Lambda Forwarder.

Overview

The Lambda Forwarder is a Lambda function that collects AWS Lambda logs from CloudWatch log groups. See Serverless AWS Monitoring.

Note: The Edge Delta Lambda Extension is the preferred solution.

Create an Agent Configuration

Configure a pipeline for the hosted agent to ingest logs sent from the Lambda Forwarder, identify function resources, and populate Lambda tags.

v3 Configuration Example

  1. Click Pipelines - Pipelines.
  2. Click New Pipeline.
  3. Click Continue.
  4. Select Helm and click Continue.
  5. Specify a name for the hosted agent configuration and click Create Configuration.
  6. Add a File Input with the following path: /var/captured_requests/body_*.json
  7. Using the YAML editor, specify line_pattern: '{"cloud":'
  8. Connect the File Input to a Parse JSON Attributes node.
  9. Configure the Parse JSON Attributes node with the process_field item.body.
  10. Connect Parse JSON Attributes to an Extract JSON Field node.
  11. Configure Extract JSON Field to extract the logEvents.[*] field path to assign to the body, and set keep_log_if_failed: true.
  12. Connect another Extract JSON Field.
  13. Configure Extract JSON Field to extract the whole message field path to assign to the body, and set keep_log_if_failed: true.
  14. Connect Extract JSON Field to a Log Transform node.
  15. Configure the Log Transform node to delete attributes.logEvents (the events are already in body). Also delete attributes.timestamp.
  16. Click Pipelines - Pipelines.
  17. Select the configuration, click the kebab (⋮) icon and select Edit YAML.
  18. Add the following agent settings:
  multiline_max_byte_size: 100KB
  max_incomplete_line_buffer_size: 1MB

The visual pipeline should be as follows:

An example v3 yaml follows:

version: v3

settings:
  tag: test-forwarder
  log:
    level: debug
  archive_flush_interval: 1m0s
  multiline_max_byte_size: 100KB
  max_incomplete_line_buffer_size: 1MB

links:
- from: ed_component_health
  to: ed_health
- from: ed_node_health
  to: ed_health
- from: file_input
  to: parse_json_attributes
- from: parse_json_attributes
  to: extract_json_field
- from: parse_json_attributes
  path: failure
  to: extract_json_field
- from: extract_json_field
  to: extract_json_field_e4ad
- from: extract_json_field
  path: failure
  to: extract_json_field_e4ad
- from: log_transform
  to: ed_archive
- from: extract_json_field_e4ad
  to: log_transform

nodes:
- name: ed_component_health
  type: ed_component_health_input
- name: ed_node_health
  type: ed_node_health_input
- name: ed_agent_stats
  type: ed_agent_stats_input
- name: ed_pipeline_io_stats
  type: ed_pipeline_io_stats_input
- name: ed_archive
  type: ed_archive_output
- name: ed_health
  type: ed_health_output
- name: file_input
  type: file_input
  path: /var/captured_requests/body_*.json
  line_pattern: '{"cloud":'
- name: parse_json_attributes
  type: parse_json_attributes
  process_field: item.body
- name: extract_json_field
  type: extract_json_field
  field_path: logEvents.[*]
  keep_log_if_failed: true
- name: log_transform
  type: log_transform
  transformations:
  - field_path: attributes.logEvents
    operation: delete
- name: extract_json_field_e4ad
  type: extract_json_field
  field_path: message
  keep_log_if_failed: true
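
To make the node chain concrete, the following Python sketch mimics what the pipeline does to a single captured request line. It is only an illustration, not Edge Delta's implementation: the run_pipeline helper and the sample payload are assumptions, and only the logEvents and message field names are taken from the configuration above.

```python
import json

def run_pipeline(line):
    """Mimic parse JSON -> extract logEvents.[*] -> extract message."""
    try:
        record = json.loads(line)
    except json.JSONDecodeError:
        # Mirrors keep_log_if_failed: true, the raw line is passed through
        return [line]
    events = record.get("logEvents")
    if not isinstance(events, list):
        return [line]
    bodies = []
    for event in events:
        # Second extraction: the message field becomes the body when present
        if isinstance(event, dict) and "message" in event:
            bodies.append(event["message"])
        else:
            bodies.append(json.dumps(event))
    return bodies

# Hypothetical captured request; real payloads may carry more fields
sample = json.dumps({
    "cloud": {"resource_id": "arn:aws:lambda:us-west-2:123456789012:function:demo"},
    "logGroup": "/aws/lambda/demo",
    "logStream": "2024/01/01/[$LATEST]abcdef",
    "logEvents": [
        {"id": "1", "timestamp": 1704067200000, "message": "START RequestId: 1"},
        {"id": "2", "timestamp": 1704067200100, "message": "END RequestId: 1"},
    ],
})

for body in run_pipeline(sample):
    print(body)
```

Each element of logEvents becomes its own log item, which is why the Log Transform node can then drop the original logEvents attribute.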

v2 Configuration Example

  1. Click Pipelines - Legacy Pipelines.
  2. Click Create Configuration.
  3. Select Legacy and click Continue.
  4. Select Helm and click Continue.
  5. Specify a configuration name and click Create Configuration.
  6. Paste the following agent YAML, and update the tag to match the configuration name you specified in the previous step.
version: v2
agent_settings:
  tag: <configuration name>
  agent_stats_enabled: true
  multiline_max_bytesize: "100 KB"
  max_incomplete_line_buffer_size: "1MB"
  
inputs:
  files:
  - labels: lambda_forwarder_logs
    line_pattern: '{"cloud":'
    path: /var/captured_requests/body_*.json
    filters:
     - lambda-forwarder-input-lambda-source-detection-custom
     - lambda-forwarder-logstream-enrichment
     - message_extraction
    
filters:
  - name: lambda-forwarder-input-lambda-source-detection-custom
    type: source-detection
    source_type: "Custom"
    optional: true
    field_mappings:
      aws.log.group.names: logGroup
      faas.name: "faas.name"
      faas.version: "faas.version"
      cloud.resource_id: "cloud.resource_id"
      cloud.account.id: owner

  - name: lambda-forwarder-logstream-enrichment
    type: enrichment
    from_logs:
      field_mappings:
      - field_name: aws.log.stream.names
        json_path: logStream
    keep_log_if_failed: true
           
  - name: message_extraction
    type: extract-json-field
    field_path: "logEvents[*].message"
   
workflows:
  sample_workflow:
    input_labels:
      - lambda_forwarder_logs
  7. Click Done.
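
As a rough illustration of what the three v2 filters do together, the Python sketch below applies the same field mappings to one payload. It assumes that dotted source paths such as faas.name refer to nested JSON keys, which the configuration does not state explicitly. The helper names and the sample payload are hypothetical.

```python
# Mappings copied from the filters above: target attribute -> source path
SOURCE_FIELDS = {
    "aws.log.group.names": "logGroup",
    "faas.name": "faas.name",
    "faas.version": "faas.version",
    "cloud.resource_id": "cloud.resource_id",
    "cloud.account.id": "owner",
    "aws.log.stream.names": "logStream",  # from the enrichment filter
}

def lookup(payload, dotted_path):
    """Walk a dotted path; return None when any segment is missing."""
    current = payload
    for key in dotted_path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current

def detect_source(payload):
    """Return only the attributes whose source path resolved."""
    found = {target: lookup(payload, src) for target, src in SOURCE_FIELDS.items()}
    return {k: v for k, v in found.items() if v is not None}

def extract_messages(payload):
    """Equivalent of field_path logEvents[*].message."""
    return [e["message"] for e in payload.get("logEvents", []) if "message" in e]

# Hypothetical payload for illustration only
payload = {
    "cloud": {"resource_id": "arn:aws:lambda:us-west-2:123456789012:function:demo"},
    "faas": {"name": "demo", "version": "$LATEST"},
    "owner": "123456789012",
    "logGroup": "/aws/lambda/demo",
    "logStream": "2024/01/01/[$LATEST]abcdef",
    "logEvents": [{"id": "1", "timestamp": 1704067200000, "message": "hello"}],
}

print(detect_source(payload))
print(extract_messages(payload))
```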

Create an Edge Delta Hosted Agent

Create a new Edge Delta Hosted agent. Choose the agent configuration you just created and select the HTTPS endpoint option.

  1. Click Pipelines in the Edge Delta Web App and select Hosted Agents.
  2. Click + Create.
  3. Enter a descriptive name for the hosted agent.
  4. Select an agent version. The interface lists the current stable version (the lower version number) and most recent candidate version (the higher version number). Choose the current stable version. If this configuration doesn’t work you can contact Edge Delta support to experiment with the candidate.
  5. Select a public cloud provider. If your preferred cloud provider is not listed, contact Edge Delta support.
  6. Select the region where the hosted agent will be hosted.
  7. For Config ID, there are 2 options:
  • Select Generate one for me if you want to create a new agent configuration.

Note: this will generate an agent with a v2 configuration.

  • If you have already configured an agent and you want to reuse that configuration, select the configuration ID from the drop down.

Note: Specify a v3 configuration if you want to configure the hosted agent using Visual Pipelines.

  8. Select the HTTPS checkbox if you want your data source to push data to a secure HTTP endpoint on the hosted agent.
  9. Click Create.

Copy the HTTPS endpoint generated for the agent from the Hosted Agents table.

Deploying the Lambda Function

Get the ARN from the AWS Serverless Application Repository

  1. Open AWS Serverless Application Repository and click Available Applications.
  2. Select Show apps that create custom IAM roles or resource policies.
  3. Search for EdgeDelta and select the forwarder (either ARM64 or AMD64).
  4. Confirm the function template settings, such as the application name.
  5. Enter the HTTPS endpoint for the hosted agent that you copied earlier.
  6. Click I acknowledge that this app creates custom IAM roles and resource policies.
  7. Click Deploy.

The function can be deployed multiple times if necessary by providing different names. Deploying with an existing application name upgrades the existing deployment if an upgrade is available.

The following environment variables can be specified in the deployment form or in the Lambda console:

  • ED_ENDPOINT: Edge Delta hosted agent endpoint. (Required)
  • ED_FORWARD_FORWARDER_TAGS: If set to true, the forwarder Lambda’s own tags are fetched. This requires tag:GetResources and lambda:GetFunctionConfiguration permissions.
  • ED_FORWARD_LOG_GROUP_TAGS: If set to true, log group tags are fetched. Requires tag:GetResources permission.
  • ED_FORWARD_SOURCE_TAGS: If set to true, the source log group’s tags are fetched. The Forwarder tries to build an ARN of the source from the log group’s name. This requires the tag:GetResources permission. If the source is a Lambda function, it also requires the lambda:GetFunctionConfiguration permission, and it works only if the log group name is in the correct format (that is, /aws/lambda/<lambda_name>).
  • ED_PUSH_TIMEOUT_SEC: Push timeout is the total waiting duration between sending batches of logs (in seconds). Default is 10.
  • ED_RETRY_INTERVAL_MS: Retry interval is the initial interval to wait before the next retry (in milliseconds). It is increased exponentially until the Edge Delta process is shut down. Default is 100.
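
The sketch below shows one way these variables and their defaults could be read, and how an exponentially growing retry interval behaves. It is not the Forwarder's code: the load_settings and retry_delays_ms helpers, the doubling factor, and the time budget used to stop the schedule are all assumptions made for illustration.

```python
import os

def load_settings(env=os.environ):
    """Read the documented variables, applying the documented defaults."""
    endpoint = env.get("ED_ENDPOINT")
    if not endpoint:
        raise ValueError("ED_ENDPOINT is required")
    return {
        "endpoint": endpoint,
        "push_timeout_sec": int(env.get("ED_PUSH_TIMEOUT_SEC", "10")),
        "retry_interval_ms": int(env.get("ED_RETRY_INTERVAL_MS", "100")),
        "forward_source_tags": env.get("ED_FORWARD_SOURCE_TAGS", "").lower() == "true",
    }

def retry_delays_ms(initial_ms, budget_ms, factor=2):
    """List growing delays whose sum stays within budget_ms.

    factor=2 and the budget cut-off are assumptions; the documentation only
    says the interval grows exponentially.
    """
    delays, total, delay = [], 0, initial_ms
    while total + delay <= budget_ms:
        delays.append(delay)
        total += delay
        delay *= factor
    return delays

# Placeholder endpoint, for illustration only
settings = load_settings({"ED_ENDPOINT": "https://example.invalid/"})
print(settings)
print(retry_delays_ms(settings["retry_interval_ms"], 10_000))
```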

To configure Environment Variables for the Lambda function:

  1. Open the Functions page of the Lambda console.
  2. Choose a function.
  3. Choose Configuration, then choose Environment variables.
  4. Under Environment variables, choose Edit.
  5. Choose Add environment variable.
  6. Enter a key and value.
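
The console steps above can also be scripted. The following Python sketch only builds an AWS CLI command string for aws lambda update-function-configuration and does not call AWS. The function name and endpoint are placeholders, and build_env_command is a hypothetical helper.

```python
import json
import shlex

def build_env_command(function_name, variables):
    """Return a shell-safe AWS CLI command that sets the given variables."""
    # The CLI accepts the --environment value as a JSON document
    environment = json.dumps({"Variables": variables}, sort_keys=True)
    args = [
        "aws", "lambda", "update-function-configuration",
        "--function-name", function_name,
        "--environment", environment,
    ]
    return shlex.join(args)

command = build_env_command(
    "my-edgedelta-forwarder",  # placeholder function name
    {
        "ED_ENDPOINT": "https://example.invalid/",  # placeholder endpoint
        "ED_FORWARD_SOURCE_TAGS": "true",
    },
)
print(command)
```

Note that update-function-configuration replaces the function's whole environment, so include ED_ENDPOINT alongside any optional variables you set this way.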

Assign Permissions

Assign the Lambda invoke permission to the AWS Logs service using the CLI:

aws lambda add-permission \
    --function-name "<name_of_the_forwarder_lambda>" \
    --statement-id "<sid_for_policy>" \
    --principal "logs.amazonaws.com" \
    --action "lambda:InvokeFunction" \
    --source-arn "<arn_of_the_log_group_you_want_to_consume>" \
    --source-account "<aws_account_id>"
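
The --source-arn value is the ARN of the log group. As a small aid, this Python sketch assembles that ARN from its parts using the standard CloudWatch Logs ARN layout. The log_group_source_arn helper and the sample values are hypothetical.

```python
def log_group_source_arn(region, account_id, log_group, partition="aws"):
    """Build a log group ARN; the trailing :* covers the group's log streams."""
    if not (account_id.isdigit() and len(account_id) == 12):
        raise ValueError("AWS account IDs are 12 digits")
    return f"arn:{partition}:logs:{region}:{account_id}:log-group:{log_group}:*"

# Sample values for illustration only
print(log_group_source_arn("us-west-2", "123456789012", "/aws/lambda/demo"))
```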

Subscribe the function to CloudWatch

Set up the CloudWatch Logs subscription using the CLI:

aws logs put-subscription-filter \
    --log-group-name "<log_group_you_want_to_consume>" \
    --filter-name "<name_of_the_subscription_filter>" \
    --filter-pattern "" \
    --destination-arn "<arn_of_the_forwarder_lambda>"
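
Once a log group is subscribed, CloudWatch Logs invokes the function with a base64-encoded, gzip-compressed JSON document under awslogs.data. The Python sketch below decodes such an event to show where fields like logGroup, logStream, owner, and logEvents come from. It is not the Forwarder's own code, and the encode helper and sample record exist only to make the example self-contained.

```python
import base64
import gzip
import json

def decode_subscription_event(event):
    """Decode the payload CloudWatch Logs delivers to a subscribed Lambda."""
    compressed = base64.b64decode(event["awslogs"]["data"])
    return json.loads(gzip.decompress(compressed))

def encode_subscription_event(record):
    """Build an event in the same shape (test helper, hypothetical)."""
    compressed = gzip.compress(json.dumps(record).encode("utf-8"))
    return {"awslogs": {"data": base64.b64encode(compressed).decode("ascii")}}

# Sample record for illustration only
record = {
    "messageType": "DATA_MESSAGE",
    "owner": "123456789012",
    "logGroup": "/aws/lambda/demo",
    "logStream": "2024/01/01/[$LATEST]abcdef",
    "subscriptionFilters": ["edgedelta"],
    "logEvents": [{"id": "1", "timestamp": 1704067200000, "message": "hello"}],
}

decoded = decode_subscription_event(encode_subscription_event(record))
print(decoded["logGroup"], len(decoded["logEvents"]))
```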

Tag Fetching

Building a source ARN from log groups and log streams is not straightforward in AWS. Moreover, you can’t change log group and stream names. The Forwarder can fetch tags for Lambda and SNS sources without any extra effort. However, SageMaker log groups and streams differ too much from these conventions. In that case, the Forwarder can fetch tags only for the source of the log group and stream.

Building ARNs

The Forwarder builds ARNs according to these conventions:

  • ECS: The log configuration is defined in the Task Definition. A typical task definition is as follows:
        "logConfiguration": {
            "logDriver": "awslogs",
            "options": {
                "awslogs-create-group": "true",
                "awslogs-group": "/ecs/test-cluster/-test-service",
                "awslogs-region": "us-west-2",
                "awslogs-stream-prefix": "ecs"
            }
        }

The Forwarder supports two log group conventions:

  • /ecs/{cluster_name}: Forwarder fetches ECS cluster tags

  • /ecs/{cluster_name}/{service_name}: Forwarder fetches ECS cluster and service tags.

  • EC2: Typically you install the CloudWatch agent on the EC2 instance to send EC2 logs to CloudWatch. You can specify a log group name and a stream name in the CloudWatch agent configuration. The Forwarder expects the following log group name to fetch tags of the EC2 instance: /ec2/instance/{instanceID}. Additionally, the Forwarder can fetch VPC logs, and VPC log groups can be specified. The Forwarder expects the following log group name to fetch tags of the VPC: /ec2/vpc/{vpcID}

  • Other Services: For other services, the Forwarder assumes the following format and tries to build the ARNs:

/aws/<service>/<resource_name> or /aws/<service>/<resource_type>/<resource_name>/...

ARNs: arn:aws:{service}:{region}:{account}:{resource_name} or arn:aws:{service}:{region}:{account}:{resource_name}/{resource_type}.....
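
The conventions above can be sketched as a small Python function that maps a log group name to candidate ARNs. This is an approximation for illustration, not the Forwarder's actual logic: build_arns is a hypothetical helper, the ECS, EC2, and Lambda resource formats are the standard AWS ARN layouts, and the generic branch covers only the three-segment /aws/<service>/<resource_name> form.

```python
def build_arns(log_group, region, account):
    """Return candidate ARNs for a log group name, or [] if unrecognized."""
    parts = [p for p in log_group.split("/") if p]

    def arn(service, resource):
        return f"arn:aws:{service}:{region}:{account}:{resource}"

    # /ecs/{cluster_name} and /ecs/{cluster_name}/{service_name}
    if len(parts) in (2, 3) and parts[0] == "ecs":
        cluster = parts[1]
        arns = [arn("ecs", f"cluster/{cluster}")]
        if len(parts) == 3:
            arns.append(arn("ecs", f"service/{cluster}/{parts[2]}"))
        return arns
    # /ec2/instance/{instanceID} and /ec2/vpc/{vpcID}
    if len(parts) == 3 and parts[0] == "ec2" and parts[1] in ("instance", "vpc"):
        return [arn("ec2", f"{parts[1]}/{parts[2]}")]
    # /aws/lambda/<lambda_name>
    if len(parts) == 3 and parts[:2] == ["aws", "lambda"]:
        return [arn("lambda", f"function:{parts[2]}")]
    # Generic /aws/<service>/<resource_name>
    if len(parts) == 3 and parts[0] == "aws":
        return [arn(parts[1], parts[2])]
    return []

# Sample values for illustration only
for name in ("/ecs/test-cluster/test-service", "/ec2/vpc/vpc-0abc", "/aws/lambda/demo"):
    print(name, "->", build_arns(name, "us-west-2", "123456789012"))
```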

Benchmarks

The Edge Delta Forwarder can process 10MB per minute continuously with 1 hosted agent.
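
For capacity planning, the benchmark can be turned into simple arithmetic. The Python sketch below converts the per-minute figure into a daily volume and estimates how many hosted agents a given load might need. The linear scaling across agents is an assumption, not a documented guarantee, and both helpers are hypothetical.

```python
import math

MB_PER_MINUTE_PER_AGENT = 10  # benchmark figure quoted above

def daily_capacity_mb(agents=1):
    """Sustained daily volume, assuming throughput scales linearly."""
    return MB_PER_MINUTE_PER_AGENT * 60 * 24 * agents

def agents_needed(mb_per_minute):
    """Smallest agent count whose combined rate covers the load."""
    return max(1, math.ceil(mb_per_minute / MB_PER_MINUTE_PER_AGENT))

print(daily_capacity_mb())   # 14400 MB per day for one agent
print(agents_needed(25))     # 3
```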