Edge Delta Enrichment Filter

Modify data as it is being streamed through the Edge Delta agent.

An enrichment filter modifies data as it is being streamed through the Edge Delta agent. You can configure it, for example, to add K8s attributes or cloud metadata to a new or existing field to make troubleshooting easier, or to transform a field to match a particular data format.

Where the value of the enrichment is determined based on a field in the logs, only the first log received is used to determine the value. That value is used for the enrichment for all subsequent logs in the same source.

Enrichment Filters

Enrichment filters are configured by specifying a field and the logic to enrich it. Enrichment can add the field to the log if it doesn’t exist or update the values for an existing field. The following sections describe the types of enrichment filter, along with the failure_behavior setting that controls what happens when enrichment fails:

  • from_path
  • from_k8s
  • dynamic
  • from_logs
  • failure_behavior

from_path

The from_path enrichment filter is used to add or update a specific field with data extracted from a path that you define using a regex path pattern. The pattern must be a capture pattern for only one capture group:

filters:
  - name: enrichment-full
    type: enrichment
    from_path:
      field_mappings:
        - field_name: application
          pattern: /var/logs/anyDir/(?:(.+)/)?users/.*
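
As a sketch of how the single capture group behaves, the variant below uses a directory layout, filter name, and field name (team) that are hypothetical and not part of the original example. The comments trace what the pattern would capture for one assumed path.

```yaml
filters:
  - name: enrichment-path-sketch     # hypothetical filter name
    type: enrichment
    from_path:
      field_mappings:
        # Assumed source path: /var/log/apps/payments/current.log
        # The single capture group ([^/]+) matches "payments",
        # so the team field would be set to "payments".
        - field_name: team
          pattern: /var/log/apps/([^/]+)/.*
```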

from_k8s

The from_k8s enrichment filter is used to enrich logs with Kubernetes object attributes determined using the first log received. You limit the filter to a specific pod with the pod_identifier_pattern regex pattern. The required changes are specified in the field_mappings section per field name. You specify node, namespace, or pod as the K8s object attribute and provide the source and target data. You also provide an operation type to perform on the source to generate the target. The type can be replace, to find the source instances and replace them with the target, or regex, indicating that the source is a regex pattern and all matches should be replaced with the target value.

In the following example, where a log is from a pod type resource, from the specified pod location (the pod_identifier_pattern), and there is an instance_id field, all instances of dashes will be replaced with underscores in that field. In addition, any regex matches for the test* pattern will be removed. The scope can be broadened to other resources such as an entire namespace or a service. Note that for fields drawn from labels, the pod_attribute value should start with labels., such as labels.service.

filters:
  - name: enrichment-full
    type: enrichment
    from_k8s:
      pod_identifier_pattern: /var/logs/anyDir/MyApp/users/(?:(.+)/)/.*
      field_mappings:
        - field_name: instance_id
          pod_attribute: pod
          transformers:
            - source: "-"
              target: "_"
              type: "replace"
            - source: "test*"
              target: ""
              type: "regex"
        - field_name: namespace
          pod_attribute: namespace
        - field_name: service
          pod_attribute: labels.service

dynamic

Dynamic fields are populated from other existing fields in the first log. The field mapping specifies a field_name and a value, which is either a static value or a valid field_name reference in a text template format. Any field names used as values in a dynamic field must already be defined elsewhere, and fields that are dependencies for dynamic fields should be defined first. In the following example, the values of the application and service fields, defined earlier in the from_path and from_k8s sections respectively, are used to define the tag field. The tag field is then referred to in the version value. A static field is defined with the name static_field and it is used in the derived_from_static_field value.

filters:
  - name: enrichment-full
    type: enrichment
    dynamic:
      field_mappings:
      - field_name: tag
        value: "tail.{{.application}}.{{.service}}"
      - field_name: version
        value: "v.0.1.13.{{.tag}}"
      - field_name: static_field
        value: "static_value"
      - field_name: derived_from_static_field
        value: "derived_from_static.{{.static_field}}"
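
To make the dependency order concrete, the fragment below lists the values these mappings would resolve to. It assumes the first log carried application set to checkout and service set to cart; both input values are hypothetical.

```yaml
# Assumed inputs (hypothetical): application=checkout, service=cart
tag: tail.checkout.cart
version: v.0.1.13.tail.checkout.cart
static_field: static_value
derived_from_static_field: derived_from_static.static_value
```

Because version references tag, and tag references application and service, each field only resolves once the fields it depends on have a value.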

Dynamic Enrichment from AWS and GCP

You can draw dynamic instance metadata from an AWS or GCP instance by starting the field value with .aws-instance or .gcp.

filters:
  - name: enrichment-aws
    type: enrichment
    dynamic:
      field_mappings:
        - field_name: "instance_id"
          value: '{{".aws-instance.instance-id"}}'
        - field_name: "instance_type"
          value: '{{".aws-instance.instance-type"}}'
        - field_name: "cluster_name"
          value: '{{".aws-instance.cluster-name"}}'
        - field_name: "ec2launchtemplate_id"
          value: '{{".aws-instance.ec2launchtemplate-id"}}'
        - field_name: "ec2launchtemplate_version"
          value: '{{".aws-instance.ec2launchtemplate-version"}}'
        - field_name: "inspector_enabled"
          value: '{{".aws-instance.inspector-enabled"}}'
        - field_name: "cluster_autoscaler_enabled"
          value: '{{".aws-instance.cluster-autoscaler-enabled"}}'
        - field_name: "autoscaling_groupName"
          value: '{{".aws-instance.autoscaling-groupName"}}'
        - field_name: "nodegroup_name"
          value: '{{".aws-instance.nodegroup-name"}}'
        - field_name: "ec2_fleet_id"
          value: '{{".aws-instance.ec2-fleet-id"}}'
  - name: enrichment-gcp
    type: enrichment
    dynamic:
      field_mappings:
        - field_name: "project_id"
          value: '{{".gcp.project.project-id"}}'
        - field_name: "hostname"
          value: '{{".gcp.instance.hostname"}}'
        - field_name: "zone"
          value: '{{".gcp.instance.zone"}}'
        - field_name: "instance_id"
          value: '{{".gcp.instance.id"}}'
        - field_name: "instance_name"
          value: '{{".gcp.instance.name"}}'
        - field_name: "instance_tags"
          value: '{{".gcp.instance.tags"}}'
        - field_name: "cluster_name"
          value: '{{".gcp.instance.attributes.cluster-name"}}'
        - field_name: "gcp_image_tag"
          value: '{{".gcp.instance.image"}}'
        - field_name: "gcp_dev_name"
          value: '{{".gcp.instance.disks.0.device-name"}}'

For more information about instance metadata retrieval, see the AWS and GCP documentation (external links).

from_logs

The from_logs parameter is used to enrich logs with data extracted from the first log received for each source. Field mappings can be either a regex pattern with one capture group or a JSON path. You can specify a fallback value if the JSON path enrichment fails.

filters:
  - name: enrichment-full
    type: enrichment
    from_logs:
      field_mappings:
        - field_name: podname
          pattern: "podname: (\\w+)"
        - field_name: component
          json_path: fields.[1].component
          fallback_value: servicehandler

failure_behavior

If the enrichment fails, you can specify an action to take using the failure_behavior parameter. The skip_failing_fields value is the default behavior. You can set it to stop_enrichment, or, if the enrichment is not from_logs, you can set it to drop_source.

filters:
  - name: enrichment-failure-behavior
    type: enrichment
    failure_behavior: stop_enrichment
    dynamic:
      field_mappings:
        - field_name: "service"
          value: '{{".labels.service"}}'
        - field_name: "source"
          value: '.annotations.kubernetes.io/{{.container_name}}.logs'
          json_path: "[0].source"
          fallback_value: '{{".short_container_image"}}'
        - field_name: '[[if eq .controllerKind "replicaset"]]kube_deployment[[else]]kube_[[.controllerKind]][[end]]'
          value: "{{.controllerName}}"
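
For comparison, a minimal sketch of the drop_source option is shown below. It pairs the option with a from_path enrichment, since drop_source is described above as unavailable for from_logs. The filter name is hypothetical, and what the agent does with a dropped source is not detailed in this section.

```yaml
filters:
  - name: enrichment-drop-on-failure   # hypothetical filter name
    type: enrichment
    # drop_source is described above as available when the
    # enrichment is not from_logs; from_path is used here.
    failure_behavior: drop_source
    from_path:
      field_mappings:
        - field_name: application
          pattern: /var/logs/anyDir/(?:(.+)/)?users/.*
```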

Enrichment Filter Example

The following configuration enriches logs to conform to a Datadog schema. It does this by adding tags with data drawn from the pod output and AWS metadata in the first log processed. This filter is specified in an input specification for a Kubernetes environment. The input is included in a workflow to complete the configuration that connects the input with the output along with a processor.

workflows:
  cluster-all:
    input_labels:
    - web
    processors:
    - clustering
    destinations:
    - my-datadog
    
inputs:
  kubernetes:
  - labels: web
    include:
      - "namespace=web"
    filters:
      - enrichment-dd
      
filters:
  - name: enrichment-dd
    type: enrichment
    dynamic:
      field_mappings:
        - field_name: instance_id
          value: .aws-instance.instance-id
        - field_name: host
          value: .aws-instance.instance-id
        - field_name: instance_type
          value: .aws-instance.instance-type
        - field_name: service
          value: .annotations.ad.datadoghq.com/{{.container_name}}.logs
          json_path: '[0].service'
          fallback_value: '{{.short_container_image}}'
        - field_name: cluster
          value: .aws-instance.cluster-name
        - field_name: container_id
          value: '{{.docker_id}}'
        - field_name: image_name
          transformers:
          - source: :\S+
            target: ""
            type: regex
          value: '{{.container_image}}'
        - field_name: image_tag
          transformers:
          - source: '\S+:'
            target: ""
            type: regex
          value: '{{.container_image}}'

Suppose an agent with this configuration received the following log:

2022-10-13 14:41:45 ERROR ReceiveBoxesAsync Exception with message: Can't create Mocha context: Error {MochaNotFound}, Exception : System.Exception: Can't create Mocha context: Error {MochaNotFound}

The filter will enrich it by adding parameters for Datadog as follows:

cluster:sandbox-me,
container_id:123a456b789d123e456f789getc,
container_image:username/image:latest,
container_name:mocha,
controllerkind:deployment,
controllerlogicalname:mocha,
controllername:mocha-1abc23d456,
datadog.index:main,deployment,
docker_id:123a456b789d123e456f789getc123a456b789d123e456f789getc123a4etc,
ed_tag:development,
edgedelta_datatype:cluster_sample,
host:123a456b789d123etc,
image_name:username/image,
image_tag:latest,
instance_id:123a456f789getc,
instance_type:t2.medium,
integration:edgedelta,
labels.k8s-app:mocha,
labels.pod-template-hash:123aetc,
labels.version:v1,
logicalsource:k8s,mocha,
namespace_name:v1env,
pod_id:123a456b789d123e456f789getc123a4etc,
pod_name:mocha-123a456b789d1etc,
service:some-service,
source:v1env_mocha-1233456f789getc_mocha,
sourcename:v1env_mocha-1233456f789getc_mocha,
sourcetype:k8s,
tag:mine-2,v1env
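
The service mapping in this example reads a JSON annotation on the pod and takes the service key of its first element. A pod annotation of roughly the following shape would yield service:some-service for a container named mocha. The source value nodejs is an assumption made for illustration.

```yaml
# Pod metadata sketch; "nodejs" is a hypothetical source value.
metadata:
  annotations:
    ad.datadoghq.com/mocha.logs: '[{"source": "nodejs", "service": "some-service"}]'
```

If the annotation is missing, the mapping's fallback_value applies, so service would be taken from the short_container_image field instead.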

See here for an in-depth discussion about using an enrichment filter for Datadog.

Required Parameters

name (required)

The name parameter specifies the name for the filter. You refer to this name in other places, for example to refer to a specific filter in a workflow or processor. Names must be unique within the filters: section. It is a YAML list element, so it begins with a - and a space followed by the string. A name is a required parameter for a filter.

filters:
  - name: <filter-name>

type: enrichment (required)

The type parameter in the filter context specifies the type of filter to apply. A type is a required parameter for a filter.

filters:
  - name: <filter-name>
    type: <filter-type>

Enrichment Type (required)

If you specify the filter type as enrichment, you must specify an enrichment type. An enrichment type is a required parameter for enrichment filters. You can specify one of the following:

  • dynamic
  • from_k8s
  • from_logs
  • from_path

The failure_behavior parameter is set alongside the enrichment type, at the same level, rather than in place of it.

filters:
  - name: <filter-name>
    type: enrichment
    <enrichment-type>:
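
The examples earlier on this page all use the name enrichment-full, which suggests they are sections of a single filter. Assuming several enrichment types can be combined under one filter in that way, a condensed sketch built only from those earlier fragments might look like this:

```yaml
filters:
  - name: enrichment-full
    type: enrichment
    from_path:
      field_mappings:
        - field_name: application
          pattern: /var/logs/anyDir/(?:(.+)/)?users/.*
    from_k8s:
      pod_identifier_pattern: /var/logs/anyDir/MyApp/users/(?:(.+)/)/.*
      field_mappings:
        - field_name: service
          pod_attribute: labels.service
    dynamic:
      field_mappings:
        # application and service are defined above, so tag can reference them.
        - field_name: tag
          value: "tail.{{.application}}.{{.service}}"
```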

field_mappings (required)

The field_mappings parameters define the fields that will be enriched by an enrichment filter. The child parameters depend on the type of enrichment filter being configured. All field_mappings have a - field_name child parameter and a further parameter such as a value, regex, path, or transformer that defines the enrichment for that field. A field_mappings parameter is required for an enrichment filter.

filters:
  - name: <filter-name>
    type: enrichment
    <enrichment-type>:
      field_mappings:
        - field_name: <field-name>

field_name (required)

The field_name parameter defines the log field that will be enriched, either by updating the value if the field already exists in the log, or by adding the field if it doesn’t already exist. It is specified as a string. The field_name parameter is required for an enrichment filter.

filters:
  - name: <filter-name>
    type: enrichment
    <enrichment-type>:
      field_mappings:
        - field_name: <field-name>

Optional Parameters

fallback_value

The fallback_value defines a default value to use if enrichment for the field_name fails. It can be used in from_logs and dynamic enrichment filters. It is specified as a string.

filters:
  - name: <filter-name>
    type: enrichment
    <enrichment-type>:
      field_mappings:
        - field_name: <field-name>
          pattern|value|json_path: <enrichment_value>
          fallback_value: <default_value>

json_path

The json_path parameter is used to select a value from a specific location in a JSON document in dynamic and from_logs enrichment filters. This value will be applied to the object extracted from the field_name field. It is written in JSONPath format. The json_path parameter is optional in enrichment filters.

filters:
  - name: <filter-name>
    type: enrichment
    <enrichment-type>:
      field_mappings:
        - field_name: <field-name>
          json_path: <JSONPath>

pattern

The pattern parameter defines a Golang regex query for selecting data. In the context of an enrichment filter it is used as a value to apply to the field identified in the field_name. It can be used in a from_logs or from_path enrichment filter. It must be a capture pattern and only one capture group can be specified. The pattern parameter is written in Golang regex format and it is optional.

filters:
  - name: <filter-name>
    type: enrichment
    <from_logs|from_path>:
      field_mappings:
        - field_name: <field-name>
          pattern: <regex_pattern_for_enrichment_value>

pod_attribute

The pod_attribute parameter is used in a from_k8s enrichment filter to further constrain the field mapping to a particular Kubernetes object type. It can have the value pod, namespace, node, or a label. If the attribute is a label, it should have the labels. prefix, such as labels.service. It is an optional parameter.

filters:
  - name: <filter-name>
    type: enrichment
    from_k8s:
      pod_identifier_pattern: <pattern>
      field_mappings:
        - field_name: <field-name>
          pod_attribute: <<labels.>object-type>

pod_identifier_pattern

The pod_identifier_pattern constrains the enrichment filter to only the logs that match the directory location of the logs, that is, the destination they are being written to as specified in the kubelet and container runtime configuration. It is specified as a Golang regex pattern that contains a directory path. The pod_identifier_pattern is an optional parameter.

filters:
  - name: <filter-name>
    type: enrichment
    from_k8s:
      pod_identifier_pattern: <regex-path-pattern>

transformers

The transformers parameter specifies the input, process, and output in an enrichment filter that edits a specified field_name. It is a list of dictionaries, each with the following child parameters:

  • source specifies the data to change.
  • target defines the desired output.
  • type specifies the process required to derive the output. Type can either be replace, to find the source instances and replace them with the target, or regex, indicating that the source is a regex pattern and all matches should be replaced with the target value.

filters:
  - name: <filter-name>
    type: enrichment
    from_k8s:
      pod_identifier_pattern: <pattern>
      field_mappings:
        - field_name: <field-name>
          pod_attribute: <<labels.>object-type>
          transformers:
            - source: "<find-string>"
              target: "<replace-string>"
              type: "<regex|replace>"

value

The value parameter is used in a dynamic enrichment filter to specify the formula for creating a derived value for the field_name. It is specified in Go text template format, and it references other existing field names. Alternatively, it can specify a static value string for the field.

filters:
  - name: <filter-name>
    type: enrichment
    dynamic:
      field_mappings:
      - field_name: <field-name>
        value: "<Go-text-template>"
      - field_name: <static-field-name>
        value: "<static-value>"