Edge Delta Source Detection Filter

Assign logs to sources using field mappings.

Overview

The Edge Delta agent processes data at per-source granularity. For example, logs from a single input configuration for a K8s namespace come from multiple sources, one per pod-container pair. The agent is normally aware of these sources, and it calculates metrics and identifies patterns on a per-pod or per-container basis. However, some inputs look like a single source but in fact carry logs from multiple or composite sources, so the agent would otherwise treat their logs as one source. To handle such inputs, configure a source detection filter that assigns logs to sources using field mappings. Source detection filters are defined in the filters section of the agent YAML and then referenced in input definitions.

Example Scenarios

There are a few scenarios where you might need to configure a source detection filter:

  • Kafka is configured to stream logs that it has collected from multiple pods. You want the agent to determine which logs belong to which pods in the existing internal K8s source definition.
  • A file source contains logs from multiple K8s pods or containers in multiple namespaces. You want the agent to make calculations per pod or per workload.
  • A port source ingests logs from several applications. You want the agent to consider each application to be a unique source.

Field Mappings

Field mappings determine which logs are assigned to which sources. They are defined as key-value pairs in which the key represents the existing internal source definition and the value represents a field in the log that identifies the source.
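
As a minimal sketch of this idea, assume a JSON log line like the one shown in the comment below. The log content is hypothetical, and the fragment shows only two keys; a complete K8s filter would also define the other mandatory keys.

```yaml
# Hypothetical incoming log line (an assumption for illustration):
#   {"kubernetes": {"namespace": "payments", "container": {"name": "api"}}, "msg": "started"}
field_mappings:
  # key   = entry in the internal source definition
  # value = field in the log that identifies it (written as a JSON path here)
  k8s_namespace: "kubernetes.namespace"            # would pick up "payments"
  k8s_container_name: "kubernetes.container.name"  # would pick up "api"
```

With these mappings, logs that carry different namespace or container values would be assigned to different sources.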

Source Detection Types

Source detection configuration depends on the source type. There are built-in source detection types that you can use, each with predefined keys for the field mappings.

Source Type and Field Mapping Keys

  • Docker
      Mandatory: docker_container_id, docker_container_image
      Optional: docker_container_name, docker_image_name
  • ECS
      Mandatory: ecs_container_id, ecs_container_image, ecs_container_name
      Optional: ecs_cluster, ecs_container, ecs_task_family, ecs_task_version
  • File
      Mandatory: file_path, file_glob_path
  • K8s
      Mandatory: k8s_namespace, k8s_pod_name, k8s_container_name, k8s_container_image
      Optional: k8s_logfile_path, k8s_controller_kind, k8s_controller_name, k8s_controller_logical_name, k8s_pod_id, k8s_docker_id

Not defining the optional keys can cause loss of information.

Instead of defining a path for a mandatory key, you can set its value to a dash ("-") to skip that field and reduce cardinality.

If the source is not one of these types, you can use a custom source type with any field mapping keys.
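
The predefined keys are used in the same way for each built-in type. The following is a sketch only, for an ECS source: the keys are the ECS keys listed above, but the filter name and the JSON paths on the right (ecs.container.image and so on) are hypothetical and depend on how your logs are structured.

```yaml
filters:
  - name: source-detection-ecs      # hypothetical filter name
    type: source-detection
    source_type: "ECS"
    processing_mode: json
    field_mappings:
      # Mandatory ECS keys
      ecs_container_id: "-"                       # skipped with a dash to reduce cardinality
      ecs_container_image: "ecs.container.image"  # hypothetical JSON path
      ecs_container_name: "ecs.container.name"    # hypothetical JSON path
      # Optional ECS key; leaving optional keys out can lose information
      ecs_cluster: "ecs.cluster"                  # hypothetical JSON path
```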

Processing Modes

There are three processing modes you can configure for a source detection filter. The mode determines how the filter interprets the field mapping values:

  • regex - The field mapping value will be configured as a regex pattern with one capture group named “field”.
  • json - The field mapping value will be configured as a JSON path.
  • attribute - Both regex and JSON modes consume data from the log message itself and assume that the fields you need (namespace, container, and so on) exist as strings in the log payload. In attribute mode, the field mapping value instead uses keys from the attribute provider for the parent source. For this to work, either the source must have source-level enrichments or the source detection filter must be configured to execute after the enrichment filters.
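
To make the difference concrete, the fragments below map one custom key, namespace, in each of the three modes. They are illustrative only: the filter names and the log formats in the comments are assumptions, and the attribute key k8s_namespace is assumed to be provided by the parent source.

```yaml
filters:
  # regex: the value is a pattern with one capture group named "field".
  # Assumed log line: namespace payments level=info msg=started
  - name: detect-by-regex           # hypothetical name
    type: source-detection
    source_type: "Custom"
    processing_mode: regex
    field_mappings:
      namespace: 'namespace (?P<field>\w+)'

  # json: the value is a JSON path into the log message.
  # Assumed log line: {"kubernetes": {"namespace": "payments"}}
  - name: detect-by-json            # hypothetical name
    type: source-detection
    source_type: "Custom"
    processing_mode: json
    field_mappings:
      namespace: "kubernetes.namespace"

  # attribute: the value is a key from the parent source's attribute provider,
  # so nothing is read from the log message itself.
  - name: detect-by-attribute       # hypothetical name
    type: source-detection
    source_type: "Custom"
    processing_mode: attribute
    field_mappings:
      namespace: "k8s_namespace"
```

The regex value is single-quoted so that YAML keeps the backslash in \w literally.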

Optional Source Detection

You can configure source detection to be optional or mandatory. If it is optional (optional: true), logs for which source detection fails are still ingested. If it is mandatory (optional: false), which is the default setting, logs are ingested only if their source can be determined.

Source Detection Filter Examples

Kubernetes

This example filter uses K8s field mappings. It is optional and uses JSON paths as field mapping values. The value at the kubernetes.namespace path in each log is mapped to the k8s_namespace key. The k8s_pod_name key is skipped with a dash to reduce cardinality, and the remaining mappings cover other K8s fields.

filters:
  - name: source-detection-k8s
    type: source-detection
    source_type: "K8s"
    optional: true
    processing_mode: json
    field_mappings:
      k8s_namespace: "kubernetes.namespace"
      k8s_pod_name: "-"
      k8s_container_name: "kubernetes.container.name"
      k8s_container_image: "kubernetes.container.image"
      k8s_controller_logical_name: "kubernetes.controller.name"

Kubernetes Source in Attribute Mode

The following example uses attribute mode, so the field mapping values are read from the source attributes rather than from the log message.

filters:
  - name: source-detector-attribute
    type: source-detection
    source_type: "K8s"
    processing_mode: attribute
    optional: true
    field_mappings:
      k8s_namespace: "k8s_namespace"
      k8s_pod_name: "k8s_pod_name"
      k8s_container_name: "k8s_container_name"
      k8s_container_image: "k8s_container_image"

Docker

This example filter is mandatory, so logs whose source cannot be determined are not ingested. It maps the value at the docker.id path to the docker_container_id key, and the value at the docker.image path to the docker_container_image key.

filters:
  - name: source-detection-docker
    type: source-detection
    source_type: "Docker"
    optional: false
    field_mappings:
      docker_container_id: "docker.id"
      docker_container_image: "docker.image"

Custom

This example filter has mandatory source detection. It uses regex pattern field mappings to define custom keys such as namespace and serviceName.

filters:
  - name: source-detection-custom
    type: source-detection
    source_type: "Custom"
    optional: false
    processing_mode: regex
    field_mappings:
      namespace: namespace (?P<field>\w+)
      serviceName: service (?P<field>\w+)
      roleName: user_role (?P<field>\w+)
      systemType: system (?P<field>\w+)

Implementation in an Input Definition

After you define a source detection filter, you can reference it by name in the filters list of an input. In this example, the source-detection-custom filter is applied in an ed_ports input.

inputs:
  ed_ports:
    - labels: "app"
      path: "/var/log/myapps/*.log"
      filters:
      - source-detection-custom

Source Detection Filter Parameters

Required Parameters

name (Required)

The name parameter specifies the name of the filter. You use this name to refer to the filter elsewhere, for example in a workflow, processor, or input definition. Names must be unique within the filters: section. Because name starts a YAML list element, it begins with a - and a space, followed by the string. The name parameter is required for a filter.

filters:
  - name: <filter-name>

type: source-detection (Required)

The type parameter specifies the type of filter to apply. For a source detection filter, set it to source-detection. The type parameter is required for a filter.

filters:
  - name: <filter-name>
    type: <filter-type>

source_type (Required)

The source_type parameter specifies the type of source being detected, which in turn specifies the field mapping keys. It can be one of the following values:

  • Docker
  • ECS
  • File
  • K8s
  • Custom

The source_type parameter is required for a source detection filter.

filters:
  - name: <filter-name>
    type: <filter-type>
    source_type: "Docker|ECS|File|K8s|Custom"

field_mappings (Required)

The field_mappings parameter defines the fields that are used to determine the source in a source detection filter. Each field mapping is a key: value pair. The key is a predefined source definition key or a custom field. The value identifies the log field to map to that key: a Golang regex pattern, a JSON path, or an attribute key, depending on the processing mode. The field_mappings parameter is required for a source detection filter.

filters:
  - name: <filter-name>
    type: <filter-type>
    source_type: "Docker|ECS|File|K8s|Custom"
    processing_mode: regex|json|attribute
    field_mappings:
      <source-definition>: <log-field>

Optional Parameters

optional

The optional parameter specifies whether logs are still ingested when source detection fails. It takes a boolean value, true or false. You do not have to set this parameter for a source detection filter; if you omit it, the default is optional: false, which makes source detection mandatory.

filters:
  - name: <filter-name>
    type: <filter-type>
    source_type: "Docker|ECS|File|K8s|Custom"
    optional: true|false

processing_mode

The processing_mode parameter specifies the notation used to define the field mapping value. It can be one of the following values:

  • regex
  • json
  • attribute

The processing_mode parameter is optional for a source detection filter.

filters:
  - name: <filter-name>
    type: <filter-type>
    source_type: "Docker|ECS|File|K8s|Custom"
    processing_mode: regex|json|attribute