Edge Delta Source Detection Filter
Overview
The Edge Delta agent processes data at a per-source level of granularity. For example, logs from a single input configuration for a K8s namespace come from multiple sources, one per pod-container pair. The agent is normally aware of these sources, and it calculates metrics and identifies patterns per pod-container pair. However, some inputs look like individual sources but are in fact multiple or composite sources. To avoid this issue, you can configure a source detection filter to assign logs to sources using field mappings. Source detection filters are configured in the filters section of the agent YAML and then referenced in input definitions.
Example Scenarios
There are a few scenarios where you might need to configure a source detection filter:
- Kafka is configured to stream logs that it has collected from multiple pods. You want the agent to determine which logs belong to which pods in the existing internal K8s source definition.
- A file source contains logs from multiple K8s pods or containers in multiple namespaces. You want the agent to make calculations per pod or per workload.
- A port source ingests logs from several applications. You want the agent to consider each application to be a unique source.
Field Mappings
Field mappings configure which logs are assigned to which sources. They are defined as key-value pairs where the key represents the existing internal source definition, and the value points to a field in the log that identifies the source.
Source Detection Types
Source detection configuration depends on the source type. There are built-in source detection types that you can use, each with predefined keys for the field mappings.
Source Type | Mandatory Keys | Optional Keys |
---|---|---|
Docker | docker_container_id, docker_container_image | docker_container_name, docker_image_name |
ECS | ecs_container_id, ecs_container_image, ecs_container_name | ecs_cluster, ecs_container, ecs_task_family, ecs_task_version |
File | file_path, file_glob_path | (none) |
K8s | k8s_namespace, k8s_pod_name, k8s_container_name, k8s_container_image | k8s_logfile_path, k8s_controller_kind, k8s_controller_name, k8s_controller_logical_name, k8s_pod_id, k8s_docker_id |
Not defining the optional keys can cause loss of information.
Instead of defining a path for a mandatory key, you can set its value to a dash ("-") to skip the field and decrease cardinality.
If the source is not one of these types, you can use a custom source type with any field mapping keys.
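The key requirements in the table can be checked before a configuration is deployed. The following is a minimal Python sketch of such a pre-flight check; the helper name missing_mandatory_keys is hypothetical, and whether the agent itself validates mappings this way is an assumption. The key sets are copied from the table.

```python
# Mandatory keys per built-in source type, copied from the table above.
MANDATORY_KEYS = {
    "Docker": {"docker_container_id", "docker_container_image"},
    "ECS": {"ecs_container_id", "ecs_container_image", "ecs_container_name"},
    "File": {"file_path", "file_glob_path"},
    "K8s": {
        "k8s_namespace",
        "k8s_pod_name",
        "k8s_container_name",
        "k8s_container_image",
    },
}


def missing_mandatory_keys(source_type, field_mappings):
    """Return the sorted mandatory keys that the mapping does not define.

    A key mapped to "-" counts as defined, because it is skipped on purpose.
    Custom (or unknown) source types accept any keys, so nothing is missing.
    """
    required = MANDATORY_KEYS.get(source_type, set())
    return sorted(required - set(field_mappings))


# Hypothetical, incomplete K8s mapping used only for illustration.
partial = {"k8s_namespace": "kubernetes.namespace", "k8s_pod_name": "-"}
print(missing_mandatory_keys("K8s", partial))
```

The check only looks at which keys are present, so a dash value passes it in the same way as a real path.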
Processing Modes
There are three processing modes you can configure for a source detection filter. The mode determines how the field mapping values are interpreted:
regex
- The field mapping value is a regex pattern with one capture group named “field”.
json
- The field mapping value is a JSON path.
attribute
- The field mapping value is a key from the attribute provider for the parent source. Both regex and json modes consume data from the log message itself, and assume that the fields you need (for example namespace, container, etc.) exist as strings in the log payload. In attribute mode, the values come from the source attributes instead. For this to work, either the source should have source-level enrichments, or the source detection filter should be configured to execute after enrichment filters.
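The difference between the three modes can be sketched in Python as three small lookup functions. This is an illustration, not the agent's implementation: the function names are hypothetical, the dotted-path lookup is an assumption about how JSON paths such as kubernetes.namespace resolve, and the sample inputs are invented. The named-group syntax (?P<field>...) is valid in both Go and Python regular expressions.

```python
import json
import re


def extract_regex(pattern, message):
    """regex mode: return the text captured by the group named "field", or None."""
    match = re.search(pattern, message)
    return match.group("field") if match else None


def extract_json(path, message):
    """json mode: walk a dotted path through a JSON log body, or return None."""
    try:
        node = json.loads(message)
    except json.JSONDecodeError:
        return None
    for part in path.split("."):
        if not isinstance(node, dict) or part not in node:
            return None
        node = node[part]
    return node


def extract_attribute(key, attributes):
    """attribute mode: read the value from source attributes, not the message."""
    return attributes.get(key)


# All three calls resolve the same hypothetical namespace value.
print(extract_regex(r"namespace (?P<field>\w+)", "namespace payments service api"))
print(extract_json("kubernetes.namespace", '{"kubernetes": {"namespace": "payments"}}'))
print(extract_attribute("k8s_namespace", {"k8s_namespace": "payments"}))
```

The first two functions only ever see the log message, which is why regex and json modes need the identifying fields to be present in the payload. The third never reads the message at all.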
Optional Source Detection
You can configure source detection to be optional or mandatory. If it is optional (optional: true), logs for which source detection fails are still ingested. If source detection is mandatory (optional: false), which is the default setting, logs are ingested only if the source can be determined.
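The ingestion rule described above can be written as a small decision function. This Python sketch is illustrative only: the function name ingest and the returned dictionary shape are hypothetical, and what source the agent assigns to a log whose detection failed under optional: true is not stated here, so None is used as a placeholder.

```python
def ingest(log, detected_source, optional=False):
    """Decide whether a log is kept, given the result of source detection.

    detected_source is None when detection failed. optional defaults to
    False, matching the documented default of mandatory detection.
    """
    if detected_source is not None:
        return {"log": log, "source": detected_source}
    if optional:
        # Detection failed, but the filter is optional: keep the log.
        # The None source is a placeholder assumption for this sketch.
        return {"log": log, "source": None}
    # Detection failed and the filter is mandatory: drop the log.
    return None


print(ingest("unparsed line", None, optional=True))
print(ingest("unparsed line", None))
```

With the default setting, the second call returns None, which models the log being dropped.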
Source Detection Filter Examples
Kubernetes
This example filter uses K8s field mappings. It is optional and it uses JSON paths to define field mapping values. In this example, the value at the kubernetes.namespace path in each log is mapped to the k8s_namespace key. The k8s_pod_name key is skipped to reduce cardinality, and there are field mappings for other K8s keys.
filters:
  - name: source-detection-k8s
    type: source-detection
    source_type: "K8s"
    optional: true
    processing_mode: json
    field_mappings:
      k8s_namespace: "kubernetes.namespace"
      k8s_pod_name: "-"
      k8s_container_name: "kubernetes.container.name"
      k8s_container_image: "kubernetes.container.image"
      k8s_controller_logical_name: "kubernetes.controller.name"
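To see why skipping k8s_pod_name reduces cardinality, the mappings from this filter can be applied to sample logs in Python. This is a sketch under assumptions: the log structure, pod names, and image name are invented, the helper names are hypothetical, and treating the JSON path as a dotted lookup through nested objects is an assumption.

```python
import json

# Field mappings copied from the example filter above.
FIELD_MAPPINGS = {
    "k8s_namespace": "kubernetes.namespace",
    "k8s_pod_name": "-",
    "k8s_container_name": "kubernetes.container.name",
    "k8s_container_image": "kubernetes.container.image",
    "k8s_controller_logical_name": "kubernetes.controller.name",
}


def lookup(document, dotted_path):
    """Follow a dotted path through nested dicts; return None if a part is missing."""
    node = document
    for part in dotted_path.split("."):
        if not isinstance(node, dict) or part not in node:
            return None
        node = node[part]
    return node


def detect_source(raw_log, mappings=FIELD_MAPPINGS):
    """Build the source identity for one JSON log line, skipping "-" mappings."""
    document = json.loads(raw_log)
    return {
        key: lookup(document, path)
        for key, path in mappings.items()
        if path != "-"
    }


def sample_log(pod_name):
    """Hypothetical log line, invented for illustration."""
    return json.dumps({
        "kubernetes": {
            "namespace": "payments",
            "pod": {"name": pod_name},
            "container": {"name": "api", "image": "registry.example/api:1.4"},
            "controller": {"name": "api"},
        },
        "message": "request served",
    })


first = detect_source(sample_log("api-7d9f-abcde"))
second = detect_source(sample_log("api-7d9f-fghij"))
print(first == second)  # both pods resolve to the same source identity
```

Because the pod name never enters the source identity, logs from every replica of the workload are grouped under one source rather than one source per pod.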
Kubernetes Source in Attribute Mode
The following example uses attribute mode to derive the keys from the source attributes rather than the log message.
filters:
  - name: source-detector-attribute
    type: source-detection
    source_type: "K8s"
    processing_mode: attribute
    optional: true
    field_mappings:
      k8s_namespace: "k8s_namespace"
      k8s_pod_name: "k8s_pod_name"
      k8s_container_name: "k8s_container_name"
      k8s_container_image: "k8s_container_image"
Docker
This example filter is mandatory, so any logs whose source cannot be determined are not ingested. It maps the value at the docker.id path to the docker_container_id key, and the value at the docker.image path to the docker_container_image key.
filters:
  - name: source-detection-docker
    type: source-detection
    source_type: "Docker"
    optional: false
    field_mappings:
      docker_container_id: "docker.id"
      docker_container_image: "docker.image"
Custom
This example filter has mandatory source detection with regex pattern field mappings that define custom keys such as namespace, serviceName etc.
filters:
  - name: source-detection-custom
    type: source-detection
    source_type: "Custom"
    optional: false
    processing_mode: regex
    field_mappings:
      namespace: namespace (?P<field>\w+)
      serviceName: service (?P<field>\w+)
      roleName: user_role (?P<field>\w+)
      systemType: system (?P<field>\w+)
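The regex mappings in this filter can be tried against a sample line in Python, whose named-group syntax matches the Go syntax used here. This is a sketch under assumptions: the log line is invented, the function name is hypothetical, and treating detection as failed when any single pattern does not match is an assumption about the agent's behavior.

```python
import re

# Patterns copied from the example filter above.
FIELD_MAPPINGS = {
    "namespace": r"namespace (?P<field>\w+)",
    "serviceName": r"service (?P<field>\w+)",
    "roleName": r"user_role (?P<field>\w+)",
    "systemType": r"system (?P<field>\w+)",
}


def detect_custom_source(message, mappings=FIELD_MAPPINGS):
    """Return key -> captured value, or None if any pattern fails to match.

    Failing on the first missing field is an assumption made for this sketch.
    """
    source = {}
    for key, pattern in mappings.items():
        match = re.search(pattern, message)
        if match is None:
            return None
        source[key] = match.group("field")
    return source


# Hypothetical log line, invented for illustration.
line = "namespace billing service invoices user_role admin system linux msg=started"
print(detect_custom_source(line))
print(detect_custom_source("namespace billing"))
```

Each pattern contributes exactly one value through its “field” capture group, so the combination of the four captured values identifies the custom source.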
Implementation in an Input Definition
After you define a source detection filter, you reference it by name in an input definition. In this example, the source-detection-custom filter is applied in an ed_ports input.
inputs:
  ed_ports:
    - labels: "app"
      path: "/var/log/myapps/*.log"
      filters:
        - source-detection-custom
Source Detection Filter Parameters
Required Parameters
name (Required)
The name parameter specifies the name for the filter. You refer to this name in other places, for example to refer to a specific filter in a workflow or processor. Names must be unique within the filters: section. It is a YAML list element, so it begins with a - and a space followed by the string. A name is a required parameter for a filter.
filters:
  - name: <filter-name>
type: source-detection (Required)
The type parameter in the filter context specifies the type of filter to apply. A type is a required parameter for a filter.
filters:
  - name: <filter-name>
    type: <filter-type>
source_type (Required)
The source_type parameter specifies the type of source being detected, which in turn specifies the field mapping keys. It can be one of the following values:
Docker
ECS
File
K8s
Custom
The source_type parameter is required for a source detection filter.
filters:
  - name: <filter-name>
    type: <filter-type>
    source_type: "Docker|ECS|File|K8s|Custom"
field_mappings (Required)
The field_mappings parameter defines the fields that are used to determine the source in a source detection filter. All field_mappings are key: value pairs where the key defines the existing source definition or a custom field, while the value is a Golang regex pattern, a JSON path, or (in attribute mode) an attribute key for the log field to be mapped to the key. A field_mappings parameter is required for a source detection filter.
filters:
  - name: <filter-name>
    type: <filter-type>
    source_type: "Docker|ECS|File|K8s|Custom"
    processing_mode: regex|json|attribute
    field_mappings:
      <source-definition>: <log-field>
Optional Parameters
optional
The optional parameter specifies whether source detection is mandatory for ingesting the log. It is specified as a boolean value, true or false. The optional parameter is not mandatory for a source detection filter, and optional: false is the default behavior.
filters:
  - name: <filter-name>
    type: <filter-type>
    source_type: "Docker|ECS|File|K8s|Custom"
    optional: true|false
processing_mode
The processing_mode parameter specifies the notation used to define the field mapping value. It can be one of the following values:
regex
json
attribute
The processing_mode parameter is optional for a source detection filter.
filters:
  - name: <filter-name>
    type: <filter-type>
    source_type: "Docker|ECS|File|K8s|Custom"
    processing_mode: regex|json|attribute