Edge Delta Source Detection Filter
The Edge Delta agent processes data at per-source granularity. For example, logs from a single input configuration for a K8s namespace come from multiple sources, one per pod-container pair. The agent is normally aware of these sources, so it calculates metrics and identifies patterns for each container of each pod. However, some inputs look like individual sources but are in fact multiple or composite sources. To handle such inputs, you can configure a source detection filter that assigns logs to sources using field mappings. Source detection filters are configured in the filters section of the agent yaml and then referenced in input definitions.
There are a few scenarios where you might need to configure a source detection filter:
- Kafka is configured to stream logs that it has collected from multiple pods. You want the agent to determine which logs belong to which pods in the existing internal K8s source definition.
- A file source contains logs from multiple K8s pods or containers across multiple namespaces. You want the agent to make calculations per pod or per workload.
- A port source ingests logs from several applications. You want the agent to consider each application to be a unique source.
Field mappings configure which logs are assigned to which sources. They are defined as key-value pairs where the key represents the existing internal source definition and the value represents a field in the log that identifies the source.
Source Detection Types
Source detection configuration depends on the source type. There are built-in source detection types that you can use, each with predefined keys for the field mappings.
|Source Type|Mandatory Field Mapping Keys|Optional Field Mapping Keys|
|Docker|docker_container_id, docker_container_image|docker_container_name, docker_image_name|
|ECS|ecs_container_id, ecs_container_image, ecs_container_name|ecs_cluster, ecs_container, ecs_task_family, ecs_task_version|
|File|file_path, file_glob_path||
|K8s|k8s_namespace, k8s_pod_name, k8s_container_name, k8s_container_image|k8s_logfile_path, k8s_controller_kind, k8s_controller_name, k8s_controller_logical_name, k8s_pod_id, k8s_docker_id|
Not defining the optional keys can cause loss of information.
Instead of defining a path for a mandatory key, you can set its value to a dash ("-") to skip that field, which decreases cardinality.
If the source is not one of these types, you can use a custom source type with any field mapping keys.
There are three processing modes you can configure for a source detection filter. The mode determines how the field mapping values are interpreted:
- regex: The field mapping value is a regex pattern with one capture group named "field".
- json: The field mapping value is a JSON path.
- attribute: Both regex and json modes consume data from the log message itself and assume that the fields you need (namespace, container, and so on) exist as strings in the log payload. In attribute mode, the field mapping value instead uses keys from the attribute provider for the parent source. For this to work, either the source must have source-level enrichments or the source detection filter must be configured to execute after enrichment filters.
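To make the difference between the two payload-based modes concrete, the following sketch maps the same custom key in regex mode and in json mode. The filter names, the service key, the sample log formats, and the app.service JSON path are all hypothetical; only the parameter names come from this page.

```yaml
filters:
  # regex mode: the value is a pattern with one capture group named "field".
  # Assumes log lines such as: level=info service=checkout msg="order placed"
  - name: detect-service-regex   # hypothetical filter name
    type: source-detection
    source_type: "Custom"
    processing_mode: regex
    field_mappings:
      service: 'service=(?P<field>\w+)'
  # json mode: the value is a JSON path into the log payload.
  # Assumes log lines such as: {"app": {"service": "checkout"}, "msg": "order placed"}
  - name: detect-service-json    # hypothetical filter name
    type: source-detection
    source_type: "Custom"
    processing_mode: json
    field_mappings:
      service: "app.service"
```

Under these assumptions, each filter would be expected to extract checkout as the value of the service key from its sample line.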
Optional Source Detection
You can configure source detection to be optional or mandatory. If it is optional (optional: true), logs where source detection failed are still ingested. If it is mandatory (optional: false), which is the default setting, logs are ingested only if the source can be determined.
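As a sketch of optional detection, the filter below uses the File source type and keeps logs even when no file path can be resolved. The filter name and the log.file.path JSON path are hypothetical, and file_glob_path is skipped with a dash rather than mapped.

```yaml
filters:
  - name: source-detection-file-optional   # hypothetical filter name
    type: source-detection
    source_type: "File"
    optional: true                 # logs that fail detection are still ingested
    processing_mode: json
    field_mappings:
      file_path: "log.file.path"   # hypothetical JSON path in the log payload
      file_glob_path: "-"          # dash skips this mandatory key
```

Setting optional: false, or omitting the parameter, would instead mean that logs are ingested only when the source can be determined.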
Source Detection Filter Examples
This example filter uses K8s field mappings. It is optional and it uses JSON paths to define the field mapping values. In this example, the value of the kubernetes.namespace field in each log is mapped to the k8s_namespace key. The k8s_pod_name key is skipped to reduce cardinality, and there are field mappings for other K8s keys.
    filters:
      - name: source-detection-k8s
        type: source-detection
        source_type: "K8s"
        optional: true
        processing_mode: json
        field_mappings:
          k8s_namespace: "kubernetes.namespace"
          k8s_pod_name: "-"
          k8s_container_name: "kubernetes.container.name"
          k8s_container_image: "kubernetes.container.image"
          k8s_controller_logical_name: "kubernetes.controller.name"
Kubernetes Source in Attribute Mode
The following example uses attribute mode to derive the keys from the source attributes rather than the log message.
    filters:
      - name: source-detector-attribute
        type: source-detection
        source_type: "K8s"
        processing_mode: attribute
        optional: true
        field_mappings:
          k8s_namespace: "k8s_namespace"
          k8s_pod_name: "k8s_pod_name"
          k8s_container_name: "k8s_container_name"
          k8s_container_image: "k8s_container_image"
This example filter is mandatory, so any logs for which the source cannot be determined are not ingested. It maps the value in the docker.id path to the docker_container_id key, and the docker.image value to the docker_container_image key.
    filters:
      - name: source-detection-docker
        type: source-detection
        source_type: "Docker"
        optional: false
        field_mappings:
          docker_container_id: "docker.id"
          docker_container_image: "docker.image"
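The other built-in source types follow the same shape. The sketch below shows what an ECS filter might look like. The filter name and the JSON paths on the right-hand side are hypothetical and depend on how your logs are structured, while the keys on the left come from the source type table.

```yaml
filters:
  - name: source-detection-ecs   # hypothetical filter name
    type: source-detection
    source_type: "ECS"
    optional: false
    processing_mode: json
    field_mappings:
      # Mandatory ECS keys (the JSON paths are hypothetical)
      ecs_container_id: "ecs.container.id"
      ecs_container_image: "ecs.container.image"
      ecs_container_name: "ecs.container.name"
      # Optional ECS keys
      ecs_cluster: "ecs.cluster"
      ecs_task_family: "ecs.task.family"
```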
This example filter has mandatory source detection with regex pattern field mappings that define custom keys such as namespace and serviceName.
    filters:
      - name: source-detection-custom
        type: source-detection
        source_type: "Custom"
        optional: false
        processing_mode: regex
        field_mappings:
          namespace: namespace (?P<field>\w+)
          serviceName: service (?P<field>\w+)
          roleName: user_role (?P<field>\w+)
          systemType: system (?P<field>\w+)
Implementation in an Input Definition
After defining a source detection filter, you can reference it in an input definition. In this example, the source-detection-custom filter is applied in an ed_ports input.
    inputs:
      ed_ports:
        - labels: "app"
          path: "/var/log/myapps/*.log"
          filters:
            - source-detection-custom
Source Detection Filter Parameters
The name parameter specifies the name for the filter. You use this name to refer to the filter elsewhere, for example in a workflow or processor. Names must be unique within the filters: section. It is a yaml list element, so it begins with a - and a space followed by the string. A name is a required parameter for a filter.

    filters:
      - name: <filter-name>
type: source-detection (Required)
The type parameter in the filter context specifies the type of filter to apply. A type is a required parameter for a filter.

    filters:
      - name: <filter-name>
        type: <filter-type>
The source_type parameter specifies the type of source being detected, which in turn specifies the field mapping keys. It can be one of the following values: Docker, ECS, File, K8s, or Custom. The source_type parameter is required for a source detection filter.

    filters:
      - name: <filter-name>
        type: <filter-type>
        source_type: "Docker|ECS|File|K8s|Custom"
The field_mappings parameter defines the fields that are used to determine the source in a source detection filter. All field_mappings are key: value pairs where the key is an existing source definition key or a custom field, and the value is a regex pattern, JSON path, or attribute key (depending on the processing mode) that identifies the log field to map to the key. A field_mappings parameter is required for a source detection filter.

    filters:
      - name: <filter-name>
        type: <filter-type>
        source_type: "Docker|ECS|File|K8s|Custom"
        processing_mode: regex|json|attribute
        field_mappings:
          <source-definition>: <log-field>
The optional parameter specifies whether source detection is mandatory for ingesting the log. It is specified as a boolean value, true or false. The optional parameter is not mandatory for a source detection filter, and optional: false is the default behaviour.

    filters:
      - name: <filter-name>
        type: <filter-type>
        source_type: "Docker|ECS|File|K8s|Custom"
        optional: true|false
The processing_mode parameter specifies the notation used to define the field mapping value. It can be one of the following values: regex, json, or attribute. The processing_mode parameter is optional for a source detection filter.

    filters:
      - name: <filter-name>
        type: <filter-type>
        source_type: "Docker|ECS|File|K8s|Custom"
        processing_mode: regex|json|attribute