Edge Delta Sample Processor
Overview
The Sample Processor node filters Logs and Traces using consistent probabilistic sampling. It lets a specified percentage of items pass through, bases the decision on the values of selected fields, and provides additional options for tuning how items are sampled.
Note: The Sample Processor node applies sampling to Logs and Traces only, and passes through all other data types.
- incoming_data_types: archive, cluster_pattern_and_sample, custom, datadog_payload, diagnostic, health, heartbeat, log, metric, signal, source, source_samples, splunk_payload, trace
- outgoing_data_types: archive, cluster_pattern_and_sample, custom, datadog_payload, diagnostic, health, heartbeat, log, metric, signal, source, source_samples, splunk_payload, trace
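The idea of consistent probabilistic sampling described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Edge Delta's implementation: the item layout (`type` and `key` entries), the function names `keep` and `sample`, and the choice of SHA-256 are all hypothetical. The sketch shows only the general technique, in which a decision derived from a hash of the item's key is repeatable for that key and approximates the configured percentage across many keys.

```python
import hashlib


def keep(key: str, percentage: int) -> bool:
    """Deterministically decide whether an item with this key passes."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Map the first 8 bytes of the hash to a bucket in 0..99.
    bucket = int.from_bytes(digest[:8], "big") % 100
    return bucket < percentage


def sample(item: dict, percentage: int) -> bool:
    """Return True if the item should pass through the node."""
    # Assumption: items carry a hypothetical "type" entry.
    # Only logs and traces are sampled; other data types pass through.
    if item.get("type") not in ("log", "trace"):
        return True
    return keep(item["key"], percentage)


if __name__ == "__main__":
    kept = sum(keep(f"trace-{i}", 10) for i in range(10_000))
    print(f"kept {kept} of 10000 keys at percentage=10")
```

Because the decision depends only on the key, the same key always gets the same outcome, which is what makes the sampling consistent rather than random per item.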
Example Configuration
nodes:
- name: sampler
  type: sample
  percentage: 10
  field_paths:
  - item["attributes"]["foo"]
  pass_through_on_failure: true
  priority_field: item["attributes"]["priority"]
  timestamp_granularity: "1s"
Required Parameters
name
A descriptive name for the node. This name appears in Visual Pipelines, and you can use it to reference the node in the YAML. It must be unique across all nodes. It is a YAML list element, so it begins with a - and a space followed by the string. It is a required parameter for all nodes.
nodes:
- name: <node name>
  type: <node type>
type: sample
The type parameter specifies the type of node being configured. It is specified as a string from a closed list of node types. It is a required parameter.
nodes:
- name: <node name>
  type: <node type>
percentage
This parameter specifies the percentage of items that are allowed to pass through the node; the remaining items are dropped. It is a required integer parameter.
nodes:
- name: sampler
  type: sample
  percentage: 10
  pass_through_on_failure: true
pass_through_on_failure
This boolean parameter determines whether items pass through when an error occurs while the sampling decision is being evaluated. It is required and defaults to true.
nodes:
- name: sampler
  type: sample
  percentage: 10
  pass_through_on_failure: true
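The effect of pass_through_on_failure can be pictured with a small Python sketch. It is a hedged illustration, not the node's actual code: `evaluate` and `decide` are hypothetical names, the item layout is assumed, and a missing field stands in for whatever errors the real evaluation might raise.

```python
import zlib


def evaluate(item: dict, percentage: int) -> bool:
    """Hypothetical evaluator; raises KeyError if the sampled field is missing."""
    value = item["attributes"]["foo"]
    # CRC32 is used here only as a cheap, deterministic hash for the sketch.
    return zlib.crc32(str(value).encode("utf-8")) % 100 < percentage


def decide(item: dict, percentage: int, pass_through_on_failure: bool = True) -> bool:
    """Return True if the item should pass through."""
    try:
        return evaluate(item, percentage)
    except (KeyError, TypeError):
        # Evaluation failed: fall back to the configured behavior.
        return pass_through_on_failure
```

With the flag set to true, an item that cannot be evaluated is kept rather than silently dropped; setting it to false inverts that choice.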
Optional Parameters
field_paths
Lists the paths of the fields used to make the sampling decision. If not specified, traces are sampled by trace ID, and logs are sampled by timestamp, service name, and body.
nodes:
- name: sampler
  type: sample
  percentage: 10
  pass_through_on_failure: true
  field_paths:
  - item["attributes"]["foo"]
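To make the role of field_paths concrete, the following Python sketch derives a sampling key either from the configured paths or from the defaults described above. Everything here is an assumption made for illustration: the item layout (`type`, `trace_id`, `timestamp`, `service`, `body`, `attributes`), the representation of a path as a tuple of keys rather than an `item[...]` expression, and the helper names `lookup`, `sampling_key`, and `keep`.

```python
import hashlib


def lookup(item: dict, path: tuple):
    """Follow a path such as ("attributes", "foo") into a nested dict."""
    value = item
    for part in path:
        value = value[part]
    return value


def sampling_key(item: dict, field_paths=None) -> str:
    """Build the string that the sampling decision is derived from."""
    if field_paths:
        values = [lookup(item, path) for path in field_paths]
    elif item["type"] == "trace":
        values = [item["trace_id"]]  # default for traces
    else:
        # default for logs: timestamp, service name, and body
        values = [item["timestamp"], item["service"], item["body"]]
    return "|".join(str(v) for v in values)


def keep(item: dict, percentage: int, field_paths=None) -> bool:
    """Hash the key into a bucket in 0..99 and compare with the percentage."""
    key = sampling_key(item, field_paths)
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % 100 < percentage
```

In this sketch, two logs that share the same value at `attributes.foo` always receive the same decision when that path is configured, even if their timestamps and bodies differ.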
priority_field
Defines a field that, when present with a value, overrides the default sampling percentage. This parameter is optional.
nodes:
- name: sampler
  type: sample
  percentage: 10
  pass_through_on_failure: true
  priority_field: item["attributes"]["priority"]
timestamp_granularity
This duration parameter specifies the granularity of timestamps when sampling by timestamp. The minimum allowed granularity is 1 millisecond. It is an optional parameter.
nodes:
- name: sampler
  type: sample
  percentage: 10
  pass_through_on_failure: true
  timestamp_granularity: "1s"
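One plausible reading of timestamp granularity is that timestamps are rounded down to the configured step before they contribute to the sampling decision, so that items within the same step share a timestamp component. The Python sketch below illustrates that reading only; it is an assumption, not documented behavior, and the function name `truncate` is hypothetical. Parsing a duration string such as "1s" is left out, and the granularity is passed as a `timedelta`.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
ONE_MICROSECOND = timedelta(microseconds=1)


def truncate(ts: datetime, granularity: timedelta) -> datetime:
    """Round a timezone-aware timestamp down to a multiple of granularity."""
    if granularity < timedelta(milliseconds=1):
        raise ValueError("granularity must be at least 1 millisecond")
    step_us = granularity // ONE_MICROSECOND       # step in microseconds
    elapsed_us = (ts - EPOCH) // ONE_MICROSECOND   # time since epoch
    return EPOCH + timedelta(microseconds=elapsed_us - elapsed_us % step_us)


if __name__ == "__main__":
    a = datetime(2024, 1, 1, 12, 0, 0, 100_000, tzinfo=timezone.utc)
    b = datetime(2024, 1, 1, 12, 0, 0, 900_000, tzinfo=timezone.utc)
    # At one-second granularity both timestamps fall into the same step.
    print(truncate(a, timedelta(seconds=1)) == truncate(b, timedelta(seconds=1)))
```

Under this reading, a coarser granularity groups more items into the same step, while a finer one treats items that are close in time as distinct.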