Edge Delta Tail Sample Processor
Overview
The Tail Sample processor enables you to filter and manage traces and spans by applying various sampling policies.
It is best used in gateway pipelines. In distributed systems where traces are collected from multiple edge clusters, spans belonging to the same parent trace can originate from different edges. Each edge may contribute spans for the components or microservices involved in a particular transaction or operation. Deploying the Tail Sample processor on a gateway pipeline aggregates these spans in one place, which gives a complete view of each trace and applies sampling policies uniformly.
For detailed instructions on how to use multiprocessors, see Use Multiprocessors.
Configuration

nodes:
  - name: tail_sampling
    type: tail_sample
    sampling_policies:
      - name: 1s latency or higher
        policy_type: latency
        lower_threshold: 1s
Options
Select a telemetry type
Trace is selected by default.
Condition
The condition parameter contains the conditional phrase of an OTTL statement. It restricts the processor to data items that meet the condition. Data items that do not match the condition are passed through without processing. The parameter is optional. You configure it in the interface, which generates the OTTL condition. You can select one of the following operators:
Operator | Name | Description | Example |
---|---|---|---|
== | Equal to | Returns true if both values are exactly the same | attributes["status"] == "OK" |
!= | Not equal to | Returns true if the values are not the same | attributes["level"] != "debug" |
> | Greater than | Returns true if the left value is greater than the right | attributes["duration_ms"] > 1000 |
>= | Greater than or equal | Returns true if the left value is greater than or equal to the right | attributes["score"] >= 90 |
< | Less than | Returns true if the left value is less than the right | attributes["load"] < 0.75 |
<= | Less than or equal | Returns true if the left value is less than or equal to the right | attributes["retries"] <= 3 |
matches | Regex match | Returns true if the string matches a regular expression | isMatch(attributes["name"], ".*\\.name$") |
It is defined in YAML as follows:
- name: _multiprocessor
  type: sequence
  processors:
    - type: <processor type>
      condition: attributes["request"]["path"] == "/json/view"
Sampling Policies
You define one or more policies, any of which can trigger sampling (OR logic). Enter a Policy Name to distinguish each policy from the others, then choose a Policy Type. Each type targets specific sampling criteria and conditions:
- name: tail_sampling
  type: tail_sample
  sampling_policies:
    - name: policy_1
      ...
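The OR semantics described above can be sketched in a few lines of Python. This is an illustration only, not Edge Delta's implementation: a trace is modelled as a list of span dictionaries, and the policy functions `has_error_span` and `has_many_spans` are hypothetical stand-ins for configured policies.

```python
from typing import Callable, Dict, List

Span = Dict[str, object]
Trace = List[Span]
Policy = Callable[[Trace], bool]


def has_error_span(trace: Trace) -> bool:
    """Hypothetical policy: matches if any span reports an ERROR status."""
    return any(span.get("status") == "ERROR" for span in trace)


def has_many_spans(trace: Trace) -> bool:
    """Hypothetical policy: matches traces with at least 3 spans."""
    return len(trace) >= 3


def should_sample(trace: Trace, policies: List[Policy]) -> bool:
    """OR logic: the trace is sampled as soon as any one policy matches."""
    return any(policy(trace) for policy in policies)


policies = [has_error_span, has_many_spans]
ok_trace = [{"status": "OK"}]
error_trace = [{"status": "OK"}, {"status": "ERROR"}]
```

With these definitions, `should_sample(error_trace, policies)` is true because one policy matches, while `should_sample(ok_trace, policies)` is false because none do.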
Probabilistic
- Percentage: Specifies the probability of a trace being sampled. A value of 100 means that all traces meeting the criteria are sampled, while 0 means none are sampled.
- Hash Salt: A string used to calculate the hash for sampling evaluation.
- name: policy_1
  policy_type: probabilistic
  hash_salt: some_salt
  percentage: 50
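One way to picture how a percentage and a hash salt can produce a repeatable decision is the sketch below. It is an assumption-laden illustration: the hash function the processor actually uses is not stated here, so SHA-256 stands in for it, and the function name `probabilistic_keep` is hypothetical.

```python
import hashlib


def probabilistic_keep(trace_id: str, hash_salt: str, percentage: float) -> bool:
    """Deterministically keep roughly `percentage` percent of trace IDs.

    Assumption: SHA-256 stands in for whatever hash the processor uses.
    """
    digest = hashlib.sha256((hash_salt + trace_id).encode("utf-8")).digest()
    # Map the first 8 bytes of the digest to a bucket in [0, 10000).
    bucket = int.from_bytes(digest[:8], "big") % 10_000
    # percentage=50 keeps buckets 0..4999, percentage=100 keeps all of them.
    return bucket < percentage * 100
```

In this sketch the decision depends only on the salt and the trace ID, so every span of a trace receives the same answer, and changing the salt selects a different subset of traces.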
Latency
- Lower Threshold: Defines the minimum latency duration a trace must have to be considered for sampling.
- Upper Threshold: Sets the maximum latency duration for trace eligibility.
- name: policy_2
  policy_type: latency
  lower_threshold: 10s
  upper_threshold: 30s
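The threshold check can be sketched as follows. Two assumptions are made that the text above does not confirm: trace latency is taken as the time from the earliest span start to the latest span end, and both thresholds are treated as inclusive. The helper names are hypothetical, and only simple duration suffixes are parsed.

```python
from typing import List, Optional, Tuple

# Seconds per unit; only these simple suffixes are handled in this sketch.
_UNITS = {"ms": 0.001, "s": 1.0, "m": 60.0, "h": 3600.0}


def parse_duration(text: str) -> float:
    """Convert a simple duration such as "10s" or "250ms" to seconds."""
    for suffix in ("ms", "s", "m", "h"):  # "ms" is checked before "s" and "m"
        if text.endswith(suffix):
            return float(text[: -len(suffix)]) * _UNITS[suffix]
    raise ValueError(f"unsupported duration: {text!r}")


def trace_latency(spans: List[Tuple[float, float]]) -> float:
    """Latency of a trace, given (start, end) times in seconds for each span."""
    return max(end for _, end in spans) - min(start for start, _ in spans)


def latency_matches(
    spans: List[Tuple[float, float]],
    lower_threshold: str,
    upper_threshold: Optional[str] = None,
) -> bool:
    """Assumed semantics: lower <= latency <= upper, with upper optional."""
    latency = trace_latency(spans)
    if latency < parse_duration(lower_threshold):
        return False
    if upper_threshold is not None and latency > parse_duration(upper_threshold):
        return False
    return True
```

For example, two spans covering 0s to 4s and 3s to 12s give a trace latency of 12 seconds, which falls inside the 10s to 30s window of `policy_2`.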
Status Code
- Status Codes: A list of acceptable status codes. A trace is sampled if it contains at least one span with a status code from this list. Common status codes include OK, ERROR, and UNSET.
- name: policy_3
  policy_type: status_code
  status_codes:
    - OK
    - ERROR
Span Count
- Minimum Span Count: The minimum number of spans a trace must contain to be eligible for sampling.
- Maximum Span Count: The maximum allowable number of spans in a trace for it to qualify.
- name: policy_4
  policy_type: span_count
  min_span_count: 10
  max_span_count: 20
Condition
- Conditions: Logical conditions that a trace must satisfy for it to be sampled. One span meeting this condition is sufficient to sample the entire trace.
As with the processor-level condition, you can select a field path, a value, and one of the following operators:
Operator | Name | Description | Example |
---|---|---|---|
== | Equal to | Returns true if both values are exactly the same | attributes["status"] == "OK" |
!= | Not equal to | Returns true if the values are not the same | attributes["level"] != "debug" |
> | Greater than | Returns true if the left value is greater than the right | attributes["duration_ms"] > 1000 |
>= | Greater than or equal | Returns true if the left value is greater than or equal to the right | attributes["score"] >= 90 |
< | Less than | Returns true if the left value is less than the right | attributes["load"] < 0.75 |
<= | Less than or equal | Returns true if the left value is less than or equal to the right | attributes["retries"] <= 3 |
matches | Regex match | Returns true if the string matches a regular expression | isMatch(attributes["name"], ".*\\.name$") |
- name: policy_5
  policy_type: condition
  conditions:
    - attributes["foo"] == "bar"
    - resource["service.name"] == "test-service"
Numeric Attribute
- Attribute Key: The key associated with a numerical value to evaluate.
- Minimum Value: The smallest acceptable value for this attribute.
- Maximum Value: The highest value for this attribute, defining a range.
- name: policy_6
  policy_type: numeric_attribute
  key: int_attribute
  min_value: 5
  max_value: 10
String Attribute
- Attribute Key: The key linked to a string value.
- Values: List of valid string values a trace may hold to qualify for sampling.
- Support Regex: Enables regex matching against listed values for more flexible criteria.
- Regex Cache Size: The size of the cache used to store regex evaluation results.
- name: policy_7
  policy_type: string_attribute
  key: some_attribute
  values:
    - please
    - match
    - this
- name: policy_8
  policy_type: string_attribute
  key: some_other_attribute
  support_regex: true
  regex_cache_size: 200
  values:
    - .*foo.*
    - .*bar.*
Boolean Attribute
Samples traces based on the value of a boolean attribute:
- Attribute Key: The key path associated with a boolean value.
- Value: The boolean value to match for sampling.
- name: policy_9
  policy_type: boolean_attribute
  key: bool_attribute
  value: true
And
This option nests policies under the logical operator AND. You define a collection of sub-policies, all of which must be satisfied for a trace to be sampled.
- name: policy_10
  policy_type: and
  sub_policies:
    - name: sub_policy_1
      policy_type: probabilistic
      percentage: 20
    - name: sub_policy_2
      policy_type: string_attribute
      key: key_a
      values:
        - a
        - b
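The nesting can be pictured as a policy built from other policies. The sketch below is illustrative only: `and_policy`, `attribute_in`, and `min_spans` are hypothetical helpers, and a deterministic span-count check replaces the probabilistic sub-policy so that the outcome is easy to follow.

```python
from typing import Callable, Dict, List, Set

Trace = List[Dict[str, object]]
Policy = Callable[[Trace], bool]


def and_policy(sub_policies: List[Policy]) -> Policy:
    """Combine sub-policies: the result matches only if all of them match."""
    def evaluate(trace: Trace) -> bool:
        return all(policy(trace) for policy in sub_policies)
    return evaluate


def attribute_in(key: str, allowed: Set[str]) -> Policy:
    """Hypothetical sub-policy: some span has `key` set to an allowed value."""
    def evaluate(trace: Trace) -> bool:
        return any(span.get(key) in allowed for span in trace)
    return evaluate


def min_spans(count: int) -> Policy:
    """Hypothetical sub-policy: the trace has at least `count` spans."""
    def evaluate(trace: Trace) -> bool:
        return len(trace) >= count
    return evaluate


combined = and_policy([min_spans(2), attribute_in("key_a", {"a", "b"})])
```

Here `combined` matches a two-span trace in which one span has `key_a` set to `a`, but rejects a single-span trace with the same attribute because the span-count sub-policy fails.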
Drop
Similar to And, but used to identify traces that are to be explicitly excluded.
- name: policy_11
  policy_type: drop
  sub_policies:
    - name: sub_policy_1
      policy_type: latency
      lower_threshold: 20s
      upper_threshold: 40s
    - name: sub_policy_2
      policy_type: condition
      conditions:
        - resource["service.name"] == "some-service"
Advanced Settings
Decision Interval
The interval after which the processor decides whether spans with the same trace ID are sampled. The timer starts when the processor observes the first trace item.
- name: tail_sampling
  type: tail_sample
  decision_interval: 30s
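The buffering behaviour implied by a decision interval can be sketched as follows. The sketch assumes the timer runs per trace ID, starting at the first item seen for that trace, which is one reading of the description above. `TraceBuffer` is a hypothetical name, and time is passed in explicitly to keep the example deterministic.

```python
from typing import Dict, List, Tuple


class TraceBuffer:
    """Holds spans per trace ID until the decision interval has elapsed."""

    def __init__(self, decision_interval: float) -> None:
        self.decision_interval = decision_interval  # seconds
        self._first_seen: Dict[str, float] = {}
        self._spans: Dict[str, List[dict]] = {}

    def add(self, trace_id: str, span: dict, now: float) -> None:
        # The timer starts when the first item of a trace is observed.
        self._first_seen.setdefault(trace_id, now)
        self._spans.setdefault(trace_id, []).append(span)

    def due(self, now: float) -> List[Tuple[str, List[dict]]]:
        """Pop and return the traces whose interval has elapsed."""
        ready = [
            trace_id
            for trace_id, first in self._first_seen.items()
            if now - first >= self.decision_interval
        ]
        result = []
        for trace_id in ready:
            del self._first_seen[trace_id]
            result.append((trace_id, self._spans.pop(trace_id)))
        return result
```

Each trace returned by `due` would then be evaluated against the sampling policies as a whole. A longer interval gives slow spans more time to arrive at the cost of holding more data in memory.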
Cache Batch Size
Temporarily holds trace IDs with span and decision data before sampling decisions are made. The default size is 50,000.
- name: tail_sampling
  type: tail_sample
  cache:
    batch:
      size: 50000
Keep Cache
Holds trace IDs with keep decision outcomes for late-arriving traces. The default size is 20,000.
- name: tail_sampling
  type: tail_sample
  cache:
    keep:
      size: 20000
Drop Cache
Stores drop decision outcomes, ensuring consistent handling of late-arriving traces. The default size is 100,000.
- name: tail_sampling
  type: tail_sample
  cache:
    drop:
      size: 100000
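The purpose of the keep and drop caches can be illustrated with a pair of bounded lookups. This is a sketch under assumptions: the eviction strategy of the real caches is not described above, so oldest-first eviction is assumed, and `DecisionCache` and `late_span_decision` are hypothetical names.

```python
from collections import OrderedDict
from typing import Optional


class DecisionCache:
    """Bounded set of trace IDs; evicts the oldest entry when full (assumed)."""

    def __init__(self, size: int) -> None:
        self.size = size
        self._entries: "OrderedDict[str, None]" = OrderedDict()

    def put(self, trace_id: str) -> None:
        self._entries[trace_id] = None
        self._entries.move_to_end(trace_id)
        if len(self._entries) > self.size:
            self._entries.popitem(last=False)  # drop the oldest trace ID

    def __contains__(self, trace_id: object) -> bool:
        return trace_id in self._entries


def late_span_decision(
    trace_id: str, keep_cache: DecisionCache, drop_cache: DecisionCache
) -> Optional[str]:
    """Reuse an earlier decision for a span that arrives after the decision."""
    if trace_id in keep_cache:
        return "keep"
    if trace_id in drop_cache:
        return "drop"
    return None  # no remembered decision: the trace is treated as new


keep_cache = DecisionCache(size=20_000)
drop_cache = DecisionCache(size=100_000)
```

A span whose trace ID is found in either cache can follow the earlier decision. Larger caches remember decisions for longer, at the cost of memory.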
Final
The final parameter specifies whether successfully processed data items should continue to subsequent processors within the same multiprocessor node. Data items that the processor fails to process are passed to the next processor in the node regardless of this setting. You set it with a slider in the interface, which writes it to the YAML as a Boolean. It is optional and the default is false.
It is defined in YAML as follows:
- name: multiprocessor
  type: sequence
  processors:
    - type: <processor type>
      final: true
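How condition and final interact inside a sequence can be sketched as below. The sketch assumes that `final: true` means a successfully processed item does not continue to later processors; that reading is inferred from the parameter name and its default of false, not stated outright above. All class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

Item = dict


@dataclass
class Processor:
    name: str
    process: Callable[[Item], bool]  # returns True when processing succeeds
    condition: Optional[Callable[[Item], bool]] = None
    final: bool = False


def run_sequence(item: Item, processors: List[Processor]) -> List[str]:
    """Return the names of the processors that successfully handled the item."""
    handled = []
    for proc in processors:
        if proc.condition is not None and not proc.condition(item):
            continue  # condition not met: the item passes through unprocessed
        if not proc.process(item):
            continue  # processing failed: the item moves on regardless of final
        handled.append(proc.name)
        if proc.final:
            break  # assumed meaning of final: true
    return handled


first = Processor(
    name="first",
    process=lambda item: True,
    condition=lambda item: item.get("path") == "/json/view",
    final=True,
)
second = Processor(name="second", process=lambda item: True)
```

Under these assumptions, an item with `path` set to `/json/view` is handled by `first` only, while any other item skips `first` and is handled by `second`.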