Edge Delta Tail Sample Processor

The Tail Sample Processor samples incoming traces and spans based on predefined sampling policies.

Overview

The Tail Sample Processor lets you filter traces and spans by applying one or more sampling policies.

It is best used in gateway pipelines. In distributed systems where traces are collected from multiple edge clusters, spans belonging to the same parent trace can originate from different edges. Each edge may contribute spans that represent different components or microservices involved in a particular transaction or operation. Deploying the Tail Sample Processor on a gateway pipeline ensures that these spans are aggregated in one place, which gives the processor a view of the entire trace and lets it apply uniform sampling policies.

For detailed instructions on how to use multiprocessors, see Use Multiprocessors.

Configuration

nodes:
  - name: tail_sampling
    type: tail_sample
    sampling_policies:
      - name: 1s latency or higher
        policy_type: latency
        lower_threshold: 1s   

Options

Select a telemetry type

Trace is selected by default.

Condition

The condition parameter contains the conditional phrase of an OTTL statement. It restricts the processor to data items that meet the condition. Data items that do not match the condition are passed through without processing. You configure the condition in the interface and the OTTL condition is generated for you. The parameter is optional. You can select one of the following operators:

Operator | Name | Description | Example
== | Equal to | Returns true if both values are exactly the same | attributes["status"] == "OK"
!= | Not equal to | Returns true if the values are not the same | attributes["level"] != "debug"
> | Greater than | Returns true if the left value is greater than the right | attributes["duration_ms"] > 1000
>= | Greater than or equal | Returns true if the left value is greater than or equal to the right | attributes["score"] >= 90
< | Less than | Returns true if the left value is less than the right | attributes["load"] < 0.75
<= | Less than or equal | Returns true if the left value is less than or equal to the right | attributes["retries"] <= 3
matches | Regex match | Returns true if the string matches a regular expression | isMatch(attributes["name"], ".*\\.name$")

It is defined in YAML as follows:

- name: _multiprocessor
  type: sequence
  processors:
  - type: <processor type>
    condition: attributes["request"]["path"] == "/json/view"
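
Any operator from the table can be used in the same position. The following is a sketch only: it reuses the duration_ms example from the table and assumes that attribute exists on the incoming data.

- name: _multiprocessor
  type: sequence
  processors:
  - type: <processor type>
    # Only items whose duration_ms attribute exceeds 1000 are processed;
    # all other items pass through without processing.
    condition: attributes["duration_ms"] > 1000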

Sampling Policies

You define one or more policies, any of which can trigger sampling (OR logic). Enter a Policy Name to distinguish each policy from the others, then choose a Policy Type. Each policy type is tailored to specific sampling criteria and conditions:

      - name: tail_sampling
        type: tail_sample
        sampling_policies:
          - name: policy_1
            ...
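
As an illustration of the OR logic, the sketch below lists two policies in one node. A trace is sampled if it matches either policy. The policy names, the threshold, and the percentage are illustrative assumptions, and the fields used are the ones documented for the Latency and Probabilistic policy types.

      - name: tail_sampling
        type: tail_sample
        sampling_policies:
          # Sampled if the trace latency is at least 5s ...
          - name: slow_traces
            policy_type: latency
            lower_threshold: 5s
          # ... or if the trace falls in a 10% probabilistic sample.
          - name: baseline_sample
            policy_type: probabilistic
            percentage: 10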

Probabilistic

  • Percentage: Specifies the probability of a trace being sampled. A value of 100 means that all traces meeting the criteria are sampled, while 0 means none are sampled.
  • Hash Salt: A string used to calculate the hash for sampling evaluation.
      - name: policy_1
        policy_type: probabilistic
        hash_salt: some_salt
        percentage: 50

Latency

  • Lower Threshold: Defines the minimum latency duration a trace must have to be considered for sampling.
  • Upper Threshold: Sets the maximum latency duration for trace eligibility.
      - name: policy_2
        policy_type: latency
        lower_threshold: 10s
        upper_threshold: 30s

Status Code

  • Status Codes: A list of acceptable status codes. A trace is sampled if it contains at least one span with a status code from this list. Common status codes include OK, ERROR, and UNSET.
      - name: policy_3
        policy_type: status_code
        status_codes:
          - OK
          - ERROR

Span Count

  • Minimum Span Count: The minimum number of spans a trace must contain to be eligible for sampling.
  • Maximum Span Count: The maximum allowable number of spans in a trace for it to qualify.
      - name: policy_4
        policy_type: span_count
        min_span_count: 10
        max_span_count: 20

Condition

  • Conditions: Logical conditions that a trace must satisfy for it to be sampled. One span meeting this condition is sufficient to sample the entire trace.

As with the processor-level condition, you can select a field path, a value, and one of the following operators:

Operator | Name | Description | Example
== | Equal to | Returns true if both values are exactly the same | attributes["status"] == "OK"
!= | Not equal to | Returns true if the values are not the same | attributes["level"] != "debug"
> | Greater than | Returns true if the left value is greater than the right | attributes["duration_ms"] > 1000
>= | Greater than or equal | Returns true if the left value is greater than or equal to the right | attributes["score"] >= 90
< | Less than | Returns true if the left value is less than the right | attributes["load"] < 0.75
<= | Less than or equal | Returns true if the left value is less than or equal to the right | attributes["retries"] <= 3
matches | Regex match | Returns true if the string matches a regular expression | isMatch(attributes["name"], ".*\\.name$")
      - name: policy_5
        policy_type: condition
        conditions:
          - attributes["foo"] == "bar"
          - resource["service.name"] == "test-service"

Numeric Attribute

  • Attribute Key: The key associated with a numerical value to evaluate.
  • Minimum Value: The smallest acceptable value for this attribute.
  • Maximum Value: The largest acceptable value for this attribute. Together with Minimum Value, it defines the accepted range.
      - name: policy_6
        policy_type: numeric_attribute
        key: int_attribute
        min_value: 5
        max_value: 10

String Attribute

  • Attribute Key: The key linked to a string value.
  • Values: A list of string values. A trace qualifies for sampling if the attribute matches one of them.
  • Support Regex: Enables regex matching against listed values for more flexible criteria.
  • Regex Cache Size: The size of the cache used to store regex evaluation results.
      - name: policy_7
        policy_type: string_attribute
        key: some_attribute
        values:
          - please
          - match
          - this
      - name: policy_8
        policy_type: string_attribute
        key: some_other_attribute
        support_regex: true
        regex_cache_size: 200
        values:
          - .*foo.*
          - .*bar.*

Boolean Attribute

Samples traces based on the value of a boolean attribute:

  • Attribute Key: The key path associated with a boolean value.
  • Value: The boolean value to match for sampling.
      - name: policy_9
        policy_type: boolean_attribute
        key: bool_attribute
        value: true

And

This option nests policies under the logical operator AND. You define a collection of sub-policies, all of which must be satisfied for a trace to be sampled.

      - name: policy_10
        policy_type: and
        sub_policies:
          - name: sub_policy_1
            policy_type: probabilistic
            percentage: 20
          - name: sub_policy_2
            policy_type: string_attribute
            key: key_a
            values:
              - a
              - b
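
As a further sketch, the same structure can narrow sampling to traces that are both slow and errored. The policy names and the threshold are illustrative assumptions. The sub-policy fields are the ones documented for the Latency and Status Code policy types.

      - name: slow_and_errored
        policy_type: and
        sub_policies:
          # Both sub-policies must match for the trace to be sampled.
          - name: slow
            policy_type: latency
            lower_threshold: 2s
          - name: errored
            policy_type: status_code
            status_codes:
              - ERROR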

Drop

Similar to And, but it identifies traces to exclude explicitly rather than traces to sample.

      - name: policy_11
        policy_type: drop
        sub_policies:
          - name: sub_policy_1
            policy_type: latency
            lower_threshold: 20s
            upper_threshold: 40s
          - name: sub_policy_2
            policy_type: condition
            conditions:
              - resource["service.name"] == "some-service"

Advanced Settings

Decision Interval

The interval after which the processor decides whether spans with the same trace ID should be sampled. The timer starts when the processor observes the first item of a trace.

  - name: tail_sampling
    type: tail_sample
    decision_interval: 30s

Cache Batch Size

The batch cache temporarily holds trace IDs with their span and decision data before sampling decisions are made. The default size is 50,000.

  - name: tail_sampling
    type: tail_sample
    cache:
      batch:
        size: 50000

Keep Cache

The keep cache holds the trace IDs of traces that received a keep decision, so that late-arriving spans of those traces are handled consistently. The default size is 20,000.

  - name: tail_sampling
    type: tail_sample
    cache:
      keep:
        size: 20000

Drop Cache

The drop cache stores drop decision outcomes, ensuring consistent handling of late-arriving traces. The default size is 100,000.

  - name: tail_sampling
    type: tail_sample
    cache:
      drop:
        size: 100000
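
The advanced settings above are shown one at a time. The sketch below sets them together on one node, assuming the three cache sizes can share a single cache block, as their key paths suggest. The cache sizes are the documented defaults and the decision interval is the example value used above.

  - name: tail_sampling
    type: tail_sample
    # Wait 30s after the first item of a trace before deciding.
    decision_interval: 30s
    cache:
      batch:
        size: 50000    # traces awaiting a decision
      keep:
        size: 20000    # remembered keep decisions
      drop:
        size: 100000   # remembered drop decisions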

Final

The final parameter specifies whether successfully processed data items should continue to subsequent processors within the same multiprocessor node. Data items that the processor fails to process are passed to the next processor in the node regardless of this setting. In the tool you set it with a slider, which writes it to the YAML as a Boolean. The parameter is optional and defaults to false.

It is defined in YAML as follows:

- name: multiprocessor
  type: sequence
  processors:
    - type: <processor type>
      final: true
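
For context, the sketch below places a tail sample processor inside a multiprocessor node and marks it as final. It assumes that the processor's own options (here, the sampling policy from the Configuration section) sit alongside type and final in the same processor entry, and that final: true stops successfully processed items from continuing to later processors in the node.

- name: multiprocessor
  type: sequence
  processors:
    - type: tail_sample
      sampling_policies:
        - name: 1s latency or higher
          policy_type: latency
          lower_threshold: 1s
      # Assumed behavior: items processed here skip any later processors in this node.
      final: true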