Edge Delta Thresholds

Define alerting conditions at the agent level.

Overview

A threshold defines alerting conditions at the agent level. Each agent evaluates its thresholds locally and triggers an alert when a threshold is met. To receive these alerts, add alert destinations such as Slack, PagerDuty, or email to the same workflow as the threshold.

There are two ways to define a threshold:

Processor-level thresholds

Most processors support the trigger_thresholds parameter to define thresholds.

Workflow-level thresholds

Defining a threshold in a workflow is the more flexible option: it supports several comparison operators and regex-based metric name matching.

Configuring a Threshold

See the instructions for configuring an agent.

Example

thresholds:
  - name: http-latencyp95-threshold
    metric_name_pattern: http_request_method_.*_latency\.p95  
    operator: ">"
    value: 120
  - name: http-avg-threshold
    metric_name: http_request_method_getconfig_latency.avg    
    operator: ">="
    value: 50
  - name: cluster-errors-threshold
    metric_name: error.anomaly1
    operator: ">"
    value: 80
  - name: incoming-lines-threshold
    metric_name: incoming_lines.anomaly1
    operator: ">"
    value: 90
  - name: incoming-bytes-threshold
    metric_name: incoming_bytes.anomaly2
    operator: ">"
    value: 90
  - name: consecutive-bytes-threshold
    metric_name: incoming_bytes.anomaly2
    operator: ">"
    value: 90
    consecutive: 5

Multi-Condition Threshold

  - name: cluster-errors-multi-threshold
    type: and
    interval: 1m
    conditions:
    - metric_name: http_request_method_updateconfig_latency.avg
      operator: ">="
      value: 100
    - metric_name: http_request_method_deleteconfig_latency.max
      operator: ">"
      value: 125
      consecutive: 5

Parameters

name

Required

Enter a descriptive name for the threshold, which will be used to map this threshold to a workflow.

name: consecutive-bytes-threshold

type

Optional

This parameter only applies to thresholds with multiple conditions.

Enter the value and to combine multiple conditions within a single threshold.

type: and

interval

Optional

This parameter only applies to thresholds with multiple conditions.

Enter the length of time after which the condition states (triggered and not triggered) are flushed.

interval: 1m

metric_name

Optional

This parameter is the exact name of the metric to be evaluated. Metric names are generated based on processor names.

You must enter a metric_name or metric_name_pattern, but not both.

metric_name: incoming_lines.anomaly1
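The "one or the other, but not both" rule can be checked before a configuration is deployed. The following is a minimal sketch in Python, not part of the agent: the function name check_metric_selector is hypothetical, and the threshold entry is assumed to have already been loaded into a dictionary.

```python
def check_metric_selector(entry: dict) -> None:
    """Raise ValueError unless exactly one metric selector is present.

    `entry` is a single threshold (or a single condition) as a dict.
    This helper is hypothetical and only mirrors the rule stated above.
    """
    has_name = "metric_name" in entry
    has_pattern = "metric_name_pattern" in entry
    # Both present, or both missing, violates the rule.
    if has_name == has_pattern:
        raise ValueError(
            "set exactly one of metric_name or metric_name_pattern"
        )


# Valid: only metric_name is set.
check_metric_selector({"name": "incoming-lines-threshold",
                       "metric_name": "incoming_lines.anomaly1",
                       "operator": ">", "value": 90})
```

Passing an entry with both keys, or with neither, raises ValueError in this sketch.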

metric_name_pattern

Optional

This parameter is the regular expression that will be used to match the metric names.

You must enter a metric_name or metric_name_pattern, but not both.

metric_name_pattern: http_request_method_.*_latency\.p95
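To see which metric names a pattern would select, you can try it against a few candidate names. The sketch below uses Python's re module as a stand-in. The metric names are made up for illustration, and whole-name matching is an assumption, because the agent's regex engine and anchoring behavior are not described here.

```python
import re

# Pattern copied from the example above.
PATTERN = re.compile(r"http_request_method_.*_latency\.p95")

# Hypothetical metric names used only for this illustration.
names = [
    "http_request_method_getconfig_latency.p95",
    "http_request_method_updateconfig_latency.p95",
    "http_request_method_getconfig_latency.avg",
    "http_request_method_getconfig_latencyXp95",
]

# Assumption: the whole metric name must match the pattern.
matched = [n for n in names if PATTERN.fullmatch(n)]
print(matched)
```

Only the first two names match. The escaped dot (\.) matches a literal period, so the last name is rejected, and the .avg metric is rejected because the pattern ends in p95.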

operator

Optional

This parameter supports the following operators:

  • ==
  • >
  • >=
  • <
  • <=

operator: ">"
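The comparison can be pictured as "metric value, operator, threshold value". The following Python sketch illustrates that reading. It is not the agent's implementation, the function name is_met is hypothetical, and the operand order is an assumption based on the examples on this page.

```python
import operator

# Maps each documented operator string to a Python comparison function.
OPERATORS = {
    "==": operator.eq,
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
}


def is_met(metric_value: float, op: str, threshold_value: float) -> bool:
    """Return True when `metric_value <op> threshold_value` holds."""
    return OPERATORS[op](metric_value, threshold_value)


# With operator ">" and value 120, a metric value of 130 meets the condition.
print(is_met(130, ">", 120))
```

Note the difference between ">" and ">=": a metric value equal to the threshold value meets the condition only with ">=".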

value

Optional

This parameter is the threshold value used to compare with the metric value, based on the specified operator.

value: 90

consecutive

Optional

This parameter is the number of times in a row that a threshold condition must be met to trigger an alert.

The default value is 0, which means that an alert triggers each time the threshold condition is met.

consecutive: 5
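The effect of consecutive can be illustrated with a simple streak counter. This Python sketch is an illustration only: the function name alert_points and the sample values are hypothetical, the ">" operator is hard-coded, and it assumes that the streak resets when the condition is not met and that alerts keep firing while the streak continues. The agent's actual behavior may differ in those details.

```python
def alert_points(samples, value, consecutive=0):
    """Return the indexes of samples at which an alert would fire.

    A sample meets the condition when it is greater than `value`.
    With consecutive=0 (the default), every matching sample fires.
    """
    required = max(consecutive, 1)
    streak = 0
    fired = []
    for index, sample in enumerate(samples):
        # Extend the streak on a match; reset it otherwise (assumption).
        streak = streak + 1 if sample > value else 0
        if streak >= required:
            fired.append(index)
    return fired


# Hypothetical metric values; the dip to 80 breaks the first streak.
samples = [95, 91, 80, 92, 93, 94, 95, 96, 97]
print(alert_points(samples, value=90, consecutive=5))
print(alert_points(samples, value=90))
```

With consecutive: 5, the first alert fires only at the fifth matching sample in a row after the dip. With the default, every matching sample fires.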

conditions

Optional

This parameter only applies to thresholds with multiple conditions. It lists the conditions that belong to a single threshold. Each condition accepts the following parameters:

  • metric_name (or metric_name_pattern)
  • operator
  • value
  • consecutive

conditions:
    - metric_name: http_request_method_updateconfig_latency.avg
      operator: ">="
      value: 100
    - metric_name: http_request_method_deleteconfig_latency.max
      operator: ">"
      value: 125
      consecutive: 5
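A threshold with type: and can be read as "every condition must hold". The following Python sketch shows that reading using the two conditions above. It is not the agent's implementation: the function name all_met and the metric values are hypothetical, and consecutive and interval are left out to keep the example short.

```python
import operator

OPS = {"==": operator.eq, ">": operator.gt, ">=": operator.ge,
       "<": operator.lt, "<=": operator.le}

# The two conditions from the example, as plain dictionaries.
conditions = [
    {"metric_name": "http_request_method_updateconfig_latency.avg",
     "operator": ">=", "value": 100},
    {"metric_name": "http_request_method_deleteconfig_latency.max",
     "operator": ">", "value": 125},
]


def all_met(conditions, metrics):
    """Return True only if every condition holds for the given metrics."""
    return all(
        c["metric_name"] in metrics
        and OPS[c["operator"]](metrics[c["metric_name"]], c["value"])
        for c in conditions
    )


# Hypothetical current metric values.
metrics = {
    "http_request_method_updateconfig_latency.avg": 110,
    "http_request_method_deleteconfig_latency.max": 120,
}
print(all_met(conditions, metrics))
```

In this sketch the result is False, because the second condition (max greater than 125) is not met even though the first one is.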