Monitors

Monitors in the Edge Delta back end trigger notifications based on event thresholds and provide status management across all pipelines.


Overview

Edge Delta provides two complementary approaches to threshold-based alerting:

Approach          | Location                | Scope           | Use Case
Pipeline Triggers | Edge (within pipelines) | Single pipeline | Low-latency alerts for pipeline-specific conditions
Monitors          | Centralized back end    | All pipelines   | Cross-pipeline correlation and aggregated thresholds

This page covers Monitors, which are centralized back-end components. For edge-based alerting within individual pipelines, see Pipeline Triggers.

Monitors vs Pipeline Triggers

The following diagram illustrates how both alerting approaches work together:

A workload generates logs and metrics. Data flows through the pipeline where it can take two paths:

  1. Edge path (Pipeline Triggers): Metrics flow to a threshold node within the pipeline. If conditions are met, a signal is sent directly to a trigger destination (webhook, Slack, etc.). These alerts are specific to that single pipeline and provide the lowest latency response.

  2. Central path (Monitors): Data flows to the Edge Delta Destination, which archives logs, metrics, and patterns in the Edge Delta back end. Monitors then evaluate this aggregated data across all pipelines, enabling cross-pipeline correlation and organization-wide thresholds.

When to Use Monitors

Monitors are ideal when you need to:

  • Aggregate across pipelines: Detect patterns that span multiple environments, clusters, or services
  • Correlate cross-telemetry data: Combine metrics, logs, and patterns in a single evaluation
  • Monitor agent health: Track downed agents, crash loops, or pipeline issues across your fleet
  • Set organization-wide thresholds: Define alerts based on total error rates, combined throughput, or other aggregated metrics

Monitors generate signals that can be sent to third-party notification tools such as Teams, PagerDuty, or Slack. Unlike pipeline triggers, these alerts may represent conditions across multiple pipelines rather than a single source.

See Quickstart: Create a Monitor for a step-by-step guide to creating your first monitor.

Preparing Data for Cross-Telemetry Monitoring

Effective monitoring across multiple telemetry types requires proper data preparation in your Telemetry Pipelines. Structure and enrich your data to enable correlation and context-aware alerting.

Add Context with Tags

Use the Add Field processor to enrich telemetry with contextual tags. Apply tags selectively based on data relevance to enable precise filtering and correlation:

- type: ottl_transform
  metadata: '{"type":"add-field","name":"Add Environment Tag"}'
  condition: resource["k8s.namespace.name"] == "production"
  statements: set(attributes["environment"], "prod")

Tags create common dimensions across logs, metrics, and traces, enabling monitors to correlate events from different sources.

Extract Metrics from Logs

Convert log-embedded values into time-series metrics using the Extract Metric processor. For example, parse CPU utilization from system logs:

- type: extract_metric
  metadata: '{"name":"Extract CPU Metrics"}'
  extract_metric_rules:
  - name: system_cpu_usage
    description: CPU utilization from system logs
    unit: "%"
    gauge:
      value: Double(attributes["cpu_usage"])
    condition: attributes["log_type"] == "system"

Extracted metrics can be monitored alongside native metrics, providing unified visibility across all telemetry sources.

Monitor List

Click Monitors and select the List tab.

Monitor List page

The Monitor List page lists all existing monitors and their current state. It displays the following columns:

  • Priority: The priority level of the monitor.
  • Status: The current state of the monitor.
  • Snoozed: Indicates if the monitor is currently snoozed.
  • Name: The name of the monitor.
  • Type: The monitor type (Metric Threshold, Log Threshold, etc.).
  • Actions: Options to manage the monitor.

A monitor’s Status can be one of four states:

  • Alert: The monitor has detected a breach of its configured Alert threshold.
  • Warning: The monitor has detected a breach of its configured Warning threshold.
  • OK: The monitor has not detected a breach of any configured thresholds. This includes cases where source data exists but none meets the monitor’s conditions.
  • No Data: The monitor enters this state when:
    • there is no data at all for the entire evaluation window, or
    • “Require Full Window” is enabled and more than 25% of the data is missing within the evaluation window.
    When “Require Full Window” is disabled, the monitor enters the No Data state only if every data bucket in the window is missing. See the sketch after this list for how these rules combine with the threshold checks.
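The interaction between thresholds, missing data, and the “Require Full Window” setting is easier to see in code. The following is a minimal Python sketch of the rules described above; the bucket representation, the per-bucket threshold comparison, and the monitor_status function are illustrative assumptions, not Edge Delta’s implementation.

from typing import List, Optional

def monitor_status(
    buckets: List[Optional[float]],   # one value per evaluation bucket; None means the bucket is missing
    alert_threshold: float,
    warning_threshold: float,
    require_full_window: bool,
) -> str:
    """Return "Alert", "Warning", "OK", or "No Data" for a threshold-style monitor."""
    missing_ratio = sum(1 for b in buckets if b is None) / len(buckets)

    # No Data: every bucket is missing, or more than 25% of buckets are missing
    # while "Require Full Window" is enabled.
    if missing_ratio == 1.0 or (require_full_window and missing_ratio > 0.25):
        return "No Data"

    present = [b for b in buckets if b is not None]
    if any(v >= alert_threshold for v in present):
        return "Alert"
    if any(v >= warning_threshold for v in present):
        return "Warning"
    return "OK"

# Half of the window is missing, so with Require Full Window enabled the monitor reports No Data.
print(monitor_status([120.0, None, None, 80.0], 100, 90, require_full_window=True))   # No Data
# With it disabled, the remaining buckets are evaluated and the 120.0 value breaches the Alert threshold.
print(monitor_status([120.0, None, None, 80.0], 100, 90, require_full_window=False))  # Alert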

Triggered List

Click Monitors and select the Triggered tab.


The Triggered tab shows monitors in the Alert, No Data, or Warning state. It displays the following columns:

  • Status: The current state of the monitor.
  • Snoozed: Indicates if the monitor is currently snoozed.
  • Name: The name of the monitor.
  • Type: The monitor type (Metric Threshold, Log Threshold, etc.).
  • Group: The group-by dimension values that triggered the alert.
  • Triggered: When the monitor entered its current state.
  • Actions: Options to manage the monitor.

Manage Monitors

Click the kebab menu (three vertical dots) in the Actions column to edit a monitor, mute it, or view the events that match its monitoring criteria.

Monitor actions menu

Monitor Limits

Monitors have the following operational limits:

Group By Limits

When you use Group By to split monitor evaluations by dimensions such as k8s.pod.name or service.name, only a limited number of unique group combinations can trigger alerts simultaneously within the same minute:

Monitor Type     | Maximum Unique Group-By Combinations
Metric Threshold | 50
Log Threshold    | 50
Metric Change    | 50
Pattern Anomaly  | 30

For example, if you have a Log Threshold monitor grouped by k8s.pod.name and 100 pods are sending data that would trigger an alert, only the top 50 will fire alerts at the same time.
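As a rough illustration of how this cap behaves, here is a short Python sketch using the limits from the table above. The GROUP_BY_ALERT_LIMITS map, the alerting_groups helper, and the ranking used to pick the “top” groups are assumptions made for illustration only.

# Per-type caps from the table above.
GROUP_BY_ALERT_LIMITS = {
    "metric_threshold": 50,
    "log_threshold": 50,
    "metric_change": 50,
    "pattern_anomaly": 30,
}

def alerting_groups(monitor_type, breaching_groups):
    """Cap the group-by combinations that can fire alerts in the same minute.

    How Edge Delta ranks the "top" groups is not described above, so this
    sketch simply keeps the first N after sorting.
    """
    limit = GROUP_BY_ALERT_LIMITS[monitor_type]
    return sorted(breaching_groups)[:limit]

# 100 pods breach a Log Threshold monitor, but only 50 alerts fire at the same time.
pods = [f"k8s.pod.name:pod-{i:03d}" for i in range(100)]
print(len(alerting_groups("log_threshold", pods)))  # 50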

Group Value Length Limit

When you configure a monitor with Group By dimensions, Edge Delta concatenates those values into a single string like k8s.pod.name:my-pod-name,service.name:payment-service. This combined string is limited to 942 bytes. If your combined group-by values exceed this limit, they are truncated in the Monitor Status page and in notifications.
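To estimate whether your own group-by values risk truncation, a small check like the following Python sketch can help. The group_by_key helper and its exact truncation behavior are hypothetical; only the dimension:value concatenation format and the 942-byte limit come from the description above.

def group_by_key(group_values, max_bytes=942):
    """Build the concatenated group-by string and truncate it at the byte limit."""
    key = ",".join(f"{dim}:{val}" for dim, val in group_values.items())
    encoded = key.encode("utf-8")
    if len(encoded) <= max_bytes:
        return key
    # Oversized keys appear truncated on the Monitor Status page and in notifications.
    return encoded[:max_bytes].decode("utf-8", errors="ignore")

print(group_by_key({"k8s.pod.name": "my-pod-name", "service.name": "payment-service"}))
# k8s.pod.name:my-pod-name,service.name:payment-service  (well under the 942-byte limit)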


Slack Notification Limit

When sending notifications to Slack, the message body is truncated to 3,000 characters. If the message exceeds this limit, it is truncated and appended with ... *(truncated due to slack character limit)*.
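A minimal Python sketch of this behavior is shown below. The fit_slack_body helper is hypothetical, and whether the appended notice counts toward the 3,000-character limit is not specified above; this sketch keeps the final message within it.

SLACK_BODY_LIMIT = 3000
TRUNCATION_NOTICE = "... *(truncated due to slack character limit)*"

def fit_slack_body(body):
    """Trim a notification body to Slack's limit and append the truncation notice."""
    if len(body) <= SLACK_BODY_LIMIT:
        return body
    return body[: SLACK_BODY_LIMIT - len(TRUNCATION_NOTICE)] + TRUNCATION_NOTICE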