Anomaly Detection

Edge Delta automatically detects anomalies in observability data, both within individual agents and in aggregate on the backend. Site Reliability Engineers (SREs) and developer teams can receive alerts about anomalous behavior and use views designed to help with root cause analysis. This helps reduce the time needed to detect and resolve incidents.

Anomalies in Log Patterns

Once log patterns are streamed to the Edge Delta backend, monitors can be configured to detect anomalous behavior and trigger alerts to one or more notification channels.

Edge Delta provides two types of monitors for detecting anomalies in log patterns, the Skyline Pattern monitor and the Pattern Check monitor.

Skyline Pattern Monitor

The Skyline Pattern monitor uses a proprietary 'skyline' algorithm to detect unusual spikes in logs with negative sentiment. Log patterns for a particular source (e.g. a Kubernetes namespace or controller) are analyzed in aggregate, and an alert is triggered if there is an unusual spike in either the total number of log messages with negative sentiment or the number of unique negative patterns detected.

The algorithm is tuned to reduce false positives by accounting for repeated patterns (e.g. logs that result from a daily/weekly/monthly batch job) as well as normal fluctuations in log volume (e.g. increased traffic to a website during daytime hours).
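
The skyline algorithm itself is proprietary, so the following Python is only a rough sketch of the general idea of a seasonality-aware spike check, not Edge Delta's implementation. It assumes that counts of negative-sentiment logs are available per fixed interval, and it compares the newest count with earlier counts from the same position in a repeating cycle. The function name `is_spike` and the parameters `period`, `k`, and `min_std` are hypothetical.

```python
from statistics import mean, pstdev


def is_spike(history, current, period, k=3.0, min_std=1.0):
    """Return True if `current` is unusually high for its position in the cycle.

    history: counts per interval, oldest first (hypothetical input format).
    current: count for the interval that immediately follows `history`.
    period:  number of intervals in one cycle (e.g. 24 for hourly data).
    """
    # Position of the new interval within the repeating cycle.
    phase = len(history) % period
    # Past counts observed at the same position in earlier cycles.
    same_phase = history[phase::period]
    if len(same_phase) < 2:
        return False  # not enough history to form a baseline
    baseline = mean(same_phase)
    # Floor the spread so a perfectly flat history does not flag tiny changes.
    spread = max(pstdev(same_phase), min_std)
    return current > baseline + k * spread
```

Because the baseline is taken from the same phase of earlier cycles, a count that recurs at the same time every day (such as a batch job) raises its own baseline and is not flagged, while the same count at a normally quiet time is.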

Example anomaly detected by a Skyline Pattern monitor

Pattern Check Monitor

The Pattern Check monitor performs a similar analysis to the Skyline Pattern monitor, but at the level of an individual pattern. It is useful for detecting spikes in individual patterns with negative sentiment, whether the pattern is new or already known.

Example anomaly detected by a Pattern Check monitor 

Anomalies in Metrics

After performing log-to-metric conversion, Edge Delta can detect anomalies in the data collected by individual agents as well as in data aggregated from multiple agents.

Agent Processor Alerts

The Edge Delta agent can be configured to track the value of log metrics over time, detect anomalous values, and alert you if it finds any. 

For instance, you may want to be alerted if there is an unusually high frequency of log messages containing ERROR or EXCEPTION. To determine whether the frequency is unusually high, the agent calculates an anomaly score between 0 and 100 using a proprietary algorithm. If the metric value exceeds a defined threshold during a given interval, the value is considered anomalous.
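
The agent's scoring algorithm is proprietary. As an illustration only, the sketch below shows one common way to turn a metric value into a score between 0 and 100: it takes the z-score of the value against a recent window and maps it through the normal cumulative distribution function. The function names, the window format, and the comparison of the score with a threshold of 95 (taken from the example caption in this article) are assumptions, not the agent's actual behavior.

```python
import math
from statistics import mean, pstdev


def anomaly_score(window, value):
    """Map `value` to a 0-100 score relative to recent `window` values.

    Stand-in technique: z-score passed through the normal CDF.
    This is NOT Edge Delta's proprietary algorithm.
    """
    mu = mean(window)
    sigma = pstdev(window)
    if sigma == 0:
        # Flat history: any deviation is maximally surprising in this sketch.
        if value > mu:
            return 100.0
        return 50.0 if value == mu else 0.0
    z = (value - mu) / sigma
    return 100.0 * 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def is_anomalous(window, value, threshold=95.0):
    """Flag the value when its score exceeds the (assumed) threshold."""
    return anomaly_score(window, value) > threshold


# Hypothetical per-interval counts of ERROR/EXCEPTION log lines.
recent_counts = [4, 5, 6, 5, 4, 6, 5, 5]
```

With this mapping a value equal to the window mean scores 50, and a threshold of 95 corresponds to a value roughly 1.6 standard deviations above the mean.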

Processors with anomaly detection enabled

Depending on configuration, an alert may be sent (typically a Slack message or an email), and an anomaly capture may occur, in which raw logs from around the time of the anomaly are sent to the Edge Delta backend and/or a third-party streaming destination.

Metrics Anomalies in the Edge Delta Web App

Metrics-based anomalies can be viewed in the Edge Delta web app on the Insights screen.  

Metrics-based anomalies, grouped by processor rule

Click Investigate for an anomaly to bring up an Investigation view, which provides details about why an alert was triggered as well as contextual information such as related logs and metric values leading up to the anomaly.

Anomaly detected by the `exception_monitoring` processor 

Log pattern which matched the processor criteria and contributed to anomaly score 
Metric value at the time of anomaly 

Anomaly score exceeding threshold of 95

Metrics Alert Monitors

Since many production services run across multiple hosts, it is often useful to collect metric values in aggregate from all hosts, analyze them, and trigger alerts if a threshold is exceeded.

A metrics alert monitor can be configured to trigger when the aggregated metric value or anomaly score from many agent instances exceeds a defined threshold.  Click Create Alert on the Metrics view in the web app to define a metric alert. 
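
To make the aggregation idea concrete, here is a minimal Python sketch. It assumes each agent reports samples as `(interval, host, value)` tuples, sums them per interval, and checks the total against a threshold. The tuple layout and function names are hypothetical and do not reflect Edge Delta's data model or API.

```python
from collections import defaultdict


def aggregate_by_interval(samples):
    """Sum metric values from all hosts for each interval.

    samples: iterable of (interval, host, value) tuples (assumed format).
    """
    totals = defaultdict(float)
    for interval, _host, value in samples:
        totals[interval] += value
    return dict(totals)


def breaching_intervals(samples, threshold):
    """Return the intervals whose aggregated value exceeds `threshold`."""
    totals = aggregate_by_interval(samples)
    return sorted(i for i, total in totals.items() if total > threshold)


# Hypothetical error counts from two hosts over two intervals.
example_samples = [
    (0, "host-a", 3), (0, "host-b", 4),
    (1, "host-a", 30), (1, "host-b", 40),
]
```

In the example data, neither host exceeds 50 on its own in interval 1, but the aggregate does, which is the case a per-host check would miss.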

Create Alert command to set up a Metrics Alert monitor

 

Metric alert based on Anomaly Score

When the threshold defined in a metrics alert monitor is exceeded, a notification is sent (via email, Slack, PagerDuty, etc.) with a link to an Investigation view, similar to that for a processor-detected metric anomaly.

Correlated Signal Monitor

Similar to the Skyline Pattern Monitor for log patterns, the Correlated Signal Monitor checks for instances where an unusually high number of metrics-based anomalies were detected.  

These anomalies may have been triggered by different processors and/or originate from different hosts, but the aggregate behavior is considered anomalous compared to a known baseline.
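
The following Python sketch illustrates the general shape of such a check under stated assumptions. It is not Edge Delta's implementation. Anomaly events are assumed to arrive as `(interval, processor, host)` tuples, the baseline is a simple mean and standard deviation over earlier intervals, and all names and parameters are hypothetical.

```python
from collections import Counter
from statistics import mean, pstdev


def correlated_signal(events, current_interval, baseline_intervals, k=3.0):
    """Return True if the anomaly count in `current_interval` is unusually
    high compared with the counts seen in `baseline_intervals`.

    events: list of (interval, processor, host) tuples (assumed format).
    """
    counts = Counter(interval for interval, _proc, _host in events)
    baseline = [counts.get(i, 0) for i in baseline_intervals]
    # Floor the spread at 1 so a flat baseline still tolerates small changes.
    limit = mean(baseline) + k * max(pstdev(baseline), 1.0)
    return counts.get(current_interval, 0) > limit


def group_counts(events, interval, by_host=False):
    """Count anomalies in one interval by processor, or by (processor, host)."""
    keys = ((p, h) if by_host else p for i, p, h in events if i == interval)
    return Counter(keys)


# Hypothetical events: one anomaly per interval, then a burst in interval 4.
example_events = [(i, "exception_monitoring", "host-a") for i in range(4)]
example_events += [
    (4, "exception_monitoring", "host-a"),
    (4, "exception_monitoring", "host-b"),
    (4, "exception_monitoring", "host-c"),
    (4, "latency_monitoring", "host-a"),
    (4, "latency_monitoring", "host-b"),
    (4, "latency_monitoring", "host-c"),
]
```

The two grouping modes of `group_counts` correspond to viewing the same alert by processor, or by processor and host.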


Correlated signal alert, grouped by processor

 

The same correlated signal alert, grouped by processor and host

Click Create Alert on the Insights screen to configure a Correlated Signal monitor.



