Edge Delta Cluster Processor

Find patterns in logs.

Overview

This processor type finds patterns in logs and then groups (or clusters) those patterns by similarity. Its results populate the Patterns page.

Most users, especially new users, already have default processors configured for their account. If your account does not have any existing monitors, however, the Patterns page will be empty.

Example

cluster:
    name: clustering
    num_of_clusters: 100
    samples_per_cluster: 20
    reporting_frequency: 30s
    retention: 10m
    cpu_friendly: true
    throttle_limit_per_sec: 200
    filters:
      - info

Required Parameters

name

Enter a descriptive label for this processor.

When you create a workflow, you reference this processor by this label.

name: clustering
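
As an illustration, a workflow could reference the processor by its label roughly as follows. The workflow name, input label, and destination label are hypothetical, and the workflows layout itself is an assumption about the surrounding configuration file rather than something this page specifies, so check it against your own configuration.

```yaml
# Sketch only: my_workflow, my_input, and my_destination are hypothetical names,
# and the workflows layout is an assumption, not taken from this page.
workflows:
  my_workflow:
    input_labels:
      - my_input
    processors:
      - clustering   # the value of the processor's name parameter
    destinations:
      - my_destination
```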

num_of_clusters

This parameter sets the maximum number of clusters to generate for an input.

num_of_clusters: 100

reporting_frequency

This parameter sets how often clustering results are sent to a streaming destination. These results include patterns and samples.

reporting_frequency: 1m
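
Taken together, the three required parameters are enough for a minimal definition. The sketch below reuses the values shown on this page and assumes that every optional parameter is left at its default.

```yaml
# Sketch: a cluster processor defined with the required parameters only.
cluster:
    name: clustering          # label used to reference the processor
    num_of_clusters: 100      # maximum number of clusters per input
    reporting_frequency: 1m   # how often results are sent
```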

Optional Parameters

cpu_friendly

This parameter enables CPU rate limiting. When it is enabled, the agent reads the soft_cpu_limit parameter from Agent Settings and drops a percentage of events to keep the agent’s CPU usage below that limit. By default, this parameter is disabled.

This parameter only applies to users who have more than 1,000 logs per second.

Analyzing patterns in high-volume log environments can strain your computing resources. As a result, by default, this processor type processes only 200 logs per source (such as a container or file) per second.

You can change this setting with the cpu_friendly and throttle_limit_per_sec parameters.

cpu_friendly: true
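
The two ways of controlling load described above might look like the following sketch. The two definitions are alternatives, separated here as two YAML documents, and the value 500 is an arbitrary illustration rather than a recommendation.

```yaml
# Option A (sketch): let the agent shed load based on soft_cpu_limit.
# throttle_limit_per_sec is ignored when cpu_friendly is enabled.
cluster:
    name: clustering
    num_of_clusters: 100
    reporting_frequency: 30s
    cpu_friendly: true
---
# Option B (sketch): leave cpu_friendly disabled (the default) and raise the
# per-source limit instead. 500 is an arbitrary illustrative value.
cluster:
    name: clustering
    num_of_clusters: 100
    reporting_frequency: 30s
    throttle_limit_per_sec: 500
```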

filters

Enter an existing filter to add to this input. To learn how to create a filter, see Filters.

filters:
  - extract_severity

include_pattern_info_in_cluster_sample

Enter true to include pattern information (pattern, pattern count, sentiment score) as tags in the cluster sample. The default value is false.

include_pattern_info_in_samples: true

retention

This parameter is a Go duration string that sets a cluster’s retention period. Clusters that do not receive any new logs within the retention period will be dropped and will no longer be reported until matching logs appear again. For example, if you set this parameter to 10m, then clusters without new logs for the last 10 minutes will be dropped. The default retention period is 1 hour (1h).

retention: 30m
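
Go duration strings combine a number with a unit suffix such as s, m, or h, and several units can be chained. The values below are illustrative only, not recommendations.

```yaml
# Each line is an alternative; use only one retention entry per processor.
retention: 90s      # 90 seconds
# retention: 10m    # 10 minutes
# retention: 1h30m  # 1 hour 30 minutes
```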

samples_per_cluster

This parameter sets the number of sample events to report when providing cluster details.

samples_per_cluster: 20

throttle_limit_per_sec

This parameter sets a limit on the number of logs that can be clustered per second from a single source. If the cpu_friendly parameter is enabled, then this parameter will be ignored.

throttle_limit_per_sec: 200