Edge Delta Cluster Processor
Overview
This processor type finds patterns in logs, and then groups (or clusters) these patterns based on similarities. This processor populates the Patterns page.
Most users, especially new users, will have default processors already configured for their account; however, if your account does not have any existing monitors, then the Patterns page will be empty.
Example
cluster:
  name: clustering
  num_of_clusters: 100
  samples_per_cluster: 20
  reporting_frequency: 30s
  retention: 10m
  cpu_friendly: true
  throttle_limit_per_sec: 200
  filters:
    - info
Required Parameters
name
Enter a descriptive label for this processor.
When you create a workflow, you will use this label to enter your processor into the workflow.
name: clustering
num_of_clusters
This parameter sets the maximum number of clusters to generate for an input.
num_of_clusters: 100
reporting_frequency
This parameter sets how often clustering results are sent to a streaming destination. These results include patterns and samples.
reporting_frequency: 1m
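As a minimal sketch, a processor that sets only the three required parameters might look like the following. The enclosing cluster key and the nesting are assumed from the example at the top of this page, and the values are illustrative.

cluster:
  # Label used to reference this processor in a workflow
  name: clustering
  # Upper bound on the clusters generated for an input
  num_of_clusters: 100
  # How often patterns and samples are reported
  reporting_frequency: 1m

Any optional parameter that is left out falls back to its default, where this page states one.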
Optional Parameters
cpu_friendly
This parameter enables CPU-based rate limiting. Specifically, the agent will review the soft_cpu_limit parameter from Agent Settings and drop some percentage of events to keep the agent’s CPU usage below that limit. By default, this parameter is disabled.
This parameter only applies to users who have more than 1,000 logs per second.
Analyzing patterns in high-volume log environments can strain your computing resources. As a result, by default, this processor type only processes 200 logs per source (such as a container or file) per second.
You can change this setting with the cpu_friendly and throttle_limit_per_sec parameters.
cpu_friendly: true
filters
Enter an existing filter to add to this input. To learn how to create a filter, see Filters.
filters:
  - extract_severity
include_pattern_info_in_cluster_sample
Enter true to include pattern information (pattern, pattern count, sentiment score) as tags in the cluster sample. The default value is false.
include_pattern_info_in_samples: true
retention
This parameter is a Go duration string that sets a cluster’s retention period. Clusters that do not receive any new logs within the retention period will be dropped and will not be reported again until matching logs reappear. For example, if you set this parameter to 10m, then clusters without new logs for the last 10 minutes will be dropped. The default retention period is 1 hour (1h).
retention: 30m
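Because the value is a Go duration string, units can be combined in a single value. The sketch below lists a few illustrative forms; it assumes the agent accepts any string that Go's time.ParseDuration accepts, which this page does not spell out.

# Alternatives, shown commented out so that only one retention key is set:
# retention: 90s      (90 seconds)
# retention: 45m      (45 minutes)
retention: 1h30m      # 1 hour and 30 minutes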
samples_per_cluster
This parameter sets the number of sample events to report when providing cluster details.
samples_per_cluster: 20
throttle_limit_per_sec
This parameter sets a limit on the number of logs that can be clustered per second from a single source. If the cpu_friendly parameter is enabled, then this parameter will be ignored.
throttle_limit_per_sec: 200
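As a sketch of how the two throttling settings interact, the following processor leaves cpu_friendly disabled so that the fixed per-source limit applies, and raises that limit above the default of 200. The value 500 is illustrative, and the enclosing cluster key is assumed from the example at the top of this page.

cluster:
  name: clustering
  num_of_clusters: 100
  reporting_frequency: 1m
  # Disabled (the default), so throttle_limit_per_sec is not ignored
  cpu_friendly: false
  # Cluster at most 500 logs per second from any single source
  throttle_limit_per_sec: 500

Setting cpu_friendly to true instead would make the agent drop events based on soft_cpu_limit, and the throttle_limit_per_sec line would then have no effect.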