Edge Delta Agent Settings
Overview
You can configure a number of global settings in the Pipeline configuration v3 YAML file. These are contained in the settings section.
Pipeline configuration v2 uses the settings described in the v2 Agent Settings section.
Example
version: v3
settings:
  tag: prod
  log:
    level: debug
  persisting_cursor_settings:
    path: /var/edgedelta/pos
    file_name: cursor_file.json
    flush_interval: 1m
  archive_flush_interval: 5m
  archive_max_byte_limit: "16MB"
  source_discovery_interval: 5s
  anomaly_tolerance: 0.1
  anomaly_confidence_period: 1m
  skip_empty_intervals: false
  only_report_nonzeros: false
  anomaly_coefficient: 10.0
  item_buffer_flush_interval: 5s
  item_buffer_max_byte_limit: 1MiB
  multiline_max_size: 250
  multiline_max_byte_size: "10KB"
  max_incomplete_line_buffer_size: "10KB"
  metric_column_opts:
    drop_columns:
      - name: docker_id
        metric_categories:
          - incoming_outgoing
      - name: labels.*
        exceptions:
          - labels.app.value
          - labels.somefield.*
Parameters
anomaly_coefficient
The anomaly_coefficient parameter is a multiplier, between 0 and 100, that is applied to the final anomaly score. The higher the coefficient, the higher the anomaly score. The default is 10. It can be set at the node level and/or dimension group level for some log_to_metric nodes. It is optional.
settings:
  tag: prod
  anomaly_coefficient: 12
anomaly_confidence_period
The anomaly_confidence_period parameter defines the period during which anomaly score calculations are ignored after a source is discovered. This helps prevent noise associated with a new source. It is specified as a duration and the default is 30m. It can be set at node level and/or dimension group level for some log_to_metric nodes. It is optional.
settings:
  tag: prod
  anomaly_confidence_period: 1m
anomaly_tolerance
The anomaly_tolerance parameter configures anomaly sensitivity. When anomaly_tolerance is non-zero, anomaly scores are handled better in edge cases where the standard deviation is small. The default is 0.01. It can be set at node level and/or dimension group level for some log_to_metric nodes. It can also be set in the global agent settings. It is optional.
nodes:
  - name: <node name>
    type: log_to_metric
    pattern: <regex pattern>
    anomaly_tolerance: 0.03

settings:
  tag: prod
  anomaly_tolerance: 0.2
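The three anomaly settings described so far can each be set globally and, for some log_to_metric nodes, on the node itself. The fragment below is a sketch of that combination. It assumes that anomaly_coefficient and anomaly_confidence_period sit at the same level of the node as anomaly_tolerance does in the example above, and that a node-level value takes precedence over the global one; neither assumption is stated explicitly here, so verify against your agent version.

```yaml
settings:
  tag: prod
  # Global values, applied where a node does not set its own
  anomaly_coefficient: 10
  anomaly_confidence_period: 30m
  anomaly_tolerance: 0.01

nodes:
  - name: <node name>
    type: log_to_metric
    pattern: <regex pattern>
    # Node-level values (assumed to override the global ones for this node)
    anomaly_coefficient: 12
    anomaly_confidence_period: 1m
    anomaly_tolerance: 0.03
```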
archive_flush_interval
The archive_flush_interval parameter defines the interval at which logs are flushed and sent to archive destinations. The default value is 30m. It is optional.
settings:
  tag: prod
  archive_flush_interval: 30m
archive_max_byte_limit
The archive_max_byte_limit parameter defines the maximum number of bytes to buffer in memory before an archive flush is triggered. When either archive_flush_interval or archive_max_byte_limit is reached, the agent flushes the buffered raw logs to the configured archive destinations. The default byte size limit is 16MB. It is optional.
settings:
  tag: prod
  archive_max_byte_limit: "16MB"
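Because a flush happens when either limit is reached, the two archive settings are usually tuned together. The fragment below is a sketch with illustrative values, not recommendations.

```yaml
settings:
  tag: prod
  # Flush buffered raw logs to archive destinations every 10 minutes...
  archive_flush_interval: 10m
  # ...or sooner, once 32MB of raw logs has been buffered in memory
  archive_max_byte_limit: "32MB"
```

With these values, a quiet source is archived on the 10-minute timer, while a burst of logs that fills the buffer triggers a flush before the timer expires.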
item_buffer_flush_interval
The item_buffer_flush_interval parameter defines the interval after which item buffers will flush their contents. It is specified as a duration and the default is 5s.
settings:
  tag: prod
  item_buffer_flush_interval: 5s
item_buffer_max_byte_limit
The item_buffer_max_byte_limit parameter defines the size limit that will trigger an item buffer flush. It is specified as a string and the default is 1MiB.
settings:
  tag: prod
  item_buffer_max_byte_limit: 2MiB
log
The log parameter sets the minimum severity level that the agent writes to its own log file. You use this log file to troubleshoot the agent itself. The configured level and all more severe levels are included, so choosing a less severe level increases the log volume. It is optional.
Specify one of the following levels, listed in increasing order of severity:
- debug
- info
- warn
- error
- fatal
settings:
  tag: prod
  log:
    level: debug
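For routine operation a less verbose level is often enough. Going by the severity ordering listed above, the sketch below would record warn, error, and fatal entries and leave out debug and info.

```yaml
settings:
  tag: prod
  log:
    # warn and the more severe levels (error, fatal) are written to the agent log
    level: warn
```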
max_incomplete_line_buffer_size
The max_incomplete_line_buffer_size parameter defines the maximum amount of data that a buffered line separator can hold. This is useful when receiving large, JSON-formatted inputs. When a single line is larger than 10KB, the line_pattern parameter can be used to separate inputs into valid JSON objects. It is specified as a string and the default value is 10KB.
settings:
  tag: prod
  max_incomplete_line_buffer_size: "20KB"
metric_column_opts
The metric_column_opts parameter defines options for metric columns. Currently only column dropping (drop_columns) is supported. The drop_columns parameter defines metric columns that will not be sent to metric outputs. This can be used to reduce high cardinality issues. It supports prefix matching with a wildcard * as the terminating character.
settings:
  tag: prod
  metric_column_opts:
    drop_columns:
      - name: docker_id
        metric_categories:
          - incoming_outgoing
      - name: labels.*
        exceptions:
          - labels.app.value
          - labels.somefield.*
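To make the prefix matching concrete, the annotated fragment below repeats the labels.* rule and lists how a few column names would be treated. The column names labels.version and labels.somefield.id are hypothetical, and reading exceptions as "keep these columns even though they match name" is an assumption based on the field name rather than something stated above.

```yaml
settings:
  tag: prod
  metric_column_opts:
    drop_columns:
      # Prefix match: any column whose name starts with "labels."
      - name: labels.*
        exceptions:
          - labels.app.value     # exact name, assumed to be kept
          - labels.somefield.*   # prefix, assumed to keep every column under it
# Expected treatment under these assumptions:
#   labels.version       -> dropped (matches labels.*; hypothetical column)
#   labels.app.value     -> kept (listed in exceptions)
#   labels.somefield.id  -> kept (matches the exception prefix; hypothetical column)
#   docker_id            -> kept (no rule in this fragment matches it)
```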
multiline_max_size
The multiline_max_size parameter defines the multiline buffer size as a number of lines. Increase this limit for overflow cases in which all buffered lines would otherwise be dumped as a single line. It is specified as an integer and is optional.
settings:
  tag: prod
  multiline_max_size: 250
multiline_max_byte_size
The multiline_max_byte_size parameter defines the multiline buffer size in bytes. Increase this maximum byte limit for overflow cases in which all buffered lines are dumped as a single line. It is specified as a data size string and the default is 10KB. It is optional.
settings:
  tag: prod
  multiline_max_byte_size: "20KB"
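The two multiline limits can be raised together when long multiline entries are being cut off. The fragment below is a sketch with illustrative values; it assumes the buffer overflows as soon as either limit is exceeded, which is not stated explicitly above.

```yaml
settings:
  tag: prod
  # Allow up to 500 lines in the multiline buffer
  multiline_max_size: 500
  # Allow up to 50KB of data in the multiline buffer
  multiline_max_byte_size: "50KB"
```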
only_report_nonzeros
The only_report_nonzeros parameter configures the agent to report only non-zero statistics. It is a Boolean value and the default is true. It can be set at node level and/or dimension group level for some log_to_metric nodes. It is optional.
settings:
  tag: prod
  only_report_nonzeros: false
persisting_cursor_settings
The persisting_cursor_settings parameter configures a persisting cursor for environments where no data can be lost during agent restarts. It is optional.
- path is the folder where the cursor file will be created.
- file_name is the name of the cursor file.
- flush_interval is the interval at which the cursor is saved from memory to the file.
settings:
  tag: prod
  persisting_cursor_settings:
    path: /var/edgedelta/pos
    file_name: cursor_file.json
    flush_interval: 1m
skip_empty_intervals
The skip_empty_intervals parameter skips empty intervals so that anomaly scores are calculated based on a rolling history of non-zero intervals. It is a Boolean value and the default is true. It can be set at node level and/or dimension group level for some log_to_metric nodes. It is optional.
settings:
  tag: prod
  skip_empty_intervals: true

nodes:
  - name: <node name>
    type: log_to_metric
    pattern: <regex pattern>
    skip_empty_intervals: true
source_discovery_interval
The source_discovery_interval parameter configures the duration after which source discovery is invoked. The default value is 5s and it is optional.
settings:
  tag: prod
  source_discovery_interval: 5s
tag
The tag parameter labels the environment in which the agent is installed to help identify it, for example prod_us_west_2_cluster. A custom value is recommended. It is specified as a string and is required.
settings:
  tag: prod_us_west_2_cluster