Edge Delta Log to Metric Node

Extract metrics from logs using the Edge Delta Log to Metric Node.

Overview

The Log to Metric Node evaluates the body field for matching patterns and generates metrics.

For a detailed walkthrough, see the Create Metrics from Logs page.

Example Configuration

Different types of metrics are supported:

Occurrence Count

A simple count of occurrences of logs that match the pattern, for example a metric called connection_timeout.count that counts logs containing connection timeout. The count stat is enabled by entering count and pressing Enter. In the test pane, on the Processor tab, you can drop in sample logs and view the output metric items.

YAML Version:

nodes:
- name: occurrence_count
  type: log_to_metric
  pattern: (?i)connection timeout
  interval: 5s
  skip_empty_intervals: false
  only_report_nonzeros: false
  metric_name: connection_timeout
  enabled_stats:
  - count

Input logs:

2021-09-10 12:05:00 ERROR node7 experienced a connection timeout
2021-09-10 12:06:00 WARN connection timeout while trying to reach the database node7
2021-09-10 12:07:00 INFO node7 attempting to reconnect after a connection timeout
2021-09-10 12:08:00 ERROR node7 - connection timeout during data synchronization
2021-09-10 12:09:00 DEBUG Checked connection status to node7, no timeout detected

Of these sample logs, logs 1, 2, 3 and 4 contain the phrase “connection timeout” and are counted by the occurrence count node; log 5 mentions a timeout but does not match the pattern, so the metric value is 4.

Output in Test Pane:

{
  "_type": "metric",
  "gauge": {
    "value": 4
  },
  "kind": "gauge",
  "name": "connection_timeout.count",
  "resource": {},
  "start_timestamp": 1708445624790,
  "timestamp": 1708445629790,
  "unit": "1",
  "_stat_type": "count"
}

Output in Metric Explorer: You can select the connection_timeout metric in the Metrics Explorer.

Numeric Capture

This example matches the part of a log line that contains Response Time: followed by one or more digits (\d+) and ending with ms. It captures the numeric value representing the response time in milliseconds. The captured numeric values are turned into metrics with the base name response. In the test pane, on the Processor tab, you can drop in sample logs and view the output metric items.

Numeric captures must use a named capture group with a corresponding numeric dimension.

YAML Version:

nodes:
- name: numeric_capture
  type: log_to_metric
  pattern: 'Response Time: (?P<response>\d+)ms'
  interval: 1m0s
  skip_empty_intervals: false
  only_report_nonzeros: false
  enabled_stats:
  - min
  - max
  - avg
  dimension_groups:
  - numeric_dimension: response

Input logs:

WARN Slow response detected, Response Time: 300ms
DEBUG Request finished, Response Time: 150ms
ERROR Service timeout, Response Time: 500ms
INFO Processed request with Response Time: 100ms

Output in Node Test Pane: This is one of the output metrics.

{
  "_type": "metric",
  "gauge": {
    "value": 262.5
  },
  "kind": "gauge",
  "name": "numeric_capture_response.avg",
  "resource": {},
  "start_timestamp": 1708445617679,
  "timestamp": 1708445677679,
  "unit": "1",
  "_stat_type": "avg"
}

Output in Metric Explorer: You can select the numeric_capture_response metrics in the Metrics Explorer.

Dimension Counter

If named capture groups in the regex pattern are listed as dimensions in a dimension group, then occurrence stats are generated per dimension value. In the test pane, on the Processor tab, you can drop in sample logs and view the output metric items.

YAML Version:

nodes:
- name: dimension_counter
  type: log_to_metric
  pattern: HTTP/1.1" (?P<status_code>\d{3})
  interval: 1m0s
  skip_empty_intervals: false
  only_report_nonzeros: false
  metric_name: dimension_counter
  enabled_stats:
  - count
  dimension_groups:
  - dimensions:
    - status_code

Input logs:

192.168.1.1 - - [12/Apr/2023:15:31:23 +0000] "GET /index.html HTTP/1.1" 200 4523
192.168.1.2 - - [12/Apr/2023:15:32:41 +0000] "POST /login HTTP/1.1" 401 1293
192.168.1.3 - - [12/Apr/2023:15:33:09 +0000] "GET /does-not-exist HTTP/1.1" 404 523
192.168.1.4 - - [12/Apr/2023:15:34:46 +0000] "GET /server-error HTTP/1.1" 500 642
192.168.1.1 - - [12/Apr/2023:15:35:22 +0000] "GET /contact HTTP/1.1" 200 2312

Output: This is one of the output metrics in the node test pane.

{
  "_type": "metric",
  "attributes": {
    "status_code": "500"
  },
  "gauge": {
    "value": 1
  },
  "kind": "gauge",
  "name": "dimension_counter.count",
  "resource": {
    "ed.conf.id": "87654321-1321-69874-9456-s5123456h7",
    "ed.org.id": "12345678-1s2d-6f5d4-9632-s5d3f6g9h7",
    "ed.tag": "ed_parallel",
    "host.ip": "10.0.0.1",
    "host.name": "ED_TEST",
    "src_type": ""
  },
  "start_timestamp": 1707743964047,
  "timestamp": 1707744024047,
  "unit": "1",
  "_stat_type": "count"
}

Notice the status_code dimension is saved as an attribute. You can copy this path from the node test pane to use it elsewhere, such as to specify a Prometheus Output label.

This type of metric can be viewed using a third party metrics integration. For example, suppose you have a Prometheus output with a label configured to pull its value from the status_code attribute:

nodes:
- name: prometheus_exporter_output
  type: prometheus_exporter_output
  port: 8087
  labels:
  - name: platform_label
    path: item["attributes"]["status_code"]

The Prometheus output creates a metric as follows:

edgedelta_dimension_counter_count{container="edgedelta-agent", endpoint="prom", instance="172.18.0.2:8087", job="edgedelta-metrics", namespace="edgedelta", pod="edgedelta-8pgqc", service="edgedelta-metrics", status_code_label="401"}

Note the additional label called status_code_label that contains the status_code attribute’s value.

The output in Grafana can then be viewed with these queries:

edgedelta_dimension_counter_count{status_code_label="200"}
edgedelta_dimension_counter_count{status_code_label="401"}

and so forth for 404 and 500.

Dimension Numeric Capture

If both dimension and numeric captures are defined in the regex pattern and referenced in one of the dimension groups, then numeric stats are generated per dimension value. This example captures the average delivery time and the count of orders per platform for performance monitoring and analysis.

YAML Version:

- name: dimension_numeric_capture
  type: log_to_metric
  pattern: platform=(?P<platform>\w+).*delivery_time=(?P<delivery_time>\d+)
  interval: 1m0s
  skip_empty_intervals: false
  only_report_nonzeros: false
  enabled_stats:
  - count
  - avg
  dimension_groups:
  - dimensions:
    - platform
    numeric_dimension: delivery_time

Input logs: Suppose the following logs are fed into the pipeline:

2023-04-15 12:05:23.123123 platform=mobile_app customer_id=12345 order_id=67890 delivery_time=35 order_status=delivered
2023-04-15 12:06:45.456456 platform=mobile_app customer_id=23456 order_id=78901 delivery_time=28 order_status=delivered
2023-04-15 12:07:17.789789 platform=web customer_id=34567 order_id=89012 delivery_time=42 order_status=delivered
2023-04-15 12:08:09.987987 platform=mobile_app customer_id=45678 order_id=90123 delivery_time=30 order_status=delivered
2023-04-15 12:10:22.654321 platform=web customer_id=56789 order_id=12345 delivery_time=27 order_status=delivered

Output: This is one of the output metrics in the node test pane.

{
  "_type": "metric",
  "attributes": {
    "platform": "mobile_app"
  },
  "gauge": {
    "value": 31
  },
  "kind": "gauge",
  "name": "dimension_numeric_capture_delivery_time.avg",
  "resource": {
    "ed.conf.id": "87654321-1321-69874-9456-s5123456h7",
    "ed.org.id": "12345678-1s2d-6f5d4-9632-s5d3f6g9h7",
    "ed.tag": "ed_parallel",
    "host.ip": "10.0.0.1",
    "host.name": "ED_TEST",
    "src_type": ""
  },
  "start_timestamp": 1707743460870,
  "timestamp": 1707743520870,
  "unit": "1",
  "_stat_type": "avg"
}

Notice the platform dimension is saved as an attribute. You can copy this path from the node test pane to use it elsewhere, such as to specify a Prometheus Output label.

This type of metric can be viewed using a third party metrics integration. For example, suppose you have a Prometheus output with a label configured to pull its value from the platform attribute:

- name: prometheus_exporter_output
  type: prometheus_exporter_output
  port: 8087
  labels:
  - name: platform_label
    path: item["attributes"]["platform"]

The Prometheus output creates a metric as follows:

edgedelta_dimension_numeric_capture_delivery_time_avg{container="edgedelta-agent",endpoint="prom",instance="172.18.0.2:8087",job="edgedelta-metrics",namespace="edgedelta",platform_label="mobile_app",pod="edgedelta-8pgqc",service="edgedelta-metrics"}

Note the additional label called platform_label that contains the platform attribute’s value.

The output in Grafana can then be viewed with these queries:

edgedelta_dimension_numeric_capture_delivery_time_avg{platform_label="mobile_app"}
edgedelta_dimension_numeric_capture_delivery_time_avg{platform_label="web"}
edgedelta_dimension_numeric_capture_delivery_time_count{platform_label="mobile_app"}
edgedelta_dimension_numeric_capture_delivery_time_count{platform_label="web"}

Field Path Capture

The log to metric node can be configured to extract metrics from dimensions that you specify with a field path. You can combine capture group dimensions and field path dimensions, but you can specify only one numeric dimension: either a capture group referenced by numeric_dimension, or a field path given in field_numeric_dimension, not both.
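
Before the full examples, here is a minimal sketch of that constraint (the names service and latency and the field paths are illustrative):

# Within a dimension group, capture group dimensions and field path
# dimensions can be mixed freely, but only one numeric source is allowed.
dimension_groups:
- dimensions:
  - service                          # from a regex capture group
  field_dimensions:
  - item["resource"]["host.name"]    # from a field path
  # EITHER a capture group numeric dimension:
  # numeric_dimension: latency
  # OR a field path numeric dimension (shown here), never both:
  field_numeric_dimension: json(item["body"]).details.response_time_ms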

Example 1

In this example, a Golang regex pattern is used to capture the service name as an attribute, and CEL macros are used to specify a dimension field (host name) and a numeric dimension field (latency). Only the average metric is configured.

YAML Version:

- name: field_path_dimensions
  type: log_to_metric
  pattern: '"service": "(?P<service>\w+)"'
  interval: 1m0s
  skip_empty_intervals: false
  only_report_nonzeros: false
  enabled_stats:
  - avg
  dimension_groups:
  - dimensions:
    - service
    field_dimensions:
    - item["resource"]["host.name"]
    field_numeric_dimension: json(item["body"]).details.response_time_ms

Input logs:

Suppose the following logs are fed into the pipeline:

{"timestamp": "2024-12-27T14:53:41Z", "level": "info", "service": "webserver_node18", "details": {"status_code": 200, "response_time_ms": 150, "user": {"id": "U12345", "name": "Raptor5166"}, "session": {"id": "S98765", "cart_items": 3}}}
{"timestamp": "2024-12-27T14:54:15Z", "level": "error", "service": "database_node18", "details": {"status_code": 500, "query_time_ms": 3000, "user": {"id": "Q54321", "name": "dbsys"}, "error": {"code": "DB_TIMEOUT", "message": "Query execution timed out"}}}
{"timestamp": "2024-12-27T14:55:05Z", "level": "info", "service": "authentication_node18", "details": {"status_code": 200, "auth_time_ms": 50, "user": {"id": "U67890", "name": "Condor4"}, "auth": {"method": "OAuth2"}}}
{"timestamp": "2024-12-27T14:56:30Z", "level": "warn", "service": "webserver_node18", "details": {"status_code": 404, "response_time_ms": 25, "user": {"id": "U33456", "name": "owl_a1"}, "request": {"method": "GET", "resource": "/missing-page"}}}

Output

The avg metric output is visible in the node test pane:

{
  "_type": "metric",
  "attributes": {
    "host.name": "ED_TEST",
    "service": "webserver_node18"
  },
  "gauge": {
    "value": 87.5
  },
  "kind": "gauge",
  "name": "field_path_dimensions_field_path_dimensions_log_to_metric_buffer_dg_0_num_dim.avg",
  "resource": {
    "ed.conf.id": "12345678987654321",
    "ed.filepath": "test/file/path",
    "ed.org.id": "98765432123456789",
    "ed.tag": "ed_parallel",
    "host.ip": "10.0.0.1",
    "host.name": "ED_TEST",
    "src_type": "file_input"
  },
  "start_timestamp": 1708510673693,
  "timestamp": 1708510733693,
  "unit": "1",
  "_stat_type": "avg"
}

Note: You can copy the attribute paths from the node test pane to use them elsewhere, such as to specify a Prometheus Output label.

For example, suppose you have a Prometheus output with labels configured to pull values from the attributes service and host.name:

- name: prometheus_exporter_output
  type: prometheus_exporter_output
  port: 8087
  labels:
  - name: service_label
    path: item["attributes"]["service"]
  - name: host_name_label
    path: item["attributes"]["host.name"]

The Prometheus output creates a metric as follows:

edgedelta_field_path_dimensions_field_path_dimensions_log_to_metric_buffer_dg_0_num_dim_avg{container="edgedelta-agent", endpoint="prom", host_name_label="parallelcluster-control-plane", instance="172.18.0.2:8087", job="edgedelta-metrics", namespace="edgedelta", pod="edgedelta-c2brx", service="edgedelta-metrics", service_label="webserver_node18"}

Note the additional labels called service_label and host_name_label.

The output in Grafana can then be viewed with this query, and you can switch between services and hosts using the label selector:

edgedelta_field_path_dimensions_field_path_dimensions_log_to_metric_buffer_dg_0_num_dim_avg{service_label="webserver_node18", host_name_label="parallelcluster-control-plane"}

Example 2

In this example, a catch-all pattern is used, and CEL macros are used to specify a dimension field and a numeric dimension field.

YAML Version:

- name: field_path_dimensions
  type: log_to_metric
  pattern: .*
  interval: 1m0s
  skip_empty_intervals: false
  only_report_nonzeros: false
  dimension_groups:
  - field_dimensions:
    - json(item["body"]).service
    field_numeric_dimension: json(item["body"]).details.response_time_ms

Input logs:

Suppose the following logs are fed into the pipeline:

{"timestamp": "2024-12-27T14:53:41Z", "level": "info", "service": "webserver_node18", "details": {"status_code": 200, "response_time_ms": 150, "user": {"id": "U12345", "name": "Raptor5166"}, "session": {"id": "S98765", "cart_items": 3}}}
{"timestamp": "2024-12-27T14:54:15Z", "level": "error", "service": "database_node18", "details": {"status_code": 500, "query_time_ms": 3000, "user": {"id": "Q54321", "name": "dbsys"}, "error": {"code": "DB_TIMEOUT", "message": "Query execution timed out"}}}
{"timestamp": "2024-12-27T14:55:05Z", "level": "info", "service": "authentication_node18", "details": {"status_code": 200, "auth_time_ms": 50, "user": {"id": "U67890", "name": "Condor4"}, "auth": {"method": "OAuth2"}}}
{"timestamp": "2024-12-27T14:56:30Z", "level": "warn", "service": "webserver_node18", "details": {"status_code": 404, "response_time_ms": 25, "user": {"id": "U33456", "name": "owl_a1"}, "request": {"method": "GET", "resource": "/missing-page"}}}

Output

One of the metrics outputs, visible in the node test pane, would be as follows:

{
  "_type": "metric",
  "attributes": {
    "field_path_dimensions-log-to-metric-buffer_dg_0_dim_0": "webserver_node18"
  },
  "gauge": {
    "value": 87.5
  },
  "kind": "gauge",
  "name": "field_path_dimensions_field_path_dimensions_log_to_metric_buffer_dg_0_num_dim.avg",
  "resource": {},
  "start_timestamp": 1708422058420,
  "timestamp": 1708422118420,
  "unit": "1",
  "_stat_type": "avg"
}

Note: A name for the json(item["body"]).service field was not specified, for example using a named capture group in the Golang regex pattern, so a default attribute name is generated. You can copy this field name path from the node test pane to use it elsewhere, such as to specify a Prometheus Output label.

For example, suppose you have a Prometheus output with a label configured to pull its value from the attribute name generated by the node:

- name: prometheus_exporter_output
  type: prometheus_exporter_output
  port: 8087
  labels:
  - name: service_label
    path: item["attributes"]["field_path_dimensions-log-to-metric-buffer_dg_0_dim_0"]

The Prometheus output creates a metric as follows:

edgedelta_field_path_dimensions_field_path_dimensions_log_to_metric_buffer_dg_0_num_dim_min{container="edgedelta-agent", endpoint="prom", instance="172.18.0.2:8087", job="edgedelta-metrics", namespace="edgedelta", pod="edgedelta-c2brx", service="edgedelta-metrics", service_label="webserver_node18"}

Note the additional label called service_label that contains the field_path_dimensions-log-to-metric-buffer_dg_0_dim_0 attribute’s value.

The output in Grafana can then be viewed with these queries, and you can switch between service labels:

edgedelta_field_path_dimensions_field_path_dimensions_log_to_metric_buffer_dg_0_num_dim_avg{service_label="webserver_node18"}
edgedelta_field_path_dimensions_field_path_dimensions_log_to_metric_buffer_dg_0_num_dim_count{service_label="webserver_node18"}
edgedelta_field_path_dimensions_field_path_dimensions_log_to_metric_buffer_dg_0_num_dim_max{service_label="webserver_node18"}
edgedelta_field_path_dimensions_field_path_dimensions_log_to_metric_buffer_dg_0_num_dim_min{service_label="webserver_node18"}

Required Parameters

name

A descriptive name for the node. This is the name that appears in Visual Pipelines, and you can reference the node elsewhere in the YAML using this name. It must be unique across all nodes. It is a YAML list element, so it begins with a - and a space followed by the string. It is a required parameter for all nodes.

nodes:
  - name: <node name>
    type: <node type>

type: log_to_metric

The type parameter specifies the type of node being configured. It is specified as a string from a closed list of node types. It is a required parameter.

nodes:
  - name: <node name>
    type: <node type>

pattern

The pattern parameter is used to match log items in the body field. It is specified as a Golang regex expression and can include capture groups. If one or more dimension groups are defined, the pattern should include at least one capture group. A pattern is required.

nodes:
  - name: <node name>
    type: log_to_metric
    pattern: <regex pattern>    
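
For example, a pattern with a named capture group might look like the following sketch (the node name and the latency_ms capture group are illustrative):

nodes:
  - name: request_latency
    type: log_to_metric
    # The named capture group "latency_ms" can then be referenced as a
    # dimension or numeric_dimension in a dimension group.
    pattern: 'latency=(?P<latency_ms>\d+)ms'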

Optional Parameters

dimension_groups

The dimension_groups parameter is used to group attributes for metrics. There can be one or more dimension groups. It is optional. Each group is specified with a child dimensions element, which lists the capture group names that will be used in the metric name or attributes.

Each dimension group can take a number of options that apply only to that group (a fuller sketch follows the example below):

  • custom_suffix A suffix to append to the metric name.
  • numeric_dimension Instead of counting each match as 1, the metric value is taken from the value captured for the given dimension.
  • enabled_stats Statistics to be reported. Valid options are: count, sum, avg, min, max, p25, p75, p95, p99, stddev, anomaly1, anomaly2, anomalymin. The anomalymin option takes the minimum of anomaly1 and anomaly2, which is useful for reducing alert noise.
  • histogram_limit The maximum number of histograms per reporter.
  • interval The interval at which to report metrics. The default value is 1m.
  • retention The retention period for storing reported metrics to calculate anomaly scores. The default value is 3h.

nodes:
  - name: <node name>
    type: log_to_metric
    pattern: <regex pattern>
    dimension_groups:
    - dimensions: ["service"]
      numeric_dimension: "duration"
      custom_suffix: "by_duration"
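
A fuller sketch showing per-group options (the node name, pattern, and values are illustrative):

nodes:
  - name: request_metrics
    type: log_to_metric
    pattern: 'service=(?P<service>\w+).*duration=(?P<duration>\d+)'
    dimension_groups:
    # Group 1: count occurrences per service.
    - dimensions: ["service"]
      enabled_stats: ["count"]
    # Group 2: numeric stats on the captured duration, per service,
    # reported every two minutes with a custom metric name suffix.
    - dimensions: ["service"]
      numeric_dimension: "duration"
      custom_suffix: "by_duration"
      enabled_stats: ["avg", "p95"]
      interval: 2m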

enabled_stats

The enabled_stats parameter specifies the statistics that should be reported. It is specified as a list of strings and is optional. Valid options are:

  • count - the number of instances matched.
  • sum - the sum of matched values.
  • avg - the average (mean) matching value.
  • min - the smallest matching value.
  • max - the largest matching value.
  • p25 - the 25th percentile value.
  • p75 - the 75th percentile value.
  • p95 - the 95th percentile value.
  • p99 - the 99th percentile value.
  • stddev - the standard deviation.
  • anomaly1 - the proprietary Edge Delta anomaly score 1.
  • anomaly2 - the proprietary Edge Delta anomaly score 2.
  • anomalymin - the min of anomaly1 and anomaly2. This is useful to reduce the alert noise.

The count, anomaly1 and anomaly2 metrics are generated for occurrence captures, whereas the count, min, max, avg, anomaly1 and anomaly2 metrics are generated for numeric captures.

nodes:
  - name: <node name>
    type: log_to_metric
    pattern: <regex pattern>
    enabled_stats: 
    - <statistic type>
    - <statistic type>
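
For example, to report only the count, average, and 95th percentile of a captured numeric value (the node name and pattern are illustrative, following the numeric capture example above):

nodes:
  - name: response_stats
    type: log_to_metric
    pattern: 'Response Time: (?P<response>\d+)ms'
    enabled_stats:
    - count
    - avg
    - p95
    dimension_groups:
    - numeric_dimension: response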

field_dimensions

The field_dimensions parameter points to string fields within your payload using CEL expressions or bracket notation, such as item["attributes"]["dimension"]. It is useful when working with parsed JSON data. The field_dimensions parameter can be defined alongside the dimensions that come from capture groups in the Golang regex pattern.

nodes:
  - name: <node name>
    type: log_to_metric
    pattern: <regex pattern>
    dimension_groups:
    - dimensions:
      - <capture group name>
      field_dimensions:
      - item["resource"]["field.name"]
      field_numeric_dimension: json(item["body"]).details.field

field_numeric_dimension

The field_numeric_dimension parameter defines a numeric field within your payload using a CEL expression or bracket notation, such as item["attributes"]["numeric_dimension"]. It is useful when working with parsed JSON data. You can specify either a numeric dimension or a field numeric dimension, not both.

nodes:
  - name: <node name>
    type: log_to_metric
    pattern: <regex pattern>
    dimension_groups:
    - dimensions:
      - <capture group name>
      field_dimensions:
      - item["resource"]["field.name"]
      field_numeric_dimension: json(item["body"]).details.field

interval

The interval parameter specifies the reporting interval for the statistics that the node will generate. It will collect values for the duration of the interval before calculating metrics such as the average. It is specified as a duration and the default is 1 minute. It is optional.

nodes:
  - name: <node name>
    type: log_to_metric
    pattern: <regex pattern>
    interval: 2m

metric_name

The metric_name parameter specifies a custom name for the generated metric. It is specified as a string and the default, if not specified, is to use the node name. It is optional.

nodes:
  - name: <node name>
    type: log_to_metric
    pattern: <regex pattern>
    metric_name: <custom name>

retention

The retention parameter specifies how far back to look to generate anomaly scores. A short retention period will be more sensitive to spikes in metric values. It is specified as a duration and the default is 3 hours. It is optional.

nodes:
  - name: <node name>
    type: log_to_metric
    pattern: <regex pattern>
    retention: <duration>
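
For example, anomaly stats with a shorter lookback could be configured as follows (the node name, pattern, and value are illustrative):

nodes:
  - name: error_anomaly
    type: log_to_metric
    pattern: (?i)error
    enabled_stats:
    - count
    - anomalymin
    # A shorter retention makes anomaly scores more sensitive to spikes.
    retention: 1h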

See Also

Create Metrics from Logs