MySQL Slow Query Pack

This is a MySQL slow query pack that ingests logs, extracts key fields, and structures the data for easier analysis

Edge Delta Pipeline Pack for MySQL Slow Query Log

Overview

The MySQL Slow Query pack efficiently processes MySQL slow query logs, providing detailed insights into database performance. It ingests logs, extracts key fields, and structures the data for easier analysis. By clustering log patterns, converting query execution details into metrics, and categorizing this information by database, user, and source, this pack enables comprehensive monitoring of query performance, helping to optimize database operations and identify potential bottlenecks.

Pack Description

1. Data Ingestion

The data flow starts with the compound_input node. This node serves as the entry point into the pipeline, capturing incoming MySQL slow query logs for processing.

2. Field Extraction

Logs then flow into the grok_extract_fields node, a Grok node. This node uses a custom-defined pattern to extract fields like timestamp, username, db_name, source_ip_v4, source_ip_v6, etc., structuring them as attributes. Converting unstructured log data into structured data allows for easier searching, analysis, and visualization.

3. Timestamp Transformation

The log_transform_timestamp node, a Log Transform node, updates log entries by converting the timestamp attribute to Unix Milliseconds format and setting it into the item["timestamp"] field. This transformation ensures that the original log timestamp is preserved.

  - name: log_transform_timestamp
    type: log_transform
    transformations:
      - field_path: item["timestamp"]
        operation: upsert
        value:
          convert_timestamp(item["attributes"]["timestamp"], "2006-01-02T15:04:05.999999Z",
          "Unix Milli")

4. Log Pattern Clustering

Logs are processed by the log_to_pattern node, a Log to Pattern node, to generate clusters representing common log patterns.

  - name: log_to_pattern
    type: log_to_pattern
    num_of_clusters: 10
    samples_per_cluster: 5

This node distinguishes between various log patterns, providing a structural understanding of log data which can assist in anomaly detection or refining alert mechanisms.

5. Query Time Metric Conversion

Logs are further processed by the query_time_l2m node, a Log to Metric node, to convert query_time information into metrics.

  - name: query_time_l2m
    type: log_to_metric
    pattern: .*
    interval: 1m0s
    skip_empty_intervals: false
    only_report_nonzeros: false
    metric_name: latency
    dimension_groups:
      - field_dimensions:
          - item["attributes"]["db_name"]
        field_numeric_dimension: item["attributes"]["query_time"]
        custom_suffix: by_database
        enabled_stats:
          - count
          - min
          - max
          - p95
          - p99
      - field_dimensions:
          - item["attributes"]["username"]
        field_numeric_dimension: item["attributes"]["query_time"]
        custom_suffix: by_user
        enabled_stats:
          - count
          - min
          - max
          - p95
          - p99
      - field_dimensions:
          - item["attributes"]["source_ip_v4"]
          - item["attributes"]["source_ip_v6"]
        field_numeric_dimension: item["attributes"]["query_time"]
        custom_suffix: by_source
        enabled_stats:
          - count
            - min
            - max
            - p95
            - p99

The dimensions provide insights into how queries perform across different categories:

  • By Database (db_name): Enables performance tracking of queries per database, highlighting which databases are handling slower queries.
  • By User (username): Allows you to discern whether certain users generate heavier loads on the system.
  • By Source IP (source_ip_v4, source_ip_v6): Facilitates the identification of network sources contributing to the load, helping in network-level optimizations.

6. Lock Time Metric Conversion

Similarly, the lock_time_l2m node processes lock_time data to provide metrics concerning database locking performance issues.

  - name: lock_time_l2m
    type: log_to_metric
    pattern: .*
    interval: 1m0s
    skip_empty_intervals: false
    only_report_nonzeros: false
    metric_name: query
    dimension_groups:
      - field_dimensions:
          - item["attributes"]["db_name"]
        field_numeric_dimension: item["attributes"]["lock_time"]
        custom_suffix: by_database
        enabled_stats:
          - count
          - min
          - max
          - p95
          - p99
      - field_dimensions:
          - item["attributes"]["username"]
        field_numeric_dimension: item["attributes"]["lock_time"]
        custom_suffix: by_user
        enabled_stats:
          - count
          - min
          - max
          - p95
          - p99
      - field_dimensions:
          - item["attributes"]["source_ip_v4"]
          - item["attributes"]["source_ip_v6"]
        field_numeric_dimension: item["attributes"]["lock_time"]
        custom_suffix: by_source
        enabled_stats:
          - count
          - min
          - max
          - p95
          - p99

7. Rows Sent Metric Conversion

The rows_sent_l2m node transforms rows_sent data into metrics for understanding query returns.

  - name: rows_sent_l2m
    type: log_to_metric
    pattern: .*
    interval: 1m0s
    skip_empty_intervals: false
    only_report_nonzeros: false
    metric_name: query
    dimension_groups:
      - field_dimensions:
          - item["attributes"]["db_name"]
        field_numeric_dimension: item["attributes"]["rows_sent"]
        custom_suffix: by_database
        enabled_stats:
          - count
          - sum
      - field_dimensions:
          - item["attributes"]["username"]
        field_numeric_dimension: item["attributes"]["rows_sent"]
        custom_suffix: by_user
        enabled_stats:
          - count
          - sum
      - field_dimensions:
          - item["attributes"]["source_ip_v4"]
          - item["attributes"]["source_ip_v6"]
        field_numeric_dimension: item["attributes"]["rows_sent"]
        custom_suffix: by_source
        enabled_stats:
          - count
          - sum

8. Rows Examined Metric Conversion

Similarly, the rows_examined_l2m node converts rows_examined data into metrics to evaluate query performance based on resource consumption.

  - name: rows_examined_l2m
    type: log_to_metric
    pattern: .*
    interval: 1m0s
    skip_empty_intervals: false
    only_report_nonzeros: false
    metric_name: query
    dimension_groups:
      - field_dimensions:
          - item["attributes"]["db_name"]
        field_numeric_dimension: item["attributes"]["rows_examined"]
        custom_suffix: by_database
        enabled_stats:
          - count
          - sum
      - field_dimensions:
          - item["attributes"]["username"]
        field_numeric_dimension: item["attributes"]["rows_examined"]
        custom_suffix: by_user
        enabled_stats:
          - count
          - sum
      - field_dimensions:
          - item["attributes"]["source_ip_v4"]
          - item["attributes"]["source_ip_v6"]
        field_numeric_dimension: item["attributes"]["rows_examined"]
        custom_suffix: by_source
        enabled_stats:
          - count
          - sum

9. Logging and Metrics Outputs

  • logs_output node collects all processed logs for further analysis.
  • patterns_output collects output patterns from the log_to_pattern node, ensuring structural insights into query execution operations.
  • query_time_metrics_output, lock_time_metrics_output, rows_sent_metrics_output, and rows_examined_metrics_output nodes capture metrics for monitoring query performance and operational efficiency.

Sample Input

# Time: 2024-09-17T17:58:31.649167Z
# User@Host: Quigley5177[root] @ localhost [d9f5:c1c1:9699:bff4:b8ae:91b3:7f67:8fa8] Id: 469835
# Query_time: 120.408058 Lock_time: 0.011555 Rows_sent: 268756 Rows_examined: 8550280
SELECT article, dealer, price
FROM shop
WHERE price=(SELECT MAX(price) FROM shop);
# Time: 2024-09-17T17:58:31.649201Z
# User@Host: Hartmann6021[example] @ 180.68.18.209 [26aa:ed3d:4d9d:9f42:45c9:aa7e:dea2:620c] Id: 917215
# Query_time: 54.781643 Lock_time: 0.013071 Rows_sent: 6396691 Rows_examined: 9678866
CREATE TABLE animals (
id MEDIUMINT NOT NULL AUTO_INCREMENT,
name CHAR(30) NOT NULL,
PRIMARY KEY (id)
);
INSERT INTO animals (name) VALUES
('dog'),('cat'),('penguin'),
('lax'),('whale'),('ostrich');
SELECT * FROM animals;