Edge Delta Metrics Inventory
Overview
The Metrics Inventory enables you to take stock of your metric traffic volume across all fleets and optimize your pipeline design accordingly. For example, using the Metrics Inventory you may discover a high-volume metric that you are not very interested in. You can then remove that metric from your pipelines to reduce your environment’s processing and data handling overhead.
Inventory
Click the Metrics page and select the Inventory tab to view the Metrics Inventory.
The Inventory lists all metrics handled by all your Edge Delta fleets, sorted in descending order by Data Points (the total number of metric data items for the lookback period). You can filter the list using the options in the Filter pane, including any custom facets you configured in the Metrics Explorer, or you can enter a metric search query manually.
Click a metric to see a list of all facets and all facet values associated with the metric across all Fleets.
You can click a Facet Value or a Label’s + icon to add it to or exclude it from the search query in the Metrics Inventory, or to add a custom facet for it.
Click Explore to open the current search query in the Metrics Explorer.
Analyze Metrics
You can use the DMAIC (Define, Measure, Analyze, Improve, Control) framework to analyze and optimize metrics (and your environments).
Define:
- Select Key Metrics: Determine which metrics are essential for your environment and workloads.
Measure:
- Aggregate Collected Data: Use Pipelines to consolidate metrics from various parts of your system to understand the overall performance picture.
- Use Dashboards: Implement dashboards to visualize metrics for easier analysis and real-time monitoring.
Analyze:
- Establish Baselines: Define normal ranges for your key performance indicators based on historical data.
- Monitor for Anomalies: Set up Monitor alerts for deviations from normal ranges to quickly identify potential issues in the environment.
- Analyze Usage Patterns: Study resource usage patterns to understand peak times, resource bottlenecks, and underutilized segments.
- Identify Bottlenecks: Focus on metrics that show latency, throttling, or errors.
Improve:
- Resource Optimization: Allocate resources appropriately based on the analyzed metrics.
Control:
- Regular Audits: Continually audit metrics to ensure configurations remain optimal as workloads and user demands change.
- Feedback Loops: Incorporate feedback from monitoring results into planning and executing further optimizations iteratively.
- Document Metrics and Changes: Maintain clear documentation of all monitored metrics, the reasons for choosing them, and any changes made as a result of optimizations.
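The Establish Baselines and Monitor for Anomalies steps above can be sketched in a few lines of Python. This is only an illustration of the idea, not how Edge Delta Monitors work internally: the helper names, the sample values, and the choice of a mean plus or minus three standard deviations as the "normal range" are all assumptions.

```python
from statistics import mean, stdev

def baseline(history, k=3.0):
    """Return (low, high) bounds as mean +/- k sample standard deviations.

    `history` is a list of past values for one key metric (hypothetical data).
    """
    mu = mean(history)
    sigma = stdev(history)
    return mu - k * sigma, mu + k * sigma

def anomalies(samples, bounds):
    """Return the samples that fall outside the baseline bounds."""
    low, high = bounds
    return [x for x in samples if x < low or x > high]

# Hypothetical historical values for a key performance indicator.
history = [100, 102, 98, 101, 99, 100, 103, 97]
bounds = baseline(history)

# New observations to check against the baseline.
flagged = anomalies([101, 250, 99], bounds)
print(bounds, flagged)
```

With this sample data the normal range works out to roughly 94 to 106, so only the value 250 is flagged. A wider or narrower `k` trades missed deviations against noisy alerts.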
Kubernetes Metrics Example
The ed_k8s_metric_kubelet_runtime_operations_duration_seconds.histogram metric is a high volume metric in the preceding examples. It is generated by the Kubelet and ingested by the Kubernetes Metrics source node.
Metrics like *_duration_seconds.sum, *_duration_seconds.count, and their histogram counterparts essentially describe the same phenomenon in different forms.
Depending on the granularity you need, you might choose to keep just the histogram or the sum and count. Histograms provide more detailed insights into distribution but may not be necessary if you’re only concerned with average or total durations.
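To see why sum and count are enough for averages, here is a minimal Python sketch. It assumes the two series are cumulative (Prometheus-style) counters sampled at two points in time; if your pipeline emits per-interval deltas instead, the subtraction step is unnecessary. The function names and sample values are hypothetical.

```python
def average_duration(sum_seconds, count):
    """Average duration per operation, or None if nothing was observed."""
    if count == 0:
        return None
    return sum_seconds / count

def window_average(prev, curr):
    """Average duration over the interval between two cumulative samples.

    `prev` and `curr` are (sum_seconds, count) tuples read from the
    *_duration_seconds.sum and *_duration_seconds.count series.
    """
    d_sum = curr[0] - prev[0]
    d_count = curr[1] - prev[1]
    return average_duration(d_sum, d_count)

# Hypothetical samples: 10 operations took 3 seconds in total in the window.
avg = window_average(prev=(12.0, 40), curr=(15.0, 50))
print(avg)
```

What this cannot give you is the shape of the distribution, such as tail latency. That is the case for keeping the histogram instead.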
The kubelet_runtime_operations_duration_seconds group of metrics provides insights similar to those from the cgroup_manager_duration_seconds and pod_worker_duration_seconds groups. If you’re already monitoring runtime operation durations per operation type, you might not need as much granularity on manager-specific durations unless you are troubleshooting specific issues.
Metrics like container_cpu_usage_seconds, node_cpu_seconds, process_cpu_seconds, etc., provide overlapping CPU usage information. Depending on whether you want container-level, node-level, or process-level granularity, you might not need all three sets.
Similarly, container_memory_usage_bytes and node_memory_mem_*_bytes overlap in the kind of memory insights they provide.
Network usage metrics are available both at the container level (e.g., container_network_*) and node level (e.g., node_network_*). Depending on your use case, you might not need both levels of granularity.
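One way to decide which level of granularity to keep is to total the Data Points per level. The Python sketch below assumes you have copied metric names and Data Points counts out of the Inventory by hand. The counts are invented for illustration, and treating ed_k8s_metric_ as an optional name prefix is an assumption based on the metric name shown earlier.

```python
from collections import defaultdict

PREFIX = "ed_k8s_metric_"           # assumed optional prefix on metric names
LEVELS = ("container", "node", "process")

def level_of(name):
    """Classify a metric name by its leading granularity level."""
    short = name[len(PREFIX):] if name.startswith(PREFIX) else name
    head = short.split("_", 1)[0]
    return head if head in LEVELS else "other"

def volume_by_level(inventory):
    """Sum data points per granularity level.

    `inventory` is a list of (metric_name, data_points) pairs.
    """
    totals = defaultdict(int)
    for name, points in inventory:
        totals[level_of(name)] += points
    return dict(totals)

# Hypothetical Data Points counts for metrics named in this section.
inventory = [
    ("ed_k8s_metric_container_cpu_usage_seconds", 5000),
    ("ed_k8s_metric_container_memory_usage_bytes", 2500),
    ("ed_k8s_metric_node_cpu_seconds", 1200),
    ("process_cpu_seconds", 300),
    ("ed_k8s_metric_kubelet_runtime_operations_duration_seconds.histogram", 9000),
]
totals = volume_by_level(inventory)
print(totals)
```

If one level dominates the volume and you rarely query it, that level is a reasonable first candidate for exclusion.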
Remove a Metric
Depending on the source of a metric, you can remove it in one of the following ways:
- Add an exclude parameter to the Kubernetes Metrics source node using the facet values you identified in the Inventory.
- Remove it from a Log to Metric node configuration.
- Configure the data source to not emit it.