Edge Delta Metrics List
Metrics handled by Edge Delta.
This page lists metrics in Edge Delta. For accurate interpretation and pipeline design, examine your actual data using the metrics inventory, node tests, and the debug node.
Edge Delta Agent Metrics
Metric Name | Description |
---|---|
ed.agent.cpu.milicores | Measures the CPU usage of the Edge Delta agent in millicores. This metric is useful for understanding how much CPU is being consumed by the agent, which can help identify performance bottlenecks or inefficiencies. The common name is agent_cpu_millicores.value |
ed.agent.gc.count | Represents the total number of garbage collection operations carried out by the Edge Delta agent. This metric helps in understanding the frequency of garbage collection processes. The common name is agent_gc_count.value |
ed.agent.gc.forced_count | Indicates the number of times garbage collection was manually triggered within the Edge Delta agent. This metric is useful for tracking how often forced garbage collection occurs. The common name is agent_gc_forced_count.value |
ed.agent.gc.pause_time | Refers to the duration of garbage collection pauses in milliseconds. This metric is useful for diagnosing performance issues related to the time it takes the garbage collector to complete its cycle within the Edge Delta agent. The common name is agent_gc_pause_ms.value |
ed.agent.gc.target | The garbage collection target of the Edge Delta agent. This metric helps in monitoring and managing the garbage collection process within the agent. The common name is agent_gc_target.value |
ed.agent.go.routine.value | Tracks the number of goroutines in use by the Edge Delta agent. This metric is useful for understanding how many concurrent operations or threads the agent is handling, which can indicate how efficiently the agent is managing memory. The common name is agent_num_goroutine.value |
ed.agent.memory.allocation | The memory allocation of the Edge Delta agent. This metric is used to monitor the amount of memory that the agent is currently using, which can be indicative of memory management strategies or issues. The common name is agent_mem_alloc.value |
ed.agent.memory.to_be_freed | The memory that is marked to be freed by the Edge Delta agent. This metric helps in understanding how much memory is expected to be released soon by the agent. The common name is agent_mem_to_be_freed.value |
ed.agent.memory.virtual | The memory that is reserved for the Edge Delta agent, including both the physical and swap memory that has been allocated. It provides insights into the total address space reserved by the agent. The common name is agent_mem_virtual.value |
ed.host.cpu.process_count | Tracks the number of processes currently running on the host. This provides insights into the system’s load and can indicate resource usage trends. The common name is process_count.value |
ed.host.cpu.system.average | Tracks the average percentage of CPU time consumed by system processes across all CPUs on a host. This offers an overview of the CPU resources used for system-level tasks over time. The common name is cpu_system_avg.value |
ed.host.cpu.user.average | Measures the average percentage of CPU time used by user-level processes across all CPUs on a host. This metric provides insights into overall CPU usage by applications running in user mode. The common name is cpu_user_avg.value |
ed.host.cpu#.system.average | Represents the average percentage of CPU time spent on system processes on a specific host. This metric provides insights into how much CPU time is allocated to operating system tasks over a period of time. The common name is cpu#_system_avg.value |
ed.host.cpu#.system.percent | Measures the percentage of CPU time spent on system processes on a specific host. This metric helps in understanding the proportion of CPU resources used by the operating system for various internal tasks. The common name is cpu#_system_perc.value |
ed.host.cpu#.user.average | Represents the average percentage of CPU time used by user processes on a particular host. This metric provides insights into how much CPU time is being used by applications and services running in user mode. The common name is cpu#_user_avg.value |
ed.host.cpu#.user.percent | Measures the percentage of CPU time spent on user processes on a specific host. This metric is valuable for monitoring how much CPU time is being consumed by user-level applications. The common name is cpu#_user_perc.value |
ed.host.disk.read_bytes | Measures the total number of bytes read from disk by the host. This metric provides insight into disk I/O activity. The common name is disk_read_bytes.value |
ed.host.disk.write_bytes | Captures the total number of bytes written to disk on the host. This metric provides insights into disk write activities. The common name is disk_write_bytes.value |
ed.host.memory.total | Represents the total amount of memory available on a host. This metric includes both physical and virtual memory. The common name is memory_total.value |
ed.host.memory.used.percentage | Indicates the percentage of total memory currently in use on a host. This metric helps in assessing memory utilization and identifying potential memory resource constraints. The common name is memory_used_perc.value |
ed.host.memory.used.value | Tracks the amount of memory currently used on a host. This includes memory used by all processes and the operating system. The common name is memory_used.value |
ed.host.net.read_bytes | Refers to the total number of bytes received over the network by the host. This provides insights into network activity concerning incoming data traffic. The common name is net_received_bytes.value |
ed.host.net.write_bytes | Represents the total number of bytes sent over the network by the host. This metric provides insights into network activity concerning outgoing data traffic. The common name is net_sent_bytes.value |
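
Each metric in the table above has both a dotted name and a common name, so a small lookup can help when a query or dashboard uses one form and you need the other. The following is a minimal sketch in Python: the dictionary copies a few pairs from the table, and the `to_ed_name` helper is hypothetical and not part of any Edge Delta API.

```python
# Pairs copied from the table above (common name -> Edge Delta metric name).
COMMON_TO_ED = {
    "agent_cpu_millicores.value": "ed.agent.cpu.milicores",
    "agent_gc_count.value": "ed.agent.gc.count",
    "agent_mem_alloc.value": "ed.agent.memory.allocation",
    "cpu_user_avg.value": "ed.host.cpu.user.average",
    "memory_used_perc.value": "ed.host.memory.used.percentage",
    "net_received_bytes.value": "ed.host.net.read_bytes",
}


def to_ed_name(common_name: str) -> str:
    """Hypothetical helper: return the dotted name, or the input unchanged if unknown."""
    return COMMON_TO_ED.get(common_name, common_name)


print(to_ed_name("net_received_bytes.value"))
```

Returning unknown names unchanged keeps the helper safe to apply to a mixed list of names; extend the dictionary with whichever rows you actually use.
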
Edge Delta Pipeline Metrics
Metric Name | Description |
---|---|
ed.pipeline.l2m.log_threshold | Relates to monitoring log thresholds in a pipeline. This metric helps evaluate log counts across all agents. The common name is log_threshold_monitor_metric |
ed.pipeline.node.read_bytes | Measures the total bytes read by a particular node in the pipeline. This provides insights into the data ingestion volume handled by that node. The common name is in_bytes |
ed.pipeline.node.read_items | Tracks the number of items or records read by a specific node within the pipeline. This provides insights into the volume of data processed by the node in terms of item count. The common name is in_items |
ed.pipeline.node.write_bytes | Records the total number of bytes written by a specific node in the pipeline. This metric helps track data egress handled by that node in terms of bytes outputted. The common name is out_bytes |
ed.pipeline.node.write_items | Tracks the number of items or records written by a specific node in the pipeline. This metric provides insight into the volume of data outputted by the node in terms of item count. The common name is out_items |
ed.pipeline.raw_write_bytes | Captures the total amount of raw byte data outputted by the pipeline. This metric provides insight into the volume of raw data transmitted. The common name is outgoing_raw_bytes.sum |
ed.pipeline.read_bytes | Measures the total amount of incoming bytes processed by the pipeline. This provides an overview of the data ingress in terms of byte volume. The common name is incoming_bytes.sum |
ed.pipeline.read_items | Indicates the number of lines processed by the pipeline. This metric helps in measuring data ingress in terms of line count. The common name is incoming_lines.count |
ed.pipeline.uncompressed_write_bytes | Measures the total number of uncompressed bytes outputted by the pipeline. This metric is used to assess the volume of data transmitted without any compression applied. The common name is outgoing_uncompressed_bytes.sum |
ed.pipeline.write_bytes | Tracks the total number of bytes written or outputted by the pipeline. This metric helps in understanding the total data egress in terms of byte volume. The common name is outgoing_bytes.sum |
ed.pipeline.write_items | Represents the count of lines or items that have been outputted by the pipeline. This metric is useful for understanding the data egress in terms of item count. The common name is outgoing_lines.count |
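
One way to use the ingress and egress byte metrics together is to estimate how much data volume a pipeline removes. The sketch below is illustrative only: it assumes both totals cover the same time interval and that comparing ed.pipeline.read_bytes with ed.pipeline.write_bytes is meaningful for your pipeline. The `reduction_ratio` function and the sample values are hypothetical.

```python
def reduction_ratio(read_bytes: float, write_bytes: float) -> float:
    """Fraction of ingested bytes that was not written out (0.0 if nothing was read)."""
    if read_bytes <= 0:
        return 0.0
    return 1.0 - (write_bytes / read_bytes)


# Hypothetical totals for one interval, not real measurements.
sample = {
    "ed.pipeline.read_bytes": 2_000_000,
    "ed.pipeline.write_bytes": 500_000,
}

ratio = reduction_ratio(
    sample["ed.pipeline.read_bytes"],
    sample["ed.pipeline.write_bytes"],
)
print(f"{ratio:.0%}")
```

The same calculation can be applied per node with ed.pipeline.node.read_bytes and ed.pipeline.node.write_bytes, under the same assumptions.
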
Kubernetes Metrics
By default, the Metrics Source node scrapes kube_state_metrics. As of v1.27.0, kubelet, cadvisor, and node_exporter metrics are excluded by default. You can remove them from the exclude list if you want to include them.
cAdvisor Metrics
Metric Name | Description |
---|---|
k8s.container.blkio.device.usage.value | Indicates the total Block I/O utilization per device for containers over a period of time. The common name is container_blkio_device_usage_total |
k8s.container.cpu.cfs_periods.value | Denotes the number of control group CFS (Completely Fair Scheduler) periods consumed. The common name is container_cpu_cfs_periods_total |
k8s.container.cpu.cfs_throttled_periods.value | Depicts the total CFS throttling periods indicating limits hit by CPU throttling. The common name is container_cpu_cfs_throttled_periods_total |
k8s.container.cpu.cfs_throttled_seconds.value | Displays the total seconds a container’s CPU usage was throttled denoting CPU limit enforcement by CFS. The common name is container_cpu_cfs_throttled_seconds_total |
k8s.container.cpu.load_average_10s.value | Reveals the container CPU load average measured over a period of 10 seconds. The common name is container_cpu_load_average_10s |
k8s.container.cpu.schedstat_run_periods.value | Measures total kernel scheduling statistics run periods. The common name is container_cpu_schedstat_run_periods_total |
k8s.container.cpu.schedstat_run_seconds.value | Reports the total seconds of container execution in designated run states. The common name is container_cpu_schedstat_run_seconds_total |
k8s.container.cpu.schedstat_runqueue_seconds.value | Reports the total seconds spent on the run queue by container processes. The common name is container_cpu_schedstat_runqueue_seconds_total |
k8s.container.cpu.system_seconds.value | Accounts for the total system CPU time used by container processes in seconds. The common name is container_cpu_system_seconds_total |
k8s.container.cpu.usage_seconds.rate | Measures the rate of CPU seconds used by containers. This metric is crucial for monitoring container CPU usage within the Kubernetes environment. Previously known as ed_k8s_metric_container_cpu_usage_seconds.rate |
k8s.container.cpu.usage_seconds.value | Captures the total CPU time consumed by a container in seconds. This metric is crucial for monitoring CPU usage and performance of containers. Previously known as ed_k8s_metric_container_cpu_usage_seconds.value and the common name is container_cpu_usage_seconds_total |
k8s.container.cpu.user_seconds.value | Logs the total user CPU time consumed by container processes in seconds. The common name is container_cpu_user_seconds_total |
k8s.container.file_descriptors.value | Records the total number of open file descriptors used by processes in a container. The common name is container_file_descriptors |
k8s.container.fs.inodes_free.value | Illustrates the number of free inodes available for a container’s filesystem. The common name is container_fs_inodes_free |
k8s.container.fs.inodes.value | Represents the total inode capacity of a container’s filesystem. The common name is container_fs_inodes_total |
k8s.container.fs.io_current.value | Indicates the number of I/O operations being processed simultaneously for container’s filesystem. The common name is container_fs_io_current |
k8s.container.fs.io_time_seconds.value | Logs the total time spent carrying out I/O operations in seconds by the container’s filesystem. The common name is container_fs_io_time_seconds_total |
k8s.container.fs.io_time_weighted_seconds.value | Records the weighted time for I/O operations performed by the container’s filesystem in seconds. The common name is container_fs_io_time_weighted_seconds_total |
k8s.container.fs.limit_bytes.value | Specifies the total storage capacity available to a container’s filesystem in bytes. The common name is container_fs_limit_bytes |
k8s.container.fs.read_seconds.value | Accounts for the total seconds spent by the container performing read operations. The common name is container_fs_read_seconds_total |
k8s.container.fs.reads_bytes.value | Measures the total number of read bytes performed by the container. The common name is container_fs_reads_bytes_total |
k8s.container.fs.reads_merged.value | Records the total read requests merged into a single larger I/O request. The common name is container_fs_reads_merged_total |
k8s.container.fs.reads.value | Captures the total read operations performed by the container’s filesystem. The common name is container_fs_reads_total |
k8s.container.fs.sector_reads.value | Reports the total number of sectors read by a container’s filesystem. The common name is container_fs_sector_reads_total |
k8s.container.fs.sector_writes.value | Represents the total sectors written by a container’s filesystem. The common name is container_fs_sector_writes_total |
k8s.container.fs.usage_bytes.value | Denotes the total bytes used by a container’s filesystem. The common name is container_fs_usage_bytes |
k8s.container.fs.write_seconds.value | Logs the total time in seconds spent by write operations on the container’s filesystem. The common name is container_fs_write_seconds_total |
k8s.container.fs.writes_bytes.value | Indicates the total number of bytes written by the container’s filesystem. The common name is container_fs_writes_bytes_total |
k8s.container.fs.writes_merged.value | Counts the total write requests merged into a single larger I/O operation. The common name is container_fs_writes_merged_total |
k8s.container.fs.writes.value | Records the total write operations committed by the container’s filesystem. The common name is container_fs_writes_total |
k8s.container.hugetlb_failcnt.value | Tracks the fail count of hugetlb pages by the container indicating unsuccessful allocations. The common name is container_hugetlb_failcnt |
k8s.container.hugetlb_max_usage_bytes.value | Captures the peak huge pages memory used by the container in bytes. The common name is container_hugetlb_max_usage_bytes |
k8s.container.hugetlb_usage_bytes.value | Measures the current huge pages memory usage by the container in bytes. The common name is container_hugetlb_usage_bytes |
k8s.container.last_seen.value | Denotes the last recorded interaction or monitoring activity with the container. The common name is container_last_seen |
k8s.container.llc_occupancy_bytes.value | Reports the occupancy in bytes of the Last Level Cache (LLC) by the container processes. The common name is container_llc_occupancy_bytes |
k8s.container.memory.bandwidth_bytes.value | Refers to the total memory bandwidth utilized by a container measured in bytes. The common name is container_memory_bandwidth_bytes |
k8s.container.memory.bandwidth_local_bytes.value | Indicates the memory bandwidth used by the container within local NUMA nodes in bytes. The common name is container_memory_bandwidth_local_bytes |
k8s.container.memory.cache.value | Shows the cached memory in bytes used by the container which does not consume excessive swap space. The common name is container_memory_cache |
k8s.container.memory.failcnt.value | Logs the number of times memory allocations failed inside a container. The common name is container_memory_failcnt |
k8s.container.memory.failures.value | Logs the cumulative count of memory allocation failures for the container. The common name is container_memory_failures_total |
k8s.container.memory.mapped_file.value | Reports the memory occupied by files mapped into the container’s address space. The common name is container_memory_mapped_file |
k8s.container.memory.max_usage_bytes.value | Captures the maximum memory usage by a container in bytes during its lifecycle. The common name is container_memory_max_usage_bytes |
k8s.container.memory.migrate.value | Indicates the total pages migrated in containers due to memory imbalance. The common name is container_memory_migrate |
k8s.container.memory.numa_pages.value | Shows the number of NUMA pages in use for memory-intensive processes. The common name is container_memory_numa_pages |
k8s.container.memory.rss.value | Denotes the Resident Set Size representing the non-swappable physical memory consumed by a container. The common name is container_memory_rss |
k8s.container.memory.swap.value | Measures the swap space occupied by the container to store overflow data from its RAM. The common name is container_memory_swap |
k8s.container.memory.usage_bytes.value | Measures the memory usage in bytes by a container. This is essential for tracking container memory consumption within Kubernetes environments. Previously known as ed_k8s_metric_container_memory_usage_bytes.value and the common name is container_memory_usage_bytes |
k8s.container.memory.working_set_bytes.value | Represents the current memory work set of the container in bytes indicating actively used memory. The common name is container_memory_working_set_bytes |
k8s.container.network_advance_tcp_stats.value | Captures advanced TCP statistics for network analysis in container environments. The common name is container_network_advance_tcp_stats_total |
k8s.container.network.receive_bytes.rate | Measures the rate at which bytes are received over the network by a container. This metric helps to understand the network input activity for containers. Previously known as ed_k8s_metric_container_network_receive_bytes.rate |
k8s.container.network.receive_bytes.value | Measures the total bytes received over the network by a container. This metric is essential for monitoring network input traffic to containers. Previously known as ed_k8s_metric_container_network_receive_bytes.value and the common name is container_network_receive_bytes_total |
k8s.container.network.receive_errors.rate | Captures the rate of errors encountered while receiving network data by the container. This metric provides insights into the network reliability and error frequency for received data packets. Previously known as ed_k8s_metric_container_network_receive_errors.rate |
k8s.container.network.receive_errors.value | Tracks the total number of errors encountered while receiving network data by the container. This metric helps in evaluating the reliability of network traffic received. Previously known as ed_k8s_metric_container_network_receive_errors.value and the common name is container_network_receive_errors_total |
k8s.container.network.receive_packets_dropped.value | Logs the number of dropped inbound network packets in containers. The common name is container_network_receive_packets_dropped_total |
k8s.container.network.receive_packets.value | Tracks the total skb packets received by the container over its network interfaces. The common name is container_network_receive_packets_total |
k8s.container.network.tcp_usage.value | Records TCP usage statistics such as established connections within the container. The common name is container_network_tcp_usage_total |
k8s.container.network.tcp6_usage.value | Monitors the use of TCP over IPv6 in the container environment. The common name is container_network_tcp6_usage_total |
k8s.container.network.transmit_bytes.rate | Measures the rate at which bytes are transmitted from the container over the network. This helps monitor network output activity for containers. Previously known as ed_k8s_metric_container_network_transmit_bytes.rate |
k8s.container.network.transmit_bytes.value | Measures the total number of bytes transmitted from a container over the network. This metric is important for assessing the network output traffic of containers. Previously known as ed_k8s_metric_container_network_transmit_bytes.value and the common name is container_network_transmit_bytes_total |
k8s.container.network.transmit_errors.rate | Measures the rate of errors occurring when a container transmits data over the network. This metric helps in identifying reliability issues with container network transmissions. Previously known as ed_k8s_metric_container_network_transmit_errors.rate |
k8s.container.network.transmit_errors.value | Tracks the total number of errors encountered while transmitting network data from a container. This metric helps evaluate the reliability of network outputs. Previously known as ed_k8s_metric_container_network_transmit_errors.value and the common name is container_network_transmit_errors_total |
k8s.container.network.transmit_packets_dropped.value | Indicates the count of dropped outbound network packets. The common name is container_network_transmit_packets_dropped_total |
k8s.container.network.transmit_packets.value | Reflects the total number of packets transmitted from the container. The common name is container_network_transmit_packets_total |
k8s.container.network.udp_usage.value | Comprises the UDP protocol utilization data within container environments. The common name is container_network_udp_usage_total |
k8s.container.network.udp6_usage.value | Audits the usage of UDP over IPv6 networks inside containers. The common name is container_network_udp6_usage_total |
k8s.container.oom_events.value | Records the total number of Out-Of-Memory (OOM) events that a container has encountered. The common name is container_oom_events_total |
k8s.container.perf_events_scaling_ratio.value | Represents the scaling ratio applied to performance events to account for differences in hardware performance counters. The common name is container_perf_events_scaling_ratio |
k8s.container.perf_events.value | Captures the total performance events such as CPU cycles or instructions that have occurred in the container. The common name is container_perf_events_total |
k8s.container.perf_uncore_events_scaling_ratio.value | Indicates the scaling ratio for uncore performance events which are events associated with non-core parts like memory controllers or interconnects. The common name is container_perf_uncore_events_scaling_ratio |
k8s.container.perf_uncore_events.value | Counts the total uncore performance events which are activities measured by performance counters of the uncore components within a container. The common name is container_perf_uncore_events_total |
k8s.container.processes.value | Reflects the current number of processes running within a container. The common name is container_processes |
k8s.container.referenced_bytes.value | Shows the number of memory bytes referenced by the container that is actively being used by processes. The common name is container_referenced_bytes |
k8s.container.sockets.value | Measures the total number of socket connections currently open inside the container. The common name is container_sockets |
k8s.container.spec_cpu_period.value | Specifies the time period in microseconds for how regular CPU quota enforcement happens in the container. The common name is container_spec_cpu_period |
k8s.container.spec_cpu_quota.value | Defines the total allowed CPU time measured in microseconds per CPU period for a container. The common name is container_spec_cpu_quota |
k8s.container.spec_cpu_shares.value | Measures the CPU shares allocated to the container which affects its scheduling priority. The common name is container_spec_cpu_shares |
k8s.container.spec_memory_limit_bytes.value | Specifies the maximum memory limit set for a container in bytes. The common name is container_spec_memory_limit_bytes |
k8s.container.spec_memory_reservation_limit_bytes.value | Indicates the reserved memory limit for a container to ensure availability close to this resource allocation level in bytes. The common name is container_spec_memory_reservation_limit_bytes |
k8s.container.spec_memory_swap_limit_bytes.value | Denotes the swap memory limit configured for a container in bytes including RAM and swap space. The common name is container_spec_memory_swap_limit_bytes |
k8s.container.start_time_seconds.value | Marks the start time of the container in epoch seconds, from which its running time can be derived. The common name is container_start_time_seconds |
k8s.container.tasks_state.value | Shows the current state of task structures which are used for task scheduling operations inside the container. The common name is container_tasks_state |
k8s.container.threads_max.value | Highlights the maximum thread count that a container can execute based on resource limitations. The common name is container_threads_max |
k8s.container.threads.value | Indicates the current number of active threads operating inside a container reflecting concurrency. The common name is container_threads |
k8s.container.ulimits_soft.value | Refers to the soft ulimits that control resource limits applied to the processes within a container, such as open files and user processes. The common name is container_ulimits_soft |
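
Several of the cAdvisor metrics above are cumulative counters that are most useful in combination. The sketch below shows two common derivations under stated assumptions: the share of CFS periods in which a container was throttled (from the two cfs period counters, using the difference between two samples), and the CPU limit in cores (quota divided by period). The function names and sample values are hypothetical, and how you retrieve the samples depends on your pipeline.

```python
from typing import Optional

PERIODS = "k8s.container.cpu.cfs_periods.value"
THROTTLED = "k8s.container.cpu.cfs_throttled_periods.value"


def counter_delta(curr: float, prev: float) -> float:
    """Increase of a cumulative counter; a drop is treated as a reset to zero."""
    return curr - prev if curr >= prev else curr


def throttle_ratio(prev: dict, curr: dict) -> float:
    """Share of CFS periods in which the container was throttled between two samples."""
    periods = counter_delta(curr[PERIODS], prev[PERIODS])
    throttled = counter_delta(curr[THROTTLED], prev[THROTTLED])
    return throttled / periods if periods > 0 else 0.0


def cpu_limit_cores(quota_us: float, period_us: float) -> Optional[float]:
    """CPU limit in cores, or None when no positive quota and period are set."""
    if quota_us <= 0 or period_us <= 0:
        return None
    return quota_us / period_us


# Hypothetical samples taken at two points in time, not real measurements.
earlier = {PERIODS: 1000, THROTTLED: 50}
later = {PERIODS: 1600, THROTTLED: 200}

print(throttle_ratio(earlier, later))
print(cpu_limit_cores(50_000, 100_000))
```

Working from deltas rather than raw totals keeps the ratio tied to a recent window instead of the container's whole lifetime.
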
KSM Metrics
Metric Name | Description |
---|---|
k8s.ksm.cronjob_info.value | Indicates the presence of a Kubernetes cron job. This metric is crucial for counting the number of cron jobs defined within the Kubernetes cluster. Previously known as ed_k8s_metric_kube_cronjob_info.value |
k8s.ksm.daemonset_metadata_generation.value | Provides the generation number for a Kubernetes DaemonSet, which reflects the version of its desired state specification. This is useful for tracking updates and consistency within DaemonSet specifications. Previously known as ed_k8s_metric_kube_daemonset_metadata_generation.value |
k8s.ksm.deployment_metadata_generation.value | Indicates the metadata generation number for a Kubernetes Deployment. This metric helps track the version of the deployment’s configuration, which can be useful for detecting configuration changes and ensuring deployment consistency. Previously known as ed_k8s_metric_kube_deployment_metadata_generation.value |
k8s.ksm.deployment.status_replicas_available.value | Indicates the number of available replicas for a Kubernetes Deployment. This metric helps in monitoring the deployment to ensure the desired number of pods are up and running successfully. Previously known as ed_k8s_metric_kube_deployment_status_replicas_available.value |
k8s.ksm.job_info.value | Provides the count of Kubernetes jobs. This metric is important for tracking the number of jobs running in the Kubernetes cluster. Previously known as ed_k8s_metric_kube_job_info.value |
k8s.ksm.namespace.status_phase.value | Indicates the phase of a Kubernetes namespace. This metric helps in determining the operational status of namespaces within a Kubernetes cluster. Previously known as ed_k8s_metric_kube_namespace_status_phase.value |
k8s.ksm.node.info.value | Provides the count of Kubernetes nodes. This metric is useful for tracking the total number of nodes present in the Kubernetes cluster. Previously known as ed_k8s_metric_kube_node_info.value |
k8s.ksm.pod.container_resource_limits_cpu.value | Indicates the CPU resource limit set for a container within a Kubernetes pod. This metric helps in managing and enforcing the maximum CPU resources that the container can utilize. Previously known as ed_k8s_metric_kube_pod_container_resource_limits_cpu.value |
k8s.ksm.pod.container_resource_limits_memory.value | Indicates the memory resource limit set for a container within a Kubernetes pod. This metric helps in managing and enforcing the maximum memory resources that the container can utilize. Previously known as ed_k8s_metric_kube_pod_container_resource_limits_memory.value |
k8s.ksm.pod.container_resource_requests_cpu.value | Indicates the CPU resources requested for a container within a Kubernetes pod. This metric assists in monitoring and ensuring the minimum CPU resources that the container is expected to utilize are adequately set. Previously known as ed_k8s_metric_kube_pod_container_resource_requests_cpu.value |
k8s.ksm.pod.container_resource_requests_memory.value | Indicates the memory resources requested for a container within a Kubernetes pod. This metric is important for monitoring and ensuring the minimum memory resources expected for container operation are available. Previously known as ed_k8s_metric_kube_pod_container_resource_requests_memory.value |
k8s.ksm.pod.container_status_restarts.value | Indicates the number of container restarts per container. |
k8s.ksm.pod.container_status_running.value | Indicates whether containers within Kubernetes pods are running. This metric helps ensure that the expected number of containers are actively running in the cluster. Previously known as ed_k8s_metric_kube_pod_container_status_running.value |
k8s.ksm.pod.container_status_terminated.value | Indicates whether containers within Kubernetes pods are terminated. This metric helps track the number of containers that have completed execution or stopped running in the cluster. Previously known as ed_k8s_metric_kube_pod_container_status_terminated.value |
k8s.ksm.pod.container_status_waiting.value | Indicates the number of containers within Kubernetes pods that are in a waiting state. This metric is important for understanding how many containers are unable to progress to a running state, which could suggest resource bottlenecks or configuration issues. Previously known as ed_k8s_metric_kube_pod_container_status_waiting.value |
k8s.ksm.pod.status_phase.value | Indicates the different phases of Kubernetes pods, which include Pending, Running, Succeeded, Failed, and Unknown. This metric is valuable for understanding the lifecycle stages of pods within the Kubernetes cluster. Previously known as ed_k8s_metric_kube_pod_status_phase.value |
k8s.ksm.statefulset_metadata_generation.value | Reflects the generation number of a Kubernetes StatefulSet’s metadata. This is used to track updates and changes to the StatefulSet’s configuration in the cluster. Previously known as ed_k8s_metric_kube_statefulset_metadata_generation.value |
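
In kube-state-metrics, the pod phase metric is exposed as one series per pod and phase, with a value of 1 for the phase the pod is currently in and 0 otherwise. Counting pods per phase therefore means counting the series whose value is 1. The sketch below illustrates this in Python; the sample records and their `pod`, `phase`, and `value` keys are assumptions made for the example and may not match how attributes appear in your data.

```python
from collections import Counter

# Hypothetical samples of k8s.ksm.pod.status_phase.value; keys are assumed.
samples = [
    {"pod": "api-1", "phase": "Running", "value": 1},
    {"pod": "api-1", "phase": "Pending", "value": 0},
    {"pod": "job-7", "phase": "Succeeded", "value": 1},
    {"pod": "web-2", "phase": "Running", "value": 1},
]


def pods_per_phase(records):
    """Count pods by phase, keeping only series that mark the current phase (value 1)."""
    return Counter(r["phase"] for r in records if r["value"] == 1)


counts = pods_per_phase(samples)
print(dict(counts))
```

The same pattern applies to other state-style metrics in this table, such as the container status metrics, assuming they follow the same 0 or 1 convention in your data.
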
Kubelet Metrics
Metric Name | Description |
---|---|
k8s.kubelet.active_pods.value | Measures the number of active pods that are currently registered and running on the kubelet in any node. The common name is kubelet_active_pods |
k8s.kubelet.admission_rejections.value | Tallies the number of pod admission rejections by the kubelet due to various policy or resource constraints. The common name is kubelet_admission_rejections_total |
k8s.kubelet.certificate_manager.client_expiration_renew_errors.value | Logs errors occurring during the renewal of client certificates by the kubelet’s certificate manager. The common name is kubelet_certificate_manager_client_expiration_renew_errors |
k8s.kubelet.certificate_manager.client_ttl_seconds.value | Measures the remaining lifetime of client certificates managed by the kubelet before they expire. The common name is kubelet_certificate_manager_client_ttl_seconds |
k8s.kubelet.certificate_manager.server_ttl_seconds.value | Captures the time-to-live for server certificates managed by the kubelet indicating when they need renewal. The common name is kubelet_certificate_manager_server_ttl_seconds |
k8s.kubelet.cgroup.manager_duration_seconds.value | Reports the duration of time spent by the kubelet’s cgroup manager in managing cgroups operations. The common name is kubelet_cgroup_manager_duration_seconds |
k8s.kubelet.cgroup.version.value | Denotes the version of the cgroup used by the kubelet for resource isolation and control. The common name is kubelet_cgroup_version |
k8s.kubelet.container.aligned_compute_resources.value | Represents the number of containers aligned to compute resource limits for efficient operation. The common name is kubelet_container_aligned_compute_resources_count |
k8s.kubelet.container.log_filesystem_used_bytes.value | Indicates the total bytes used by container log filesystem on the node managed by the kubelet. The common name is kubelet_container_log_filesystem_used_bytes |
k8s.kubelet.cpu_manager.exclusive_cpu_allocation.value | Counts the number of CPUs exclusively allocated to containers by the kubelet’s CPU manager. The common name is kubelet_cpu_manager_exclusive_cpu_allocation_count |
k8s.kubelet.cpu_manager.pinning_errors.value | Tracks errors encountered by the CPU manager within kubelet during CPU pinning activities. The common name is kubelet_cpu_manager_pinning_errors_total |
k8s.kubelet.cpu_manager.pinning_requests.value | Records the total number of CPU pinning requests managed by kubelet’s CPU manager. The common name is kubelet_cpu_manager_pinning_requests_total |
k8s.kubelet.cpu_manager.shared_pool_size_millicores.value | Denotes the size of the shared CPU pool in millicores as managed by the kubelet’s CPU manager. The common name is kubelet_cpu_manager_shared_pool_size_millicores |
k8s.kubelet.credential_provider_plugin_errors.value | Counts the number of errors encountered by the credential provider plugin within kubelet. The common name is kubelet_credential_provider_plugin_errors |
k8s.kubelet.desired_pods.value | Indicates the number of pods desired to be running on the kubelet as specified by the pod scheduler. The common name is kubelet_desired_pods |
k8s.kubelet.device_plugin.registration.value | Captures the total registrations of device plugins with the kubelet, indicating successful device discovery and resource advertisement. The common name is kubelet_device_plugin_registration_total |
k8s.kubelet.evented_pleg.connection_error.value | Logs the number of connection errors encountered by kubelet’s evented Pod Lifecycle Event Generator (PLEG). The common name is kubelet_evented_pleg_connection_error_count |
k8s.kubelet.evented_pleg.connection_success.value | Reports the number of successful connections made by kubelet’s evented PLEG. The common name is kubelet_evented_pleg_connection_success_count |
k8s.kubelet.evictions.value | Reflects the total number of pod evictions performed by kubelet for various reasons such as resource constraints or policy violations. The common name is kubelet_evictions |
k8s.kubelet.graceful_shutdown.end_time_seconds.value | Marks the timestamp at which the last graceful shutdown sequence concluded. The common name is kubelet_graceful_shutdown_end_time_seconds |
k8s.kubelet.graceful_shutdown.start_time_seconds.value | Marks the timestamp at which a graceful shutdown sequence started, indicating the onset of shutdown procedures. The common name is kubelet_graceful_shutdown_start_time_seconds |
k8s.kubelet.http.inflight_requests.value | Measures the number of HTTP requests that are currently being processed by the kubelet. The common name is kubelet_http_inflight_requests |
k8s.kubelet.http.requests.value | Records the total number of HTTP requests made to the kubelet. The common name is kubelet_http_requests_total |
k8s.kubelet.image_garbage_collected.value | Indicates the total number of image garbage collections performed by the kubelet, which remove unused images to free space. The common name is kubelet_image_garbage_collected_total |
k8s.kubelet.lifecycle_handler_http_fallbacks.value | Counts the fallbacks to HTTP lifecycle handlers when gRPC handlers are not available. The common name is kubelet_lifecycle_handler_http_fallbacks_total |
k8s.kubelet.managed_ephemeral_containers.value | Shows the number of ephemeral containers managed by the kubelet. Ephemeral containers are typically used for debugging. The common name is kubelet_managed_ephemeral_containers |
k8s.kubelet.memory_manager.pinning_errors.value | Logs the total memory pinning errors encountered by the memory manager within the kubelet. The common name is kubelet_memory_manager_pinning_errors_total |
k8s.kubelet.memory_manager.pinning_requests.value | Represents the number of memory pinning requests processed by the kubelet’s memory manager. The common name is kubelet_memory_manager_pinning_requests_total |
k8s.kubelet.mirror_pods.value | Reports the count of mirror pods, which are static pods mirrored and managed by the kubelet. The common name is kubelet_mirror_pods |
k8s.kubelet.node.name | Specifies the name of the node in the cluster as identified by the kubelet. The common name is kubelet_node_name |
k8s.kubelet.node.startup_duration_seconds.value | Tracks the duration in seconds taken for a node to startup and become operational. The common name is kubelet_node_startup_duration_seconds |
k8s.kubelet.node.startup_post_registration_duration_seconds.value | Measures the duration in seconds taken for post-registration startup activities on a node managed by the kubelet. The common name is kubelet_node_startup_post_registration_duration_seconds |
k8s.kubelet.node.startup_pre_kubelet_duration_seconds.value | Captures the time spent in pre-kubelet startup activities required before starting the kubelet. The common name is kubelet_node_startup_pre_kubelet_duration_seconds |
k8s.kubelet.node.startup_pre_registration_duration_seconds.value | Records the time in seconds allocated to pre-registration setup prior to a node registering with the cluster. The common name is kubelet_node_startup_pre_registration_duration_seconds |
k8s.kubelet.node.startup_registration_duration_seconds.value | Indicates the duration for the registration process of a node within a cluster managed by kubelet. The common name is kubelet_node_startup_registration_duration_seconds |
k8s.kubelet.orphan_pod.cleaned_volumes_errors.value | Logs errors encountered during the cleanup process of volumes associated with orphaned pods. The common name is kubelet_orphan_pod_cleaned_volumes_errors |
k8s.kubelet.orphan_pod.cleaned_volumes.value | Measures the number of volumes left behind by orphaned pods that were successfully cleaned up. The common name is kubelet_orphan_pod_cleaned_volumes |
k8s.kubelet.orphan_pod.runtime_pods.value | Counts the total orphaned runtime pods found and handled by kubelet. The common name is kubelet_orphaned_runtime_pods_total |
k8s.kubelet.pleg.discard_events.value | Tracks the events discarded by the Pod Lifecycle Event Generator (PLEG) in the kubelet. The common name is kubelet_pleg_discard_events |
k8s.kubelet.pleg.last_seen_seconds.value | Denotes the timestamp in seconds at which the PLEG was last seen active. The common name is kubelet_pleg_last_seen_seconds |
k8s.kubelet.pod.resources_endpoint_errors_get_allocatable.value | Captures errors on requests for allocatable resources made via the pod resources endpoint. The common name is kubelet_pod_resources_endpoint_errors_get_allocatable |
k8s.kubelet.pod.resources_endpoint_errors_get.value | Logs the number of errors encountered on get requests to the pod resources endpoint. The common name is kubelet_pod_resources_endpoint_errors_get |
k8s.kubelet.pod.resources_endpoint_errors_list.value | Captures errors during listing operations by the pod resources endpoint. The common name is kubelet_pod_resources_endpoint_errors_list |
k8s.kubelet.pod.resources_endpoint_requests_get_allocatable.value | Counts requests for allocatable resources made via the pod resources endpoint. The common name is kubelet_pod_resources_endpoint_requests_get_allocatable |
k8s.kubelet.pod.resources_endpoint_requests_get.value | Registers the total requests received at the pod resources endpoint when fetching specific resources. The common name is kubelet_pod_resources_endpoint_requests_get |
k8s.kubelet.pod.resources_endpoint_requests_list.value | Monitors the number of list requests received by the pod resources endpoint. The common name is kubelet_pod_resources_endpoint_requests_list |
k8s.kubelet.pod.resources_endpoint_requests.value | Represents the total number of requests handled by the pod resources endpoint of kubelet. The common name is kubelet_pod_resources_endpoint_requests_total |
k8s.kubelet.preemptions.value | Counts instances where the kubelet preempts lower-priority pods to free resources for higher-priority ones. The common name is kubelet_preemptions |
k8s.kubelet.process.cpu_seconds.rate | Measures the rate of CPU time consumed by the Kubelet process. This metric is essential for monitoring the CPU utilization of the Kubelet service. Previously known as ed_k8s_metric_process_cpu_seconds.rate |
k8s.kubelet.process.cpu_seconds.value | Measures the total CPU seconds consumed by the Kubelet process. This metric provides insight into the overall CPU time that the Kubelet service has utilized. Previously known as ed_k8s_metric_process_cpu_seconds.value and the common name is process_cpu_seconds_total |
k8s.kubelet.process.resident_memory_bytes.value | Measures the resident memory bytes used by the Kubelet process. This provides insights into the current memory usage footprint of the Kubelet service. Previously known as ed_k8s_metric_process_resident_memory_bytes.value and the common name is process_resident_memory_bytes |
k8s.kubelet.rest_client.requests.rate | Captures the rate of requests made by the Kubelet’s REST client. This helps in monitoring the request load handled by the Kubelet. Previously known as ed_k8s_metric_rest_client_requests.rate |
k8s.kubelet.rest_client.requests.value | Represents the total number of requests made by the Kubelet’s REST client. This metric is useful for monitoring request activity within the Kubelet. Previously known as ed_k8s_metric_rest_client_requests.value and the common name is rest_client_requests_total |
k8s.kubelet.restarted_pods.value | Captures the total pod restarts initiated by the kubelet due to failure recovery or updates. The common name is kubelet_restarted_pods_total |
k8s.kubelet.running_containers.value | Represents the number of running containers managed by the Kubelet. This metric helps monitor the status and count of active containers under Kubelet’s management. Previously known as ed_k8s_metric_kubelet_running_containers.value |
k8s.kubelet.running_pods.value | Represents the count of pods currently running under the management of the Kubelet. This is useful for monitoring the operational status of pods in the Kubernetes environment. Previously known as ed_k8s_metric_kubelet_running_pods.value |
k8s.kubelet.runtime.operations_errors.rate | Measures the rate of errors in runtime operations handled by the Kubelet. This metric is essential for detecting and monitoring operational errors within Kubelet’s runtime processes. Previously known as ed_k8s_metric_kubelet_runtime_operations_errors.rate |
k8s.kubelet.runtime.operations_errors.value | Measures the total number of errors encountered in runtime operations by the Kubelet. This provides insights into the reliability and error frequency in Kubelet’s operations. Previously known as ed_k8s_metric_kubelet_runtime_operations_errors.value |
k8s.kubelet.runtime.operations.rate | Captures the rate of runtime operations handled by the Kubelet. This metric provides insights into the operational throughput managed by the Kubelet. Previously known as ed_k8s_metric_kubelet_runtime_operations.rate |
k8s.kubelet.runtime.operations.value | Represents the total number of runtime operations performed by the Kubelet. This provides insights into the activity level of Kubelet’s operations. Previously known as ed_k8s_metric_kubelet_runtime_operations.value |
k8s.kubelet.server_expiration_renew_errors.value | Records errors related to renewals of server’s expiring certificates managed by kubelet. The common name is kubelet_server_expiration_renew_errors |
k8s.kubelet.sleep_action_terminated_early.value | Measures the occurrences where sleep actions scheduled by kubelet are ended prematurely. The common name is kubelet_sleep_action_terminated_early_total |
k8s.kubelet.started_containers_errors.value | Tracks errors encountered while starting containers managed by the kubelet. The common name is kubelet_started_containers_errors_total |
k8s.kubelet.started_containers.value | Counts the total number of containers successfully started by the kubelet. The common name is kubelet_started_containers_total |
k8s.kubelet.started_host_process_containers_errors.value | Logs the number of errors occurring in starting host process containers. The common name is kubelet_started_host_process_containers_errors_total |
k8s.kubelet.started_host_process_containers.value | Measures the total of host process containers started by the kubelet. The common name is kubelet_started_host_process_containers_total |
k8s.kubelet.started_pods_errors.value | Indicates error instances during the starting of pods by the kubelet. The common name is kubelet_started_pods_errors_total |
k8s.kubelet.started_pods.value | Captures the total count of pods that were successfully initiated by the kubelet. The common name is kubelet_started_pods_total |
k8s.kubelet.topology_manager.admission_errors.value | Logs the number of times resource admission requests failed due to topology constraints. The common name is kubelet_topology_manager_admission_errors_total |
k8s.kubelet.topology_manager.admission_requests.value | Captures the total requests made for resource admission judged by topology manager policies within the kubelet. The common name is kubelet_topology_manager_admission_requests_total |
k8s.kubelet.volume_manager.total_volumes.value | Indicates the total number of volumes managed by the Kubelet’s volume manager. This metric is important for monitoring storage utilization and volume management within the Kubernetes environment. Previously known as ed_k8s_metric_volume_manager_total_volumes.value and the common name is volume_manager_total_volumes |
k8s.kubelet.working_pods.value | Signifies the number of pods that are actively being processed by the kubelet on a node. The common name is kubelet_working_pods |
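Several Kubelet metrics above come in pairs: a `.value` metric holding a cumulative total and a `.rate` metric describing how fast that total grows (for example `k8s.kubelet.runtime.operations.value` and `k8s.kubelet.runtime.operations.rate`). The following Python sketch shows one common way to derive a per-second rate from cumulative samples. It is an illustration only and is not taken from Edge Delta's implementation: the `counter_rate` helper, the sample data, and the counter-reset handling are all assumptions.

```python
from typing import List, Tuple


def counter_rate(samples: List[Tuple[float, float]]) -> List[float]:
    """Return per-second rates between consecutive samples.

    Each sample is (timestamp_seconds, cumulative_value). A drop in the
    cumulative value is treated as a counter reset (an assumption, for
    example after a process restart), so the new value itself is used
    as the increase for that interval.
    """
    rates = []
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        elapsed = t1 - t0
        if elapsed <= 0:
            continue  # skip duplicate or out-of-order samples
        increase = v1 - v0 if v1 >= v0 else v1
        rates.append(increase / elapsed)
    return rates


# Hypothetical samples of a cumulative metric taken every 60 seconds.
samples = [(0.0, 100.0), (60.0, 160.0), (120.0, 190.0), (180.0, 30.0)]
rates = counter_rate(samples)
```

The last interval in the sample data simulates a reset: the total falls from 190 to 30, so the sketch counts 30 as the increase rather than reporting a negative rate.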
Node Exporter Metrics
Metric Name | Description |
---|---|
k8s.node.cpu.seconds.rate | Measures the rate of CPU seconds used by the node. This is important for tracking the CPU usage across the entire node in Kubernetes. Previously known as ed_k8s_metric_node_cpu_seconds.rate |
k8s.node.cpu.seconds.value | Measures the total CPU time consumed by the node, expressed in seconds. This metric is essential for understanding the overall CPU resource usage of the node. Previously known as ed_k8s_metric_node_cpu_seconds.value and the common name is node_cpu_seconds_total |
k8s.node.disk.read_bytes.value | Accounts for the total number of bytes read from disks on the node. The common name is node_disk_read_bytes_total |
k8s.node.disk.reads_completed.value | Records the total number of disk read operations completed on a node. The common name is node_disk_reads_completed_total |
k8s.node.disk.writes_completed.value | Captures the total number of disk write operations concluded on a node. The common name is node_disk_writes_completed_total |
k8s.node.disk.written_bytes.value | Shows the total bytes written to disks on the node. The common name is node_disk_written_bytes_total |
k8s.node.filesystem.avail_bytes.value | Tracks the available filesystem bytes on a node. This metric indicates how much storage space is left for use. Previously known as ed_k8s_metric_node_filesystem_avail_bytes.value and the common name is node_filesystem_avail_bytes |
k8s.node.filesystem.free_bytes.value | Expresses the amount of free space in bytes available on the node’s filesystem. The common name is node_filesystem_free_bytes |
k8s.node.filesystem.size_bytes.value | Measures the total size of the filesystem in bytes on a node. This provides insights into the total storage capacity available on the node. Previously known as ed_k8s_metric_node_filesystem_size_bytes.value and the common name is node_filesystem_size_bytes |
k8s.node.load.15min.value | Measures the 15-minute load average on a Kubernetes node. This metric provides insight into the node’s processing demand over a 15-minute interval, which can help identify sustained periods of high load. Previously known as ed_k8s_metric_node_load15.value and the common name is node_load15 |
k8s.node.load.1min.value | Provides the node’s one-minute load average, indicating recent system activity or CPU demand. The common name is node_load1 |
k8s.node.load.5min.value | Provides the node’s five-minute load average, a smoother view of system activity or CPU demand than the one-minute average. The common name is node_load5 |
k8s.node.memory.available_bytes.value | Tracks the amount of available memory on a Kubernetes node in bytes. This metric is vital for monitoring memory availability and ensuring that sufficient memory remains for applications running on the node. Previously known as ed_k8s_metric_node_memory_mem_available_bytes.value and the common name is node_memory_MemAvailable_bytes |
k8s.node.memory.buffers_bytes.value | Calculates the memory being used by kernel buffers on the node. The common name is node_memory_Buffers_bytes |
k8s.node.memory.cached_bytes.value | Measures the memory used for caching file data and filesystem metadata on the node. The common name is node_memory_Cached_bytes |
k8s.node.memory.free_bytes.value | Reports the amount of memory in bytes that is free on a node. The common name is node_memory_MemFree_bytes |
k8s.node.memory.total_bytes.value | Measures the total memory capacity of a Kubernetes node in bytes. This metric is crucial for understanding the physical memory resources available on the node, which helps ensure proper memory management and allocation across the cluster. Previously known as ed_k8s_metric_node_memory_mem_total_bytes.value and the common name is node_memory_MemTotal_bytes |
k8s.node.network.receive_bytes.rate | Measures the rate at which bytes are received on a network interface of a Kubernetes node. This metric is essential for monitoring network traffic and understanding network bandwidth utilization to ensure the node’s network capacity is not exceeded. Previously known as ed_k8s_metric_node_network_receive_bytes.rate |
k8s.node.network.receive_bytes.value | Represents the total bytes received on a network interface of a Kubernetes node. This metric is important for evaluating the overall network traffic and bandwidth usage on the node, which helps in managing network performance and capacity planning. Previously known as ed_k8s_metric_node_network_receive_bytes.value and the common name is node_network_receive_bytes_total |
k8s.node.network.receive_drop.value | Logs the total dropped incoming network packets on all interfaces of a node. The common name is node_network_receive_drop_total |
k8s.node.network.receive_packets.value | Captures the total number of incoming network packets received on all interfaces of a node. The common name is node_network_receive_packets_total |
k8s.node.network.transmit_bytes.rate | Measures the rate at which bytes are transmitted on a network interface of a Kubernetes node. This metric is crucial for understanding outbound network traffic and helps in monitoring network performance to ensure sufficient network capacity. Previously known as ed_k8s_metric_node_network_transmit_bytes.rate |
k8s.node.network.transmit_bytes.value | Represents the total bytes transmitted from a network interface of a Kubernetes node. This metric is important for assessing the total outbound network traffic from the node, which aids in managing network throughput and bandwidth usage. Previously known as ed_k8s_metric_node_network_transmit_bytes.value and the common name is node_network_transmit_bytes_total |
k8s.node.network.transmit_drop.value | Logs the total number of network packets that were dropped during transmission on the node’s interfaces. The common name is node_network_transmit_drop_total |
k8s.node.network.transmit_packets.value | Records the total outgoing network packets transmitted by all interfaces on a node. The common name is node_network_transmit_packets_total |
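Node metrics that report a total alongside an available amount can be combined into a utilization figure, for example `k8s.node.filesystem.size_bytes.value` with `k8s.node.filesystem.avail_bytes.value`, or `k8s.node.memory.total_bytes.value` with `k8s.node.memory.available_bytes.value`. The Python sketch below shows the arithmetic. The `used_fraction` helper and the byte values are hypothetical and are not part of Edge Delta.

```python
def used_fraction(total_bytes: float, available_bytes: float) -> float:
    """Return the fraction of capacity in use, between 0.0 and 1.0.

    total_bytes and available_bytes are assumed to come from the same
    node (and, for filesystems, the same mount point) at the same time.
    """
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return (total_bytes - available_bytes) / total_bytes


# Hypothetical readings: a 100 GiB filesystem with 25 GiB available.
GIB = 1024 ** 3
filesystem_used = used_fraction(100 * GIB, 25 * GIB)

# Hypothetical readings: 16 GiB of memory with 4 GiB available.
memory_used = used_fraction(16 * GIB, 4 * GIB)
```

Both example readings work out to 0.75, that is, 75% utilization.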
Traffic Metrics
Metric Name | Description |
---|---|
k8s.traffic.communication.count | Provides a count of network communications or transactions occurring within a Kubernetes environment. This metric helps in tracking the volume of communication traffic between Kubernetes components. Previously known as ed_k8s_traffic_communication.count |
k8s.traffic.communication.latency.avg | Measures the average latency of communication events occurring within a Kubernetes environment. This metric is crucial for analyzing the performance and responsiveness of inter-component communications. Previously known as ed_k8s_traffic_latency.avg |
k8s.traffic.communication.latency.p95 | Measures the 95th percentile communication latency within a Kubernetes environment. This metric is useful for analyzing tail-end latency and identifying potential bottlenecks or delays affecting a small percentage of communication interactions. Previously known as ed_k8s_traffic_latency.p95 |
k8s.traffic.communication.read_bytes.sum | Measures the total bytes read or inbound traffic within a Kubernetes environment. This metric is important for assessing the volume of incoming network data and understanding network usage patterns. Previously known as ed_k8s_traffic_in.sum |
k8s.traffic.communication.write_bytes.sum | Measures the total bytes written or outbound traffic within a Kubernetes environment. This metric is important for assessing the volume of outgoing network data and understanding network usage patterns. Previously known as ed_k8s_traffic_out.sum |
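To make the difference between the `latency.avg` and `latency.p95` metrics concrete, the following Python sketch computes both statistics over a hypothetical list of latency samples. It uses the nearest-rank percentile definition, which is an assumption: this page does not state which percentile method Edge Delta applies, and the helper name and sample values are illustrative.

```python
import math
from statistics import mean
from typing import Sequence


def nearest_rank_percentile(values: Sequence[float], pct: int) -> float:
    """Return the pct-th percentile using the nearest-rank method."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    # Rank is 1-based: the smallest rank covering pct percent of samples.
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# Hypothetical latencies in milliseconds: 19 fast calls and one slow outlier.
latencies_ms = [10.0] * 19 + [500.0]

average_ms = mean(latencies_ms)
p95_ms = nearest_rank_percentile(latencies_ms, 95)
```

With these samples the single slow call raises the average well above the typical 10 ms latency, while the 95th percentile still reflects what most calls experience. This is why the two metrics are usually read together.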