Edge Delta Metrics List

Metrics handled by Edge Delta.

This page lists metrics in Edge Delta. For accurate interpretation and pipeline design, examine your actual data using the metrics inventory, node tests, and the debug node.

Edge Delta Agent Metrics

Metric Name Description
ed.agent.cpu.milicores Measures the CPU usage of the Edge Delta agent in millicores. This metric shows how much CPU the agent consumes, which can help identify performance bottlenecks or inefficiencies. The common name is agent_cpu_millicores.value
ed.agent.gc.count Represents the total number of garbage collection operations carried out by the Edge Delta agent. This metric helps in understanding the frequency of garbage collection processes. The common name is agent_gc_count.value
ed.agent.gc.forced_count Indicates the number of times garbage collection was manually triggered within the Edge Delta agent. This metric is useful for tracking how often forced garbage collection occurs. The common name is agent_gc_forced_count.value
ed.agent.gc.pause_time Refers to the duration of garbage collection pauses in milliseconds. This metric is useful for diagnosing performance issues related to the time it takes the garbage collector to complete its cycle within the Edge Delta agent. The common name is agent_gc_pause_ms.value
ed.agent.gc.target The garbage collection target of the Edge Delta agent. This metric helps in monitoring and managing the garbage collection process within the agent. The common name is agent_gc_target.value
ed.agent.go.routine.value Tracks the number of goroutines in use by the Edge Delta agent. This metric shows how many concurrent operations the agent is handling, which can indicate how efficiently the agent is managing memory. The common name is agent_num_goroutine.value
ed.agent.memory.allocation The memory allocation of the Edge Delta agent. This metric monitors the amount of memory that the agent is currently using, which can indicate memory management strategies or issues. The common name is agent_mem_alloc.value
ed.agent.memory.to_be_freed The memory that is marked to be freed by the Edge Delta agent. This metric helps in understanding how much memory is expected to be released soon by the agent. The common name is agent_mem_to_be_freed.value
ed.agent.memory.virtual The memory that is reserved for the Edge Delta agent, including both the physical and swap memory that has been allocated. It provides insight into the total address space reserved by the agent. The common name is agent_mem_virtual.value
ed.host.cpu.process_count Tracks the number of processes currently running on the host. This provides insights into the system’s load and can indicate resource usage trends. The common name is process_count.value
ed.host.cpu.system.average Tracks the average percentage of CPU time consumed by system processes across all CPUs on a host. This offers an overview of the CPU resources used for system-level tasks over time. The common name is cpu_system_avg.value
ed.host.cpu.user.average Measures the average percentage of CPU time used by user-level processes across all CPUs on a host. This metric provides insights into overall CPU usage by applications running in user mode. The common name is cpu_user_avg.value
ed.host.cpu#.system.average Represents the average percentage of CPU time spent on system processes on a specific host. This metric shows how much CPU is used for operating system tasks over a period of time. The common name is cpu#_system_avg.value
ed.host.cpu#.system.percent Measures the percentage of CPU time spent on system processes on a specific host. This metric helps in understanding the proportion of CPU resources used by the operating system for internal tasks. The common name is cpu#_system_perc.value
ed.host.cpu#.user.average Represents the average percentage of CPU time used by user processes on a particular host. This metric shows how much CPU is used by applications and services running in user mode. The common name is cpu#_user_avg.value
ed.host.cpu#.user.percent Measures the percentage of CPU time spent on user processes on a specific host. This metric is valuable for monitoring how much CPU time is consumed by user-level applications. The common name is cpu#_user_perc.value
ed.host.disk.read_bytes Measures the total number of bytes read from disk by the host. This metric provides insight into disk I/O activity. The common name is disk_read_bytes.value
ed.host.disk.write_bytes Captures the total number of bytes written to disk on the host. This metric provides insight into disk write activity. The common name is disk_write_bytes.value
ed.host.memory.total Represents the total amount of memory available on a host. This metric includes both physical and virtual memory. The common name is memory_total.value
ed.host.memory.used.percentage Indicates the percentage of total memory currently in use on a host. This metric helps in assessing memory utilization and identifying potential memory resource constraints. The common name is memory_used_perc.value
ed.host.memory.used.value Tracks the amount of memory currently used on a host. This includes memory used by all processes and the operating system. The common name is memory_used.value
ed.host.net.read_bytes Refers to the total number of bytes received over the network by the host. This provides insights into network activity concerning incoming data traffic. The common name is net_received_bytes.value
ed.host.net.write_bytes Represents the total number of bytes sent over the network by the host. This metric provides insights into network activity concerning outgoing data traffic. The common name is net_sent_bytes.value
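
As an illustration of how the host memory metrics relate, the sketch below derives a used-memory percentage from a used value and a total. It assumes that ed.host.memory.used.value and ed.host.memory.total are reported in the same unit and that ed.host.memory.used.percentage corresponds to their ratio; neither assumption is confirmed on this page, and the helper name memory_used_percent is hypothetical.

```python
def memory_used_percent(used: float, total: float) -> float:
    """Return used memory as a percentage of total memory.

    Assumes both arguments use the same unit (for example, bytes).
    """
    if total <= 0:
        raise ValueError("total memory must be positive")
    return used / total * 100.0


# Hypothetical sample values standing in for ed.host.memory.used.value
# and ed.host.memory.total.
sample_used = 4 * 1024**3
sample_total = 16 * 1024**3
print(memory_used_percent(sample_used, sample_total))
```

Comparing a value computed this way against the reported percentage is one way to check how the metrics behave in your own data.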

Edge Delta Pipeline Metrics

Metric Name Description
ed.pipeline.l2m.log_threshold Relates to monitoring log thresholds in a pipeline. This metric helps evaluate log counts across all agents. The common name is log_threshold_monitor_metric
ed.pipeline.node.read_bytes Measures the total bytes read by a particular node in the pipeline. This provides insights into the data ingestion volume handled by that node. The common name is in_bytes
ed.pipeline.node.read_items Tracks the number of items or records read by a specific node within the pipeline. This provides insights into the volume of data processed by the node in terms of item count. The common name is in_items
ed.pipeline.node.write_bytes Records the total number of bytes written by a specific node in the pipeline. This metric helps track data egress handled by that node in terms of bytes outputted. The common name is out_bytes
ed.pipeline.node.write_items Tracks the number of items or records written by a specific node in the pipeline. This metric provides insight into the volume of data outputted by the node in terms of item count. The common name is out_items
ed.pipeline.raw_write_bytes Captures the total amount of raw byte data outputted by the pipeline. This metric provides insight into the volume of raw data transmitted. The common name is outgoing_raw_bytes.sum
ed.pipeline.read_bytes Measures the total amount of incoming bytes processed by the pipeline. This provides an overview of the data ingress in terms of byte volume. The common name is incoming_bytes.sum
ed.pipeline.read_items Indicates the number of lines processed by the pipeline. This metric helps in measuring data ingress in terms of line count. The common name is incoming_lines.count
ed.pipeline.uncompressed_write_bytes Measures the total number of uncompressed bytes outputted by the pipeline. This metric is used to assess the volume of data transmitted without any compression applied. The common name is outgoing_uncompressed_bytes.sum
ed.pipeline.write_bytes Tracks the total number of bytes written or outputted by the pipeline. This metric helps in understanding the total data egress in terms of byte volume. The common name is outgoing_bytes.sum
ed.pipeline.write_items Represents the count of lines or items that have been outputted by the pipeline. This metric is useful for understanding the data egress in terms of item count. The common name is outgoing_lines.count
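
The read and write metrics above can be combined to estimate how much a pipeline reduces data volume. The following is a minimal sketch, assuming you have already retrieved values for ed.pipeline.read_bytes and ed.pipeline.write_bytes over the same time window; the function name reduction_percent and the sample values are hypothetical.

```python
def reduction_percent(read_bytes: float, write_bytes: float) -> float:
    """Percentage by which output volume is smaller than input volume.

    Returns 0.0 when nothing was read, to avoid dividing by zero.
    A negative result means the pipeline wrote more than it read.
    """
    if read_bytes <= 0:
        return 0.0
    return (1.0 - write_bytes / read_bytes) * 100.0


# Hypothetical values for one time window.
sample = {
    "ed.pipeline.read_bytes": 2_000_000,
    "ed.pipeline.write_bytes": 500_000,
}
print(reduction_percent(sample["ed.pipeline.read_bytes"],
                        sample["ed.pipeline.write_bytes"]))
```

The same calculation can be applied to ed.pipeline.node.read_bytes and ed.pipeline.node.write_bytes to look at a single node.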

Kubernetes Metrics

By default, the Metrics Source node scrapes kube_state_metrics. As of v1.27.0, kubelet, cadvisor, and node_exporter metrics are excluded by default. You can remove them from the exclude list if you want to include them.

cAdvisor Metrics

Metric Name Description
k8s.container.blkio.device.usage.value Indicates the total Block I/O utilization per device for containers over a period of time. The common name is container_blkio_device_usage_total
k8s.container.cpu.cfs_periods.value Denotes the number of control group CFS (Completely Fair Scheduler) periods consumed. The common name is container_cpu_cfs_periods_total
k8s.container.cpu.cfs_throttled_periods.value Counts the total CFS periods in which the container was throttled, indicating that CPU limits were hit. The common name is container_cpu_cfs_throttled_periods_total
k8s.container.cpu.cfs_throttled_seconds.value Displays the total seconds a container’s CPU usage was throttled, denoting CPU limit enforcement by CFS. The common name is container_cpu_cfs_throttled_seconds_total
k8s.container.cpu.load_average_10s.value Reveals the container CPU load average measured over a period of 10 seconds. The common name is container_cpu_load_average_10s
k8s.container.cpu.schedstat_run_periods.value Measures total kernel scheduling statistics run periods. The common name is container_cpu_schedstat_run_periods_total
k8s.container.cpu.schedstat_run_seconds.value Reports the total seconds of container execution in designated run states. The common name is container_cpu_schedstat_run_seconds_total
k8s.container.cpu.schedstat_runqueue_seconds.value Reports the total seconds spent on the run queue by container processes. The common name is container_cpu_schedstat_runqueue_seconds_total
k8s.container.cpu.system_seconds.value Accounts for the total system CPU time used by container processes in seconds. The common name is container_cpu_system_seconds_total
k8s.container.cpu.usage_seconds.rate Measures the rate of CPU seconds used by containers. This metric is crucial for monitoring container CPU usage within the Kubernetes environment. Previously known as ed_k8s_metric_container_cpu_usage_seconds.rate
k8s.container.cpu.usage_seconds.value Captures the total CPU time consumed by a container in seconds. This metric is crucial for monitoring CPU usage and performance of containers. Previously known as ed_k8s_metric_container_cpu_usage_seconds.value and the common name is container_cpu_usage_seconds_total
k8s.container.cpu.user_seconds.value Logs the total user CPU time consumed by container processes in seconds. The common name is container_cpu_user_seconds_total
k8s.container.file_descriptors.value Records the total number of open file descriptors used by processes in a container. The common name is container_file_descriptors
k8s.container.fs.inodes_free.value Illustrates the number of free inodes available for a container’s filesystem. The common name is container_fs_inodes_free
k8s.container.fs.inodes.value Represents the total inode capacity of a container’s filesystem. The common name is container_fs_inodes_total
k8s.container.fs.io_current.value Indicates the number of I/O operations being processed simultaneously for container’s filesystem. The common name is container_fs_io_current
k8s.container.fs.io_time_seconds.value Logs the total time spent carrying out I/O operations in seconds by the container’s filesystem. The common name is container_fs_io_time_seconds_total
k8s.container.fs.io_time_weighted_seconds.value Records the weighted time for I/O operations performed by the container’s filesystem in seconds. The common name is container_fs_io_time_weighted_seconds_total
k8s.container.fs.limit_bytes.value Specifies the total storage capacity available to a container’s filesystem in bytes. The common name is container_fs_limit_bytes
k8s.container.fs.read_seconds.value Accounts for the total seconds spent by the container performing read operations. The common name is container_fs_read_seconds_total
k8s.container.fs.reads_bytes.value Measures the total number of read bytes performed by the container. The common name is container_fs_reads_bytes_total
k8s.container.fs.reads_merged.value Records the total read requests merged into a single larger I/O request. The common name is container_fs_reads_merged_total
k8s.container.fs.reads.value Captures the total read operations performed by the container’s filesystem. The common name is container_fs_reads_total
k8s.container.fs.sector_reads.value Reports the total number of sectors read by a container’s filesystem. The common name is container_fs_sector_reads_total
k8s.container.fs.sector_writes.value Represents the total sectors written by a container’s filesystem. The common name is container_fs_sector_writes_total
k8s.container.fs.usage_bytes.value Denotes the total bytes used by a container’s filesystem. The common name is container_fs_usage_bytes
k8s.container.fs.write_seconds.value Logs the total time in seconds spent by write operations on the container’s filesystem. The common name is container_fs_write_seconds_total
k8s.container.fs.writes_bytes.value Indicates the total number of bytes written by the container’s filesystem. The common name is container_fs_writes_bytes_total
k8s.container.fs.writes_merged.value Counts the total write requests merged into a single larger I/O operation. The common name is container_fs_writes_merged_total
k8s.container.fs.writes.value Records the total write operations committed by the container’s filesystem. The common name is container_fs_writes_total
k8s.container.hugetlb_failcnt.value Tracks the fail count of hugetlb pages by the container indicating unsuccessful allocations. The common name is container_hugetlb_failcnt
k8s.container.hugetlb_max_usage_bytes.value Captures the peak huge pages memory used by the container in bytes. The common name is container_hugetlb_max_usage_bytes
k8s.container.hugetlb_usage_bytes.value Measures the current huge pages memory usage by the container in bytes. The common name is container_hugetlb_usage_bytes
k8s.container.last_seen.value Denotes the last recorded interaction or monitoring activity with the container. The common name is container_last_seen
k8s.container.llc_occupancy_bytes.value Reports the occupancy in bytes of the Last Level Cache (LLC) by the container processes. The common name is container_llc_occupancy_bytes
k8s.container.memory.bandwidth_bytes.value Refers to the total memory bandwidth utilized by a container measured in bytes. The common name is container_memory_bandwidth_bytes
k8s.container.memory.bandwidth_local_bytes.value Indicates the memory bandwidth used by the container within local NUMA nodes in bytes. The common name is container_memory_bandwidth_local_bytes
k8s.container.memory.cache.value Shows the cached memory in bytes used by the container which does not consume excessive swap space. The common name is container_memory_cache
k8s.container.memory.failcnt.value Logs the number of times memory allocations failed inside a container. The common name is container_memory_failcnt
k8s.container.memory.failures.value Logs the cumulative count of memory allocation failures for the container. The common name is container_memory_failures_total
k8s.container.memory.mapped_file.value Reports the memory occupied by files mapped into the container’s address space. The common name is container_memory_mapped_file
k8s.container.memory.max_usage_bytes.value Captures the maximum memory usage by a container in bytes during its lifecycle. The common name is container_memory_max_usage_bytes
k8s.container.memory.migrate.value Indicates the total pages migrated in containers due to memory imbalance. The common name is container_memory_migrate
k8s.container.memory.numa_pages.value Shows the number of NUMA pages in use for memory-intensive processes. The common name is container_memory_numa_pages
k8s.container.memory.rss.value Denotes the Resident Set Size representing the non-swappable physical memory consumed by a container. The common name is container_memory_rss
k8s.container.memory.swap.value Measures the swap space occupied by the container to store overflow data from its RAM. The common name is container_memory_swap
k8s.container.memory.usage_bytes.value Measures the memory usage in bytes by a container. This is essential for tracking container memory consumption within Kubernetes environments. Previously known as ed_k8s_metric_container_memory_usage_bytes.value and the common name is container_memory_usage_bytes
k8s.container.memory.working_set_bytes.value Represents the current working set of the container in bytes, indicating actively used memory. The common name is container_memory_working_set_bytes
k8s.container.network_advance_tcp_stats.value Captures advanced TCP statistics for network analysis in container environments. The common name is container_network_advance_tcp_stats_total
k8s.container.network.receive_bytes.rate Measures the rate at which bytes are received over the network by a container. This metric helps to understand the network input activity for containers. Previously known as ed_k8s_metric_container_network_receive_bytes.rate
k8s.container.network.receive_bytes.value Measures the total bytes received over the network by a container. This metric is essential for monitoring network input traffic to containers. Previously known as ed_k8s_metric_container_network_receive_bytes.value and the common name is container_network_receive_bytes_total
k8s.container.network.receive_errors.rate Captures the rate of errors encountered while receiving network data by the container. This metric provides insights into the network reliability and error frequency for received data packets. Previously known as ed_k8s_metric_container_network_receive_errors.rate
k8s.container.network.receive_errors.value Tracks the total number of errors encountered while receiving network data by the container. This metric helps in evaluating the reliability of network traffic received. Previously known as ed_k8s_metric_container_network_receive_errors.value and the common name is container_network_receive_errors_total
k8s.container.network.receive_packets_dropped.value Logs the number of dropped inbound network packets in containers. The common name is container_network_receive_packets_dropped_total
k8s.container.network.receive_packets.value Tracks the total skb packets received by the container over its network interfaces. The common name is container_network_receive_packets_total
k8s.container.network.tcp_usage.value Records TCP usage statistics such as established connections within the container. The common name is container_network_tcp_usage_total
k8s.container.network.tcp6_usage.value Monitors the use of TCP over IPv6 in the container environment. The common name is container_network_tcp6_usage_total
k8s.container.network.transmit_bytes.rate Measures the rate at which bytes are transmitted from the container over the network. This helps monitor network output activity for containers. Previously known as ed_k8s_metric_container_network_transmit_bytes.rate
k8s.container.network.transmit_bytes.value Measures the total number of bytes transmitted from a container over the network. This metric is important for assessing the network output traffic of containers. Previously known as ed_k8s_metric_container_network_transmit_bytes.value and the common name is container_network_transmit_bytes_total
k8s.container.network.transmit_errors.rate Measures the rate of errors occurring when a container transmits data over the network. This metric helps in identifying reliability issues with container network transmissions. Previously known as ed_k8s_metric_container_network_transmit_errors.rate
k8s.container.network.transmit_errors.value Tracks the total number of errors encountered while transmitting network data from a container. This metric helps evaluate the reliability of network outputs. Previously known as ed_k8s_metric_container_network_transmit_errors.value and the common name is container_network_transmit_errors_total
k8s.container.network.transmit_packets_dropped.value Indicates the count of dropped outbound network packets. The common name is container_network_transmit_packets_dropped_total
k8s.container.network.transmit_packets.value Reflects the total number of packets transmitted from the container. The common name is container_network_transmit_packets_total
k8s.container.network.udp_usage.value Comprises the UDP protocol utilization data within container environments. The common name is container_network_udp_usage_total
k8s.container.network.udp6_usage.value Audits the usage of UDP over IPv6 networks inside containers. The common name is container_network_udp6_usage_total
k8s.container.oom_events.value Records the total number of Out-Of-Memory (OOM) events that a container has encountered. The common name is container_oom_events_total
k8s.container.perf_events_scaling_ratio.value Represents the scaling ratio applied to performance events to account for differences in hardware performance counters. The common name is container_perf_events_scaling_ratio
k8s.container.perf_events.value Captures the total performance events such as CPU cycles or instructions that have occurred in the container. The common name is container_perf_events_total
k8s.container.perf_uncore_events_scaling_ratio.value Indicates the scaling ratio for uncore performance events which are events associated with non-core parts like memory controllers or interconnects. The common name is container_perf_uncore_events_scaling_ratio
k8s.container.perf_uncore_events.value Counts the total uncore performance events which are activities measured by performance counters of the uncore components within a container. The common name is container_perf_uncore_events_total
k8s.container.processes.value Reflects the current number of processes running within a container. The common name is container_processes
k8s.container.referenced_bytes.value Shows the number of memory bytes referenced by the container that is actively being used by processes. The common name is container_referenced_bytes
k8s.container.sockets.value Measures the total number of socket connections currently open inside the container. The common name is container_sockets
k8s.container.spec_cpu_period.value Specifies the period, in microseconds, over which the CPU quota is enforced for the container. The common name is container_spec_cpu_period
k8s.container.spec_cpu_quota.value Defines the total allowed CPU time measured in microseconds per CPU period for a container. The common name is container_spec_cpu_quota
k8s.container.spec_cpu_shares.value Measures the CPU shares allocated to the container which affects its scheduling priority. The common name is container_spec_cpu_shares
k8s.container.spec_memory_limit_bytes.value Specifies the maximum memory limit set for a container in bytes. The common name is container_spec_memory_limit_bytes
k8s.container.spec_memory_reservation_limit_bytes.value Indicates the reserved memory limit for a container to ensure availability close to this resource allocation level in bytes. The common name is container_spec_memory_reservation_limit_bytes
k8s.container.spec_memory_swap_limit_bytes.value Denotes the swap memory limit configured for a container in bytes, including RAM and swap space. The common name is container_spec_memory_swap_limit_bytes
k8s.container.start_time_seconds.value Marks the start time of the container in epoch seconds, from which its running time can be derived. The common name is container_start_time_seconds
k8s.container.tasks_state.value Shows the current state of task structures which are used for task scheduling operations inside the container. The common name is container_tasks_state
k8s.container.threads_max.value Highlights the maximum thread count that a container can execute based on resource limitations. The common name is container_threads_max
k8s.container.threads.value Indicates the current number of active threads operating inside a container reflecting concurrency. The common name is container_threads
k8s.container.ulimits_soft.value Refers to the soft ulimits that control the resource limits applied to processes within a container, such as open files and user processes. The common name is container_ulimits_soft
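
Several metrics above are cumulative totals (for example k8s.container.cpu.usage_seconds.value), and some have a matching .rate metric. The sketch below shows a generic way to turn two samples of a cumulative counter into a per-second rate, and to estimate the fraction of CFS periods that were throttled. It is an illustration only: this page does not state how Edge Delta computes its .rate metrics, and the function names and sample numbers are hypothetical.

```python
from typing import Optional


def counter_rate(prev_value: float, curr_value: float,
                 interval_seconds: float) -> Optional[float]:
    """Per-second rate between two samples of a cumulative counter.

    Returns None when the interval is not positive or the counter went
    backwards (for example, after a container restart reset it).
    """
    if interval_seconds <= 0 or curr_value < prev_value:
        return None
    return (curr_value - prev_value) / interval_seconds


def throttled_fraction(periods_delta: float, throttled_delta: float) -> float:
    """Fraction of CFS periods in a window during which throttling occurred."""
    if periods_delta <= 0:
        return 0.0
    return throttled_delta / periods_delta


# Hypothetical samples of k8s.container.cpu.usage_seconds.value taken 60 s apart.
print(counter_rate(120.0, 150.0, 60.0))

# Hypothetical increases in cfs_periods and cfs_throttled_periods over a window.
print(throttled_fraction(200, 50))
```

A throttled fraction that stays high over time suggests the container is regularly hitting its CPU limit.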

KSM Metrics

Metric Name Description
k8s.ksm.cronjob_info.value Indicates the presence of a Kubernetes cron job. This metric is crucial for counting the number of cron jobs defined within the Kubernetes cluster. Previously known as ed_k8s_metric_kube_cronjob_info.value
k8s.ksm.daemonset_metadata_generation.value Provides the generation number for a Kubernetes DaemonSet, which reflects the version of its desired state specification. This is useful for tracking updates and consistency within DaemonSet specifications. Previously known as ed_k8s_metric_kube_daemonset_metadata_generation.value
k8s.ksm.deployment_metadata_generation.value Indicates the metadata generation number for a Kubernetes Deployment. This metric helps track the version of the deployment’s configuration, which can be useful for detecting configuration changes and ensuring deployment consistency. Previously known as ed_k8s_metric_kube_deployment_metadata_generation.value
k8s.ksm.deployment.status_replicas_available.value Indicates the number of available replicas for a Kubernetes Deployment. This metric helps in monitoring the deployment to ensure the desired number of pods are up and running successfully. Previously known as ed_k8s_metric_kube_deployment_status_replicas_available.value
k8s.ksm.job_info.value Provides the count of Kubernetes jobs. This metric is important for tracking the number of jobs running in the Kubernetes cluster. Previously known as ed_k8s_metric_kube_job_info.value
k8s.ksm.namespace.status_phase.value Indicates the phase of a Kubernetes namespace. This metric helps in determining the operational status of namespaces within a Kubernetes cluster. Previously known as ed_k8s_metric_kube_namespace_status_phase.value
k8s.ksm.node.info.value Provides the count of Kubernetes nodes. This metric is useful for tracking the total number of nodes present in the Kubernetes cluster. Previously known as ed_k8s_metric_kube_node_info.value
k8s.ksm.pod.container_resource_limits_cpu.value Indicates the CPU resource limit set for a container within a Kubernetes pod. This metric helps in managing and enforcing the maximum CPU resources that the container can utilize. Previously known as ed_k8s_metric_kube_pod_container_resource_limits_cpu.value
k8s.ksm.pod.container_resource_limits_memory.value Indicates the memory resource limit set for a container within a Kubernetes pod. This metric helps in managing and enforcing the maximum memory resources that the container can utilize. Previously known as ed_k8s_metric_kube_pod_container_resource_limits_memory.value
k8s.ksm.pod.container_resource_requests_cpu.value Indicates the CPU resources requested for a container within a Kubernetes pod. This metric assists in monitoring and ensuring the minimum CPU resources that the container is expected to utilize are adequately set. Previously known as ed_k8s_metric_kube_pod_container_resource_requests_cpu.value
k8s.ksm.pod.container_resource_requests_memory.value Indicates the memory resources requested for a container within a Kubernetes pod. This metric is important for monitoring and ensuring the minimum memory resources expected for container operation are available. Previously known as ed_k8s_metric_kube_pod_container_resource_requests_memory.value
k8s.ksm.pod.container_status_restarts.value Indicates the number of container restarts per container.
k8s.ksm.pod.container_status_running.value Indicates whether containers within Kubernetes pods are running. This metric helps ensure that the expected number of containers are actively running in the cluster. Previously known as ed_k8s_metric_kube_pod_container_status_running.value
k8s.ksm.pod.container_status_terminated.value Indicates whether containers within Kubernetes pods are terminated. This metric helps track the number of containers that have completed execution or stopped running in the cluster. Previously known as ed_k8s_metric_kube_pod_container_status_terminated.value
k8s.ksm.pod.container_status_waiting.value Indicates the number of containers within Kubernetes pods that are in a waiting state. This metric is important for understanding how many containers are unable to progress to a running state, which could suggest resource bottlenecks or configuration issues. Previously known as ed_k8s_metric_kube_pod_container_status_waiting.value
k8s.ksm.pod.status_phase.value Indicates the different phases of Kubernetes pods, which include Pending, Running, Succeeded, Failed, and Unknown. This metric is valuable for understanding the lifecycle stages of pods within the Kubernetes cluster. Previously known as ed_k8s_metric_kube_pod_status_phase.value
k8s.ksm.statefulset_metadata_generation.value Reflects the generation number of a Kubernetes StatefulSet’s metadata. This is used to track updates and changes to the StatefulSet’s configuration in the cluster. Previously known as ed_k8s_metric_kube_statefulset_metadata_generation.value
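
Because many of the metrics above were previously known under ed_k8s_metric_ names, existing queries or dashboards may still refer to the old names. A small lookup table is one way to translate them. The sketch below uses a handful of name pairs taken from the table above; the dictionary and the function name current_name are hypothetical, and the mapping is written out explicitly rather than derived by string rules because the new names do not follow a single mechanical pattern.

```python
# Legacy name -> current name, copied from the KSM table above (partial list).
LEGACY_TO_CURRENT = {
    "ed_k8s_metric_kube_cronjob_info.value": "k8s.ksm.cronjob_info.value",
    "ed_k8s_metric_kube_job_info.value": "k8s.ksm.job_info.value",
    "ed_k8s_metric_kube_node_info.value": "k8s.ksm.node.info.value",
    "ed_k8s_metric_kube_pod_status_phase.value": "k8s.ksm.pod.status_phase.value",
    "ed_k8s_metric_kube_deployment_status_replicas_available.value":
        "k8s.ksm.deployment.status_replicas_available.value",
}


def current_name(metric_name: str) -> str:
    """Return the current metric name, or the input unchanged if it is not a known legacy name."""
    return LEGACY_TO_CURRENT.get(metric_name, metric_name)


print(current_name("ed_k8s_metric_kube_node_info.value"))
print(current_name("k8s.ksm.job_info.value"))
```

Names that are not in the table pass through unchanged, so the helper can be applied to a mixed list of old and new names.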

Kubelet Metrics

Metric Name Description
k8s.kubelet.active_pods.value Measures the number of active pods that are currently registered and running on the kubelet in any node. The common name is kubelet_active_pods
k8s.kubelet.admission_rejections.value Tallies the number of pod admission rejections by the kubelet due to various policy or resource constraints. The common name is kubelet_admission_rejections_total
k8s.kubelet.certificate_manager.client_expiration_renew_errors.value Logs errors occurring during the renewal of client certificates by the kubelet’s certificate manager. The common name is kubelet_certificate_manager_client_expiration_renew_errors
k8s.kubelet.certificate_manager.client_ttl_seconds.value Measures the remaining lifetime of client certificates managed by the kubelet before they expire. The common name is kubelet_certificate_manager_client_ttl_seconds
k8s.kubelet.certificate_manager.server_ttl_seconds.value Captures the time-to-live for server certificates managed by the kubelet indicating when they need renewal. The common name is kubelet_certificate_manager_server_ttl_seconds
k8s.kubelet.cgroup.manager_duration_seconds.value Reports the time spent by the kubelet’s cgroup manager on cgroup operations. The common name is kubelet_cgroup_manager_duration_seconds
k8s.kubelet.cgroup.version.value Denotes the version of the cgroup used by the kubelet for resource isolation and control. The common name is kubelet_cgroup_version
k8s.kubelet.container.aligned_compute_resources.value Represents the number of containers aligned to compute resource limits for efficient operation. The common name is kubelet_container_aligned_compute_resources_count
k8s.kubelet.container.log_filesystem_used_bytes.value Indicates the total bytes used by container log filesystem on the node managed by the kubelet. The common name is kubelet_container_log_filesystem_used_bytes
k8s.kubelet.cpu_manager.exclusive_cpu_allocation.value Counts the number of CPUs exclusively allocated to containers by the kubelet’s CPU manager. The common name is kubelet_cpu_manager_exclusive_cpu_allocation_count
k8s.kubelet.cpu_manager.pinning_errors.value Tracks errors encountered by the CPU manager within kubelet during CPU pinning activities. The common name is kubelet_cpu_manager_pinning_errors_total
k8s.kubelet.cpu_manager.pinning_requests.value Records the total number of CPU pinning requests managed by kubelet’s CPU manager. The common name is kubelet_cpu_manager_pinning_requests_total
k8s.kubelet.cpu_manager.shared_pool_size_millicores.value Denotes the size of the shared CPU pool in millicores as managed by the kubelet’s CPU manager. The common name is kubelet_cpu_manager_shared_pool_size_millicores
k8s.kubelet.credential_provider_plugin_errors.value Counts the number of errors encountered by the credential provider plugin within kubelet. The common name is kubelet_credential_provider_plugin_errors
k8s.kubelet.desired_pods.value Indicates the number of pods desired to be running on the kubelet as specified by the pod scheduler. The common name is kubelet_desired_pods
k8s.kubelet.device_plugin.registration.value Captures the total registrations of device plugins with kubelet, indicating successful device discovery and resource advertisement. The common name is kubelet_device_plugin_registration_total
k8s.kubelet.evented_pleg.connection_error.value Logs the number of connection errors encountered by kubelet’s evented Pod Lifecycle Event Generator (PLEG). The common name is kubelet_evented_pleg_connection_error_count
k8s.kubelet.evented_pleg.connection_success.value Reports the number of successful connections made by kubelet’s evented PLEG. The common name is kubelet_evented_pleg_connection_success_count
k8s.kubelet.evictions.value Reflects the total number of pod evictions performed by kubelet for reasons such as resource constraints or policy violations. The common name is kubelet_evictions
k8s.kubelet.graceful_shutdown.end_time_seconds.value Marks the timestamp at which a graceful shutdown sequence concludes. The common name is kubelet_graceful_shutdown_end_time_seconds
k8s.kubelet.graceful_shutdown.start_time_seconds.value Marks the timestamp at which a graceful shutdown sequence starts. The common name is kubelet_graceful_shutdown_start_time_seconds
k8s.kubelet.http.inflight_requests.value Measures the number of HTTP requests that are currently being processed by the kubelet. The common name is kubelet_http_inflight_requests
k8s.kubelet.http.requests.value Records the total number of HTTP requests made to the kubelet. The common name is kubelet_http_requests_total
k8s.kubelet.image_garbage_collected.value Indicates the total number of image garbage collections performed by the kubelet, which removes unused images to free space. The common name is kubelet_image_garbage_collected_total
k8s.kubelet.lifecycle_handler_http_fallbacks.value Counts the fallbacks to HTTP lifecycle handlers when gRPC handlers are not available. The common name is kubelet_lifecycle_handler_http_fallbacks_total
k8s.kubelet.managed_ephemeral_containers.value Shows the number of ephemeral containers managed by the kubelet, which are designated for debugging. The common name is kubelet_managed_ephemeral_containers
k8s.kubelet.memory_manager.pinning_errors.value Logs the total memory pinning errors encountered by the memory manager within the kubelet. The common name is kubelet_memory_manager_pinning_errors_total
k8s.kubelet.memory_manager.pinning_requests.value Represents the number of memory pinning requests processed by the kubelet’s memory manager. The common name is kubelet_memory_manager_pinning_requests_total
k8s.kubelet.mirror_pods.value Refers to the count of mirror pods, which are static pods mirrored and managed by the kubelet. The common name is kubelet_mirror_pods
k8s.kubelet.node.name Specifies the name of the node in the cluster as identified by the kubelet. The common name is kubelet_node_name
k8s.kubelet.node.startup_duration_seconds.value Tracks the duration in seconds taken for a node to start up and become operational. The common name is kubelet_node_startup_duration_seconds
k8s.kubelet.node.startup_post_registration_duration_seconds.value Measures the duration in seconds taken for post-registration startup activities on a node managed by the kubelet. The common name is kubelet_node_startup_post_registration_duration_seconds
k8s.kubelet.node.startup_pre_kubelet_duration_seconds.value Captures the time spent in pre-kubelet startup activities required before starting the kubelet. The common name is kubelet_node_startup_pre_kubelet_duration_seconds
k8s.kubelet.node.startup_pre_registration_duration_seconds.value Records the time in seconds allocated to pre-registration setup prior to a node registering with the cluster. The common name is kubelet_node_startup_pre_registration_duration_seconds
k8s.kubelet.node.startup_registration_duration_seconds.value Indicates the duration for the registration process of a node within a cluster managed by kubelet. The common name is kubelet_node_startup_registration_duration_seconds
k8s.kubelet.orphan_pod.cleaned_volumes_errors.value Logs errors encountered during the cleanup process of volumes associated with orphaned pods. The common name is kubelet_orphan_pod_cleaned_volumes_errors
k8s.kubelet.orphan_pod.cleaned_volumes.value Measures the number of volumes left by orphaned pods that were successfully cleaned up. The common name is kubelet_orphan_pod_cleaned_volumes
k8s.kubelet.orphan_pod.runtime_pods.value Counts the total orphaned runtime pods found and handled by kubelet. The common name is kubelet_orphaned_runtime_pods_total
k8s.kubelet.pleg.discard_events.value Tracks the events discarded by the Pod Lifecycle Event Generator (PLEG) in the kubelet. The common name is kubelet_pleg_discard_events
k8s.kubelet.pleg.last_seen_seconds.value Denotes the time in seconds since the last event was seen by the PLEG. The common name is kubelet_pleg_last_seen_seconds
k8s.kubelet.pod.resources_endpoint_errors_get_allocatable.value Captures errors on requests made for getting allocatable resources via the pod resources endpoint. The common name is kubelet_pod_resources_endpoint_errors_get_allocatable
k8s.kubelet.pod.resources_endpoint_errors_get.value Logs the number of errors encountered when the pod resources endpoint fails to return the requested data. The common name is kubelet_pod_resources_endpoint_errors_get
k8s.kubelet.pod.resources_endpoint_errors_list.value Captures errors during listing operations by the pod resources endpoint. The common name is kubelet_pod_resources_endpoint_errors_list
k8s.kubelet.pod.resources_endpoint_requests_get_allocatable.value Counts requests made for obtaining the allocatable resources via pod resources endpoint. The common name is kubelet_pod_resources_endpoint_requests_get_allocatable
k8s.kubelet.pod.resources_endpoint_requests_get.value Registers the total requests received at the pod resources endpoint when fetching specific resources. The common name is kubelet_pod_resources_endpoint_requests_get
k8s.kubelet.pod.resources_endpoint_requests_list.value Monitors the number of list requests received by the pod resources endpoint. The common name is kubelet_pod_resources_endpoint_requests_list
k8s.kubelet.pod.resources_endpoint_requests.value Represents the total number of requests handled by the pod resources endpoint of kubelet. The common name is kubelet_pod_resources_endpoint_requests_total
k8s.kubelet.preemptions.value Counts instances where the kubelet preempts lower-priority pods to free resources for higher-priority ones. The common name is kubelet_preemptions
k8s.kubelet.process.cpu_seconds.rate Measures the rate of CPU time consumed by the Kubelet process. This metric is essential for monitoring the CPU utilization of the Kubelet service. Previously known as ed_k8s_metric_process_cpu_seconds.rate
k8s.kubelet.process.cpu_seconds.value Measures the total CPU seconds consumed by the Kubelet process. This metric provides insight into the overall CPU time that the Kubelet service has utilized. Previously known as ed_k8s_metric_process_cpu_seconds.value and the common name is process_cpu_seconds_total
k8s.kubelet.process.resident_memory_bytes.value Measures the resident memory bytes used by the Kubelet process. This provides insights into the current memory usage footprint of the Kubelet service. Previously known as ed_k8s_metric_process_resident_memory_bytes.value and the common name is process_resident_memory_bytes
k8s.kubelet.rest_client.requests.rate Captures the rate of requests made by the Kubelet’s REST client. This helps in monitoring the request load handled by the Kubelet. Previously known as ed_k8s_metric_rest_client_requests.rate
k8s.kubelet.rest_client.requests.value Represents the total number of requests made by the Kubelet’s REST client. This metric is useful for monitoring request activity within the Kubelet. Previously known as ed_k8s_metric_rest_client_requests.value and the common name is rest_client_requests_total
k8s.kubelet.restarted_pods.value Captures the total pod restarts initiated by the kubelet due to failure recovery or updates. The common name is kubelet_restarted_pods_total
k8s.kubelet.running_containers.value Represents the number of running containers managed by the Kubelet. This metric helps monitor the status and count of active containers under Kubelet’s management. Previously known as ed_k8s_metric_kubelet_running_containers.value
k8s.kubelet.running_pods.value Represents the count of pods currently running under the management of the Kubelet. This is useful for monitoring the operational status of pods in the Kubernetes environment. Previously known as ed_k8s_metric_kubelet_running_pods.value
k8s.kubelet.runtime.operations_errors.rate Measures the rate of errors in runtime operations handled by the Kubelet. This metric is essential for detecting and monitoring operational errors within Kubelet’s runtime processes. Previously known as ed_k8s_metric_kubelet_runtime_operations_errors.rate
k8s.kubelet.runtime.operations_errors.value Measures the total number of errors encountered in runtime operations by the Kubelet. This provides insights into the reliability and error frequency in Kubelet’s operations. Previously known as ed_k8s_metric_kubelet_runtime_operations_errors.value
k8s.kubelet.runtime.operations.rate Captures the rate of runtime operations handled by the Kubelet. This metric provides insights into the operational throughput managed by the Kubelet. Previously known as ed_k8s_metric_kubelet_runtime_operations.rate
k8s.kubelet.runtime.operations.value Represents the total number of runtime operations performed by the Kubelet. This provides insights into the activity level of Kubelet’s operations. Previously known as ed_k8s_metric_kubelet_runtime_operations.value
k8s.kubelet.server_expiration_renew_errors.value Records errors related to renewals of server’s expiring certificates managed by kubelet. The common name is kubelet_server_expiration_renew_errors
k8s.kubelet.sleep_action_terminated_early.value Measures the occurrences where sleep actions scheduled by kubelet are ended prematurely. The common name is kubelet_sleep_action_terminated_early_total
k8s.kubelet.started_containers_errors.value Tracks errors encountered while starting containers managed by the kubelet. The common name is kubelet_started_containers_errors_total
k8s.kubelet.started_containers.value Counts the total number of containers successfully started by the kubelet. The common name is kubelet_started_containers_total
k8s.kubelet.started_host_process_containers_errors.value Logs the number of errors occurring in starting host process containers. The common name is kubelet_started_host_process_containers_errors_total
k8s.kubelet.started_host_process_containers.value Measures the total of host process containers started by the kubelet. The common name is kubelet_started_host_process_containers_total
k8s.kubelet.started_pods_errors.value Indicates error instances during the starting of pods by the kubelet. The common name is kubelet_started_pods_errors_total
k8s.kubelet.started_pods.value Captures the total count of pods that were successfully initiated by the kubelet. The common name is kubelet_started_pods_total
k8s.kubelet.topology_manager.admission_errors.value Logs the number of times resource admission requests failed due to topology constraints. The common name is kubelet_topology_manager_admission_errors_total
k8s.kubelet.topology_manager.admission_requests.value Captures the total requests made for resource admission judged by topology manager policies within the kubelet. The common name is kubelet_topology_manager_admission_requests_total
k8s.kubelet.volume_manager.total_volumes.value Indicates the total number of volumes managed by the Kubelet’s volume manager. This metric is important for monitoring storage utilization and volume management within the Kubernetes environment. Previously known as ed_k8s_metric_volume_manager_total_volumes.value and the common name is volume_manager_total_volumes
k8s.kubelet.working_pods.value Signifies the number of pods that are actively being processed by the kubelet on a node. The common name is kubelet_working_pods
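
Several kubelet metrics above appear both as a cumulative .value and as a .rate. This page does not state how Edge Delta derives the .rate variants, so the following is only a minimal sketch of one common way to turn two samples of a cumulative counter into a per-second rate. The Sample type, the per_second_rate helper, and the sample numbers are hypothetical and are not part of Edge Delta.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    """One observation of a cumulative counter (hypothetical shape)."""
    timestamp_s: float  # observation time in seconds
    value: float        # cumulative counter value at that time


def per_second_rate(earlier: Sample, later: Sample) -> float:
    """Average per-second increase between two counter samples.

    If the counter went backwards (for example after a process restart),
    the later value is treated as the increase since the reset.
    """
    elapsed = later.timestamp_s - earlier.timestamp_s
    if elapsed <= 0:
        raise ValueError("samples must be in increasing time order")
    delta = later.value - earlier.value
    if delta < 0:  # counter reset: count only what accumulated afterwards
        delta = later.value
    return delta / elapsed


# Two made-up samples of k8s.kubelet.runtime.operations_errors.value,
# taken 60 seconds apart.
rate = per_second_rate(Sample(0.0, 120.0), Sample(60.0, 150.0))
print(rate)  # 0.5 errors per second
```

The reset handling is a design assumption: it slightly undercounts across a restart but avoids reporting a negative rate.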

Node Exporter Metrics

Metric Name Description
k8s.node.cpu.seconds.rate Measures the rate of CPU seconds used by the node. This is important for tracking the CPU usage across the entire node in Kubernetes. Previously known as ed_k8s_metric_node_cpu_seconds.rate
k8s.node.cpu.seconds.value Measures the total CPU time consumed by the node, expressed in seconds. This metric is essential for understanding the overall CPU resource usage of the node. Previously known as ed_k8s_metric_node_cpu_seconds.value and the common name is node_cpu_seconds_total
k8s.node.disk.read_bytes.value Accounts for the total number of bytes read from disks on the node. The common name is node_disk_read_bytes_total
k8s.node.disk.reads_completed.value Records the total number of disk read operations completed on a node. The common name is node_disk_reads_completed_total
k8s.node.disk.writes_completed.value Captures the total number of disk write operations concluded on a node. The common name is node_disk_writes_completed_total
k8s.node.disk.written_bytes.value Shows the total bytes written to disks on the node. The common name is node_disk_written_bytes_total
k8s.node.filesystem.avail_bytes.value Tracks the available filesystem bytes on a node. This metric indicates how much storage space is left for use. Previously known as ed_k8s_metric_node_filesystem_avail_bytes.value and the common name is node_filesystem_avail_bytes
k8s.node.filesystem.free_bytes.value Expresses the amount of free space in bytes available on the node’s filesystem. The common name is node_filesystem_free_bytes
k8s.node.filesystem.size_bytes.value Measures the total size of the filesystem in bytes on a node. This provides insights into the total storage capacity available on the node. Previously known as ed_k8s_metric_node_filesystem_size_bytes.value and the common name is node_filesystem_size_bytes
k8s.node.load.15min.value Measures the 15-minute average load on a Kubernetes node. This metric provides insight into the node’s load and processing demand over a 15-minute interval, which can help identify periods of high computational demand or average system load in the context of Kubernetes clusters. Previously known as ed_k8s_metric_node_load15.value and the common name is node_load15
k8s.node.load.1min.value Provides the node’s one-minute load average, indicating average system activity or CPU demand. The common name is node_load1
k8s.node.load.5min.value Displays the five-minute load average on a node, reflecting a smoother view of CPU demand or system activity over time. The common name is node_load5
k8s.node.memory.available_bytes.value Tracks the amount of available memory on a Kubernetes node in bytes. This metric is vital for monitoring memory availability, ensuring that sufficient memory resources are allocated for applications running on the node. Previously known as ed_k8s_metric_node_memory_mem_available_bytes.value and the common name is node_memory_MemAvailable_bytes
k8s.node.memory.buffers_bytes.value Reports the memory used by kernel buffers on the node. The common name is node_memory_Buffers_bytes
k8s.node.memory.cached_bytes.value Reports the memory used for caching filesystem metadata and file data on the node. The common name is node_memory_Cached_bytes
k8s.node.memory.free_bytes.value Reports the amount of memory in bytes that is free and available for use on a node. The common name is node_memory_MemFree_bytes
k8s.node.memory.total_bytes.value Measures the total memory capacity of a Kubernetes node in bytes. This metric is crucial for understanding the physical memory resources available on the node, helping to ensure proper memory management and allocation across the cluster. Previously known as ed_k8s_metric_node_memory_mem_total_bytes.value and the common name is node_memory_MemTotal_bytes
k8s.node.network.receive_bytes.rate Measures the rate at which bytes are received on a network interface of a Kubernetes node. This metric is essential for monitoring network traffic and understanding network bandwidth utilization to ensure the node’s network capacity is not exceeded. Previously known as ed_k8s_metric_node_network_receive_bytes.rate
k8s.node.network.receive_bytes.value Represents the total bytes received on a network interface of a Kubernetes node. This metric is important for evaluating the overall network traffic and bandwidth usage on the node, which helps in managing network performance and capacity planning. Previously known as ed_k8s_metric_node_network_receive_bytes.value and the common name is node_network_receive_bytes_total
k8s.node.network.receive_drop.value Logs the total dropped incoming network packets on all interfaces of a node. The common name is node_network_receive_drop_total
k8s.node.network.receive_packets.value Captures the total number of incoming network packets received on all interfaces of a node. The common name is node_network_receive_packets_total
k8s.node.network.transmit_bytes.rate Measures the rate at which bytes are transmitted on a network interface of a Kubernetes node. This metric is crucial for understanding outbound network traffic and helps in monitoring network performance to ensure sufficient network capacity. Previously known as ed_k8s_metric_node_network_transmit_bytes.rate
k8s.node.network.transmit_bytes.value Represents the total bytes transmitted from a network interface of a Kubernetes node. This metric is important for assessing the total outbound network traffic from the node, aiding in the management of network throughput and bandwidth usage. Previously known as ed_k8s_metric_node_network_transmit_bytes.value and the common name is node_network_transmit_bytes_total
k8s.node.network.transmit_drop.value Logs the total number of network packets that were dropped during transmission on the node’s interfaces. The common name is node_network_transmit_drop_total
k8s.node.network.transmit_packets.value Records the total outgoing network packets transmitted by all interfaces on a node. The common name is node_network_transmit_packets_total
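
The total and available metrics in this table are often combined into a utilization figure. As a minimal sketch, assuming the values have already been read from the metrics listed above, the fraction in use can be computed as one minus available over total. The used_fraction helper and the sample values are hypothetical; they only illustrate the arithmetic.

```python
def used_fraction(total_bytes: float, available_bytes: float) -> float:
    """Fraction of a resource in use, given its total and available bytes."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    if not 0 <= available_bytes <= total_bytes:
        raise ValueError("available_bytes must be between 0 and total_bytes")
    return 1.0 - available_bytes / total_bytes


# Made-up readings keyed by the metric names from the table above.
sample = {
    "k8s.node.memory.total_bytes.value": 16 * 1024**3,      # 16 GiB
    "k8s.node.memory.available_bytes.value": 4 * 1024**3,   # 4 GiB
    "k8s.node.filesystem.size_bytes.value": 100e9,
    "k8s.node.filesystem.avail_bytes.value": 25e9,
}

memory_used = used_fraction(
    sample["k8s.node.memory.total_bytes.value"],
    sample["k8s.node.memory.available_bytes.value"],
)
filesystem_used = used_fraction(
    sample["k8s.node.filesystem.size_bytes.value"],
    sample["k8s.node.filesystem.avail_bytes.value"],
)
print(memory_used, filesystem_used)  # 0.75 0.75
```

The sketch uses available_bytes rather than free_bytes for memory because free memory excludes reclaimable buffers and cache, so it tends to overstate utilization.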

Traffic Metrics

Metric Name Description
k8s.traffic.communication.count Provides a count of network communications or transactions occurring within a Kubernetes environment. This metric helps in tracking the volume of communication traffic between Kubernetes components. Previously known as ed_k8s_traffic_communication.count
k8s.traffic.communication.latency.avg Measures the average latency of communication events occurring within a Kubernetes environment. This metric is crucial for analyzing the performance and responsiveness of inter-component communications. Previously known as ed_k8s_traffic_latency.avg
k8s.traffic.communication.latency.p95 Measures the 95th percentile communication latency within a Kubernetes environment. This metric is useful for analyzing tail-end latency and identifying potential bottlenecks or delays affecting a small percentage of communication interactions. Previously known as ed_k8s_traffic_latency.p95
k8s.traffic.communication.read_bytes.sum Measures the total bytes read or inbound traffic within a Kubernetes environment. This metric is important for assessing the volume of incoming network data and understanding network usage patterns. Previously known as ed_k8s_traffic_in.sum
k8s.traffic.communication.write_bytes.sum Measures the total bytes written or outbound traffic within a Kubernetes environment. This metric is important for assessing the volume of outgoing network data and understanding network usage patterns. Previously known as ed_k8s_traffic_out.sum
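
To show why the table lists both latency.avg and latency.p95, the following sketch compares an average with a 95th percentile on a small made-up set of latencies. It uses the nearest-rank percentile definition, which is an assumption; this page does not say which percentile method Edge Delta applies. The nearest_rank_percentile helper and the sample data are hypothetical.

```python
import math


def nearest_rank_percentile(values: list[float], pct: float) -> float:
    """Return the pct-th percentile of values using the nearest-rank method."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in the range (0, 100]")
    ordered = sorted(values)
    # Nearest rank: the smallest rank covering at least pct percent of samples.
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


# Made-up latencies in milliseconds: 18 fast calls and 2 slow ones.
latencies_ms = [10] * 18 + [500] * 2

average = sum(latencies_ms) / len(latencies_ms)
p95 = nearest_rank_percentile(latencies_ms, 95)
print(average, p95)  # 59.0 500
```

In this sample the average sits far below the slow calls, while the 95th percentile lands on them, which is why the two metrics are usually read together.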