Prerequisites
The following exporters must be configured and running:

| Component | Purpose | Required |
|---|---|---|
| kube-state-metrics | Kubernetes resource state | ✅ Required |
| Prometheus Node Exporter | Node-level metrics | ✅ Required |
| cAdvisor Exporter | Container-level metrics | ✅ Required |
| OpenTelemetry Collector | Service discovery and tracing | ⭕ Optional |
| CloudWatch Exporter | AWS RDS discovery | ⭕ Optional |
Required Metrics
Service Discovery Metrics
| Metric Name | Labels | Purpose |
|---|---|---|
| target_info | job, service_name, service_version, k8s_deployment_name, k8s_namespace_name, k8s_pod_name, k8s_node_name | Service discovery and metadata collection |
Additional useful labels for target_info:
host_arch, container_id, process_runtime_name, process_runtime_version, os_type, os_version, telemetry_sdk_name, telemetry_sdk_language
Service Dependencies
| Metric Name | Labels | Purpose |
|---|---|---|
| service_dependencies | client, server, cluster | Inter-service communication tracking |
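
To illustrate how `service_dependencies` series can feed relationship discovery, the sketch below turns label sets into directed client-to-server edges grouped by cluster. The function name `build_dependency_edges` and the input shape (one dict of labels per series) are assumptions made for this example; they are not part of any exporter's API.

```python
from collections import defaultdict


def build_dependency_edges(series):
    """Group (client, server) edges by cluster from service_dependencies labels.

    `series` is a list of dicts, one per time series, holding the `client`,
    `server`, and `cluster` labels. Series missing any of them are skipped.
    """
    edges = defaultdict(set)
    for labels in series:
        client = labels.get("client")
        server = labels.get("server")
        cluster = labels.get("cluster")
        if client and server and cluster:
            edges[cluster].add((client, server))
    # Sort each edge set so the output order is stable.
    return {cluster: sorted(pairs) for cluster, pairs in edges.items()}


sample = [
    {"client": "checkout", "server": "payments", "cluster": "prod"},
    {"client": "checkout", "server": "inventory", "cluster": "prod"},
    {"client": "checkout", "server": "payments", "cluster": "prod"},  # duplicate
    {"client": "frontend", "cluster": "prod"},  # no server label, skipped
]
print(build_dependency_edges(sample))
```

Duplicate series collapse into one edge, and a series without a `server` label contributes nothing, which is one reason consistent labeling matters for the graph.
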
Observability Metrics for Investigation & Analysis
Important Note: The metrics listed above are specifically for entity graph building and relationship discovery. For comprehensive root cause analysis and performance investigation, you'll also need standard observability metrics.
Required Standard Metrics Categories
| Category | Purpose | Documentation |
|---|---|---|
| Container Metrics | Pod/container performance analysis | cAdvisor Metrics |
| Node Metrics | Node-level performance analysis | Node Exporter Metrics |
| Service Metrics | Application performance analysis | OpenTelemetry Metrics, Prometheus Best Practices |
| Database Metrics | External service performance | CloudWatch RDS Metrics |
These standard metrics support:
- Performance investigation: CPU, memory, disk, and network utilization
- Root cause analysis: error rates, latencies, and saturation metrics
- Capacity planning: resource usage trends and limits
- Service health monitoring: request rates, error rates, and response times
Kubernetes Infrastructure Metrics
Node Metrics
| Metric Name | Labels | Purpose |
|---|---|---|
| kube_node_info | node, cluster, container_runtime_version, provider_id, os_image, kubelet_version, kernel_version | Node discovery and metadata |
Pod Metrics
| Metric Name | Labels | Purpose |
|---|---|---|
| kube_pod_info | pod, namespace, node, cluster, created_by_kind, created_by_name | Pod discovery and ownership |
| kube_pod_container_info | pod, namespace, container, image, image_spec, cluster | Container information |
| kube_pod_labels | pod, namespace, cluster, label_* | Pod label selectors for service relationships |
| kube_pod_annotations | pod, namespace, cluster, annotation_* | Pod metadata and relationships |
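
As a rough sketch of how the ownership labels on `kube_pod_info` can be read, the function below extracts owner-to-pod edges from a list of label sets. The name `pod_owner_edges` and the input shape are hypothetical, and treating `<none>` as "no owner" is an assumption about how unowned pods are reported.

```python
def pod_owner_edges(pod_info_series):
    """Return (owner_kind, owner_name, namespace, pod) tuples from kube_pod_info labels."""
    edges = []
    for labels in pod_info_series:
        kind = labels.get("created_by_kind", "")
        name = labels.get("created_by_name", "")
        # Assumption: empty values and the "<none>" placeholder mean "no owner".
        if kind in ("", "<none>") or name in ("", "<none>"):
            continue
        edges.append((kind, name, labels.get("namespace", ""), labels.get("pod", "")))
    return edges


pods = [
    {"pod": "api-7c9d-abcde", "namespace": "shop", "cluster": "prod",
     "created_by_kind": "ReplicaSet", "created_by_name": "api-7c9d"},
    {"pod": "debug", "namespace": "shop", "cluster": "prod",
     "created_by_kind": "<none>", "created_by_name": "<none>"},
]
print(pod_owner_edges(pods))
```

A pod owned by a ReplicaSet points at that ReplicaSet rather than at its Deployment, so resolving the Deployment takes a second hop through ReplicaSet ownership data, which this sketch does not attempt.
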
Deployment and Scaling Metrics
| Metric Name | Labels | Purpose |
|---|---|---|
| kube_deployment_spec_replicas | deployment, namespace, cluster | Deployment scaling information |
| kube_deployment_status_replicas_available | deployment, namespace, cluster | Deployment status |
| kube_replicaset_spec_replicas | replicaset, namespace, cluster | ReplicaSet scaling |
| kube_replicaset_status_replicas_ready | replicaset, namespace, cluster | ReplicaSet status |
| kube_statefulset_spec_replicas | statefulset, namespace, cluster | StatefulSet scaling |
| kube_statefulset_status_replicas_ready | statefulset, namespace, cluster | StatefulSet status |
Container Resources
| Metric Name | Labels | Purpose |
|---|---|---|
| kube_pod_container_resource_limits | pod, namespace, container, resource, unit, cluster | Resource allocation tracking |
| kube_pod_container_resource_requests | pod, namespace, container, resource, unit, cluster | Resource allocation tracking |
Kubernetes Resource State Metrics
ConfigMaps
| Metric Name | Labels | Purpose |
|---|---|---|
| kube_configmap_info | configmap, namespace, cluster, resource_version, uid | Configuration tracking |
| kube_configmap_labels | configmap, namespace, cluster, label_* | ConfigMap relationship detection |
| kube_configmap_annotations | configmap, namespace, cluster, annotation_* | ConfigMap metadata relationships |
| kube_configmap_metadata_resource_version | configmap, namespace, cluster | Detects changes in ConfigMap metadata |
Secrets
| Metric Name | Labels | Purpose |
|---|---|---|
| kube_secret_info | secret, namespace, cluster, type, resource_version, uid | Secret tracking |
| kube_secret_labels | secret, namespace, cluster, label_* | Secret relationship detection |
| kube_secret_annotations | secret, namespace, cluster, annotation_* | Secret metadata relationships |
| kube_secret_metadata_resource_version | secret, namespace, cluster | Detects changes in Secret metadata |
Persistent Storage
| Metric Name | Labels | Purpose |
|---|---|---|
| kube_persistentvolumeclaim_info | persistentvolumeclaim, namespace, cluster, volumename, storageclass, uid | Storage tracking |
| kube_persistentvolume_info | persistentvolume, cluster, storageclass, uid, host_path | Volume tracking |
| kube_pod_spec_volumes_persistentvolumeclaim_info | pod, namespace, volume, persistentvolumeclaim, cluster | Pod-storage relationships |
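
The pod-storage relationship can be pictured as a two-step join: pod to claim via `kube_pod_spec_volumes_persistentvolumeclaim_info`, then claim to volume via the `volumename` label on `kube_persistentvolumeclaim_info`. The sketch below assumes each series is available as a dict of labels; the function name `pod_to_volumes` is hypothetical.

```python
def pod_to_volumes(pod_pvc_series, pvc_info_series):
    """Map (cluster, namespace, pod) to the PersistentVolume names it uses."""
    # (cluster, namespace, claim) -> bound PersistentVolume name
    claim_to_volume = {
        (s.get("cluster", ""), s["namespace"], s["persistentvolumeclaim"]): s.get("volumename", "")
        for s in pvc_info_series
    }
    result = {}
    for s in pod_pvc_series:
        cluster = s.get("cluster", "")
        volume = claim_to_volume.get((cluster, s["namespace"], s["persistentvolumeclaim"]))
        if volume:  # skip claims that are unknown or not yet bound
            result.setdefault((cluster, s["namespace"], s["pod"]), []).append(volume)
    return result


pod_pvc = [
    {"cluster": "prod", "namespace": "shop", "pod": "db-0",
     "volume": "data", "persistentvolumeclaim": "data-db-0"},
    {"cluster": "prod", "namespace": "shop", "pod": "cache-0",
     "volume": "tmp", "persistentvolumeclaim": "missing-claim"},
]
pvc_info = [
    {"cluster": "prod", "namespace": "shop", "persistentvolumeclaim": "data-db-0",
     "volumename": "pv-123", "storageclass": "gp3"},
]
print(pod_to_volumes(pod_pvc, pvc_info))
```

Including `cluster` in the join key keeps claims with the same name in different clusters from being merged.
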
Services
| Metric Name | Labels | Purpose |
|---|---|---|
| kube_service_info | service, namespace, cluster, type, cluster_ip, uid | Service networking |
| kube_service_labels | service, namespace, cluster, label_* | Service label selectors for pod targeting |
| kube_service_annotations | service, namespace, cluster, annotation_* | Service routing and ingress relationships |
Ingress
| Metric Name | Labels | Purpose |
|---|---|---|
| kube_ingress_info | ingress, namespace, cluster, ingressclass, uid | Traffic routing |
| kube_ingress_path | ingress, namespace, cluster, service_name, service_port, host, path | Routing relationships |
| kube_ingress_labels | ingress, namespace, cluster, label_* | Ingress routing and controller relationships |
| kube_ingress_annotations | ingress, namespace, cluster, annotation_* | Ingress routing rules and TLS configuration |
| kube_ingress_metadata_resource_version | ingress, namespace, cluster | Detects changes in Ingress configuration |
External Dependencies (Optional)
AWS RDS Discovery
| Metric Name | Labels | Purpose |
|---|---|---|
| aws_rds_database_connections_average | DBInstanceIdentifier, region | Database discovery |
Label Requirements
Critical Labels
These labels must be present and consistent across metrics:

| Label | Description | Used In |
|---|---|---|
| cluster | Kubernetes cluster identifier | All kube-state metrics |
| namespace | Kubernetes namespace | Pod, Service, ConfigMap, Secret metrics |
| job | Service job identifier | Service discovery |
Configurable Label Mappings
If your metrics use different label names, configure the label mapping:

| Standard Label | Default Mapping | Alternative Names |
|---|---|---|
| job | job | service_name, application |
| namespace | namespace | k8s_namespace_name |
| pod | pod | k8s_pod_name |
| node | node | k8s_node_name |
| container | container | container_name |
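
The effect of a label mapping can be sketched as a small normalization step. The example below is only an illustration of the idea, not the product's actual configuration mechanism: `ALTERNATIVES` and `normalize_labels` are names invented here, and the mapping values are copied from the table above.

```python
# Standard label -> alternative names, copied from the mapping table.
ALTERNATIVES = {
    "job": ["service_name", "application"],
    "namespace": ["k8s_namespace_name"],
    "pod": ["k8s_pod_name"],
    "node": ["k8s_node_name"],
    "container": ["container_name"],
}


def normalize_labels(labels, alternatives=ALTERNATIVES):
    """Return a copy of `labels` with each missing standard label filled in.

    A standard label that is already present is left untouched; otherwise the
    first alternative found supplies its value. Original labels are kept.
    """
    normalized = dict(labels)
    for standard, alts in alternatives.items():
        if standard in normalized:
            continue
        for alt in alts:
            if alt in normalized:
                normalized[standard] = normalized[alt]
                break
    return normalized


print(normalize_labels({"k8s_pod_name": "api-1", "k8s_namespace_name": "shop"}))
```

The sketch keeps the original labels rather than renaming them, so a series that carries both `job` and `service_name` retains both values.
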
Validation
Health Check Queries
Use these Prometheus queries to verify metric availability:

| Check | Query | Expected Result |
|---|---|---|
| Service Discovery | count by (job) (target_info) | > 0 services per job |
| Kubernetes Nodes | count by (cluster) (kube_node_info) | > 0 nodes per cluster |
| Kubernetes Pods | count by (cluster) (kube_pod_info) | > 0 pods per cluster |
| ConfigMaps | count by (namespace) (kube_configmap_info) | ≥ 0 per namespace |
| Services | count by (namespace) (kube_service_info) | ≥ 0 per namespace |
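
The checks above can also be run programmatically against the Prometheus HTTP API instant-query endpoint (`/api/v1/query`). The following is a minimal sketch using only the Python standard library; `HEALTH_CHECKS`, `has_samples`, and `run_checks` are names chosen for this example, and the base URL passed to `run_checks` is whatever address your Prometheus server is reachable at.

```python
import json
import urllib.parse
import urllib.request

# Check name -> PromQL query, copied from the table above.
HEALTH_CHECKS = {
    "Service Discovery": "count by (job) (target_info)",
    "Kubernetes Nodes": "count by (cluster) (kube_node_info)",
    "Kubernetes Pods": "count by (cluster) (kube_pod_info)",
    "ConfigMaps": "count by (namespace) (kube_configmap_info)",
    "Services": "count by (namespace) (kube_service_info)",
}


def has_samples(response):
    """True if an instant-query response succeeded and has a series with value > 0."""
    if response.get("status") != "success":
        return False
    results = response.get("data", {}).get("result", [])
    # Each vector sample is {"metric": {...}, "value": [timestamp, "number"]}.
    return any(float(sample["value"][1]) > 0 for sample in results)


def run_checks(base_url):
    """Run every health check against `base_url` and return {check name: bool}."""
    outcome = {}
    for name, query in HEALTH_CHECKS.items():
        url = f"{base_url}/api/v1/query?" + urllib.parse.urlencode({"query": query})
        with urllib.request.urlopen(url, timeout=10) as resp:
            outcome[name] = has_samples(json.load(resp))
    return outcome
```

For example, `run_checks("http://localhost:9090")` would query a server on the default local port, if one is running there. A `False` result for ConfigMaps or Services is not necessarily a failure, since the table allows zero of either per namespace.
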
Missing Metrics Troubleshooting
| Symptom | Likely Cause | Solution |
|---|---|---|
| No services in graph | Missing target_info | Configure OpenTelemetry collector |
| Missing Kubernetes resources | kube-state-metrics not running | Deploy and configure kube-state-metrics |
| Incomplete relationships | Missing label/annotation metrics | Enable label and annotation collection |
| Poor relationship detection | Inconsistent labeling | Standardize resource labels |
Integration Requirements
Minimum Versions
| Component | Minimum Version | Notes |
|---|---|---|
| kube-state-metrics | v2.0+ | For complete label support |
| OpenTelemetry Collector | v0.60+ | For target_info compatibility |
| Prometheus | v2.30+ | For advanced query features |
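
When checking installed versions against these minimums, compare them numerically rather than as strings, since a string comparison would rank 2.9 above 2.30. The sketch below shows one way to do that; `MINIMUM_VERSIONS`, `parse_version`, and `meets_minimum` are illustrative names, and how you obtain each component's version string is left to your environment.

```python
import re

# Component -> minimum (major, minor), copied from the table above.
MINIMUM_VERSIONS = {
    "kube-state-metrics": (2, 0),
    "OpenTelemetry Collector": (0, 60),
    "Prometheus": (2, 30),
}


def parse_version(text):
    """Extract (major, minor) from strings such as 'v2.10.0' or '0.60.0'."""
    match = re.match(r"v?(\d+)\.(\d+)", text.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    return int(match.group(1)), int(match.group(2))


def meets_minimum(component, version):
    """True if `version` is at or above the minimum listed for `component`."""
    return parse_version(version) >= MINIMUM_VERSIONS[component]


print(meets_minimum("Prometheus", "v2.45.0"))  # at or above v2.30
print(meets_minimum("Prometheus", "2.9.1"))    # below v2.30
```
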
RBAC Requirements
Ensure kube-state-metrics has read access to:
- Nodes, Pods, Deployments, ReplicaSets, StatefulSets
- ConfigMaps, Secrets, Services, Ingresses
- PersistentVolumes, PersistentVolumeClaims
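
For reference, the read access listed above corresponds to `list` and `watch` permissions on resources in three API groups. The sketch below builds a matching ClusterRole manifest as JSON, which `kubectl apply -f` accepts alongside YAML. The role name `kube-state-metrics-read` is hypothetical, a ClusterRoleBinding to the kube-state-metrics service account is still required, and the official kube-state-metrics manifests and Helm chart already ship an equivalent role, so treat this only as a way to see what is being granted.

```python
import json

# API group -> resources, covering the resource kinds listed above.
RESOURCES_BY_API_GROUP = {
    "": [  # core API group
        "nodes", "pods", "configmaps", "secrets", "services",
        "persistentvolumes", "persistentvolumeclaims",
    ],
    "apps": ["deployments", "replicasets", "statefulsets"],
    "networking.k8s.io": ["ingresses"],
}


def build_cluster_role(name="kube-state-metrics-read"):
    """Build a read-only ClusterRole manifest as a plain dict."""
    return {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "ClusterRole",
        "metadata": {"name": name},
        "rules": [
            {"apiGroups": [group], "resources": resources, "verbs": ["list", "watch"]}
            for group, resources in RESOURCES_BY_API_GROUP.items()
        ],
    }


print(json.dumps(build_cluster_role(), indent=2))
```
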