# Reduce Metric Cardinality
## Overview
Metric cardinality (the number of unique time series) directly impacts costs, performance, and system health. This guide covers strategies for reducing cardinality at the edge, before metrics reach expensive downstream destinations.
For foundational concepts, see Metric Cardinality.
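As a quick intuition for the numbers involved: cardinality is the count of distinct attribute-value combinations a metric emits. A minimal Python sketch, using hypothetical datapoint dicts rather than any real SDK type:

```python
# Hypothetical sketch: cardinality is the number of distinct
# attribute-value combinations observed for a metric.
def cardinality(datapoints):
    """Count unique timeseries among datapoint attribute dicts."""
    return len({tuple(sorted(dp.items())) for dp in datapoints})

points = [
    {"service.name": "api", "pod_id": "pod-1"},
    {"service.name": "api", "pod_id": "pod-2"},
    {"service.name": "api", "pod_id": "pod-1"},  # same series as the first
]
print(cardinality(points))  # 2
```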
## Strategy 1: Drop high-cardinality attributes
The most effective way to reduce cardinality is to remove the attributes that generate excessive unique values.
### Common attributes to drop
| Attribute | Why drop | Alternative |
|---|---|---|
| `pod_id` | Unique per pod instance | Aggregate to `service.name` |
| `container_id` | Unique per container | Aggregate to pod or service |
| `request_id` | Unique per request | Use traces for request-level detail |
| `user_id` | Unique per user | Hash to buckets or remove |
| `session_id` | Unique per session | Remove from metrics |
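One alternative from the table, hashing `user_id` into a bounded set of buckets, can be sketched in Python. `N_BUCKETS` and the helper name are illustrative, not part of any processor:

```python
import hashlib

# Hypothetical sketch: replace a high-cardinality user_id with one of
# N stable hash buckets, keeping coarse per-cohort visibility.
N_BUCKETS = 16

def bucket_user_id(user_id: str) -> str:
    # sha256 gives a stable, evenly distributed digest per user
    digest = hashlib.sha256(user_id.encode()).hexdigest()
    return f"bucket-{int(digest, 16) % N_BUCKETS}"

attrs = {"user_id": "u-482913", "service.name": "checkout"}
attrs["user_bucket"] = bucket_user_id(attrs.pop("user_id"))
```

The same user always lands in the same bucket, so per-cohort trends survive while cardinality is capped at `N_BUCKETS`.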
### Using the Custom processor
The Custom processor executes OTTL statements to drop specific attributes from metrics:
```yaml
- name: drop_high_cardinality
  type: custom
  data_types:
    - metric
  statements:
    - delete_key(attributes, "pod_id")
    - delete_key(attributes, "container_id")
    - delete_key(attributes, "request_id")
```
### Using the Delete Field processor
For simpler cases, the Delete Field processor removes a single field:
```yaml
- name: delete_pod_id
  type: delete_field
  data_types:
    - metric
  field_path: attributes["pod_id"]
```
## Strategy 2: Normalize dynamic values
Dynamic values like URL paths create unbounded cardinality. Normalize them to bounded sets.
### URL path normalization
Convert dynamic path segments to placeholders:
| Before | After |
|---|---|
| `/users/12345` | `/users/{id}` |
| `/orders/abc-def-ghi` | `/orders/{id}` |
| `/products/SKU-99999` | `/products/{sku}` |
The Custom processor's `replace_pattern` statements normalize these paths. Apply the most specific patterns first: a numeric-ID rule that runs before the UUID rule would clip the leading digits of a UUID segment that starts with a digit:

```yaml
- name: normalize_urls
  type: custom
  data_types:
    - metric
  statements:
    # Replace SKUs
    - replace_pattern(attributes["url.path"], "/SKU-[A-Z0-9]+", "/{sku}")
    # Replace UUIDs (before the generic numeric rule)
    - replace_pattern(attributes["url.path"], "/[a-f0-9-]{36}", "/{uuid}")
    # Replace numeric IDs
    - replace_pattern(attributes["url.path"], "/[0-9]+", "/{id}")
```
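For intuition, the same rewrites can be sketched in plain Python with `re.sub`; the rule list and `normalize` helper are illustrative, and the sketch likewise applies the most specific patterns first so the numeric rule cannot clip a digit-leading UUID segment:

```python
import re

# Hypothetical sketch of the normalization rules: each
# (pattern, placeholder) pair rewrites one class of dynamic segment.
RULES = [
    (re.compile(r"/SKU-[A-Z0-9]+"), "/{sku}"),
    (re.compile(r"/[a-f0-9-]{36}"), "/{uuid}"),
    (re.compile(r"/[0-9]+"), "/{id}"),
]

def normalize(path: str) -> str:
    for pattern, placeholder in RULES:
        path = pattern.sub(placeholder, path)
    return path

print(normalize("/users/12345"))  # /users/{id}
```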
### Status code grouping
Reduce granularity by grouping similar values:
| Before | After |
|---|---|
| 200, 201, 204 | 2xx |
| 400, 401, 403, 404 | 4xx |
| 500, 502, 503 | 5xx |
The Custom processor groups status codes using `replace_pattern`:

```yaml
- name: group_status_codes
  type: custom
  data_types:
    - metric
  statements:
    - replace_pattern(attributes["http.status_code"], "^2[0-9]{2}$", "2xx")
    - replace_pattern(attributes["http.status_code"], "^3[0-9]{2}$", "3xx")
    - replace_pattern(attributes["http.status_code"], "^4[0-9]{2}$", "4xx")
    - replace_pattern(attributes["http.status_code"], "^5[0-9]{2}$", "5xx")
```
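The bucketing rule reduces to a single anchored substitution; `status_class` below is an illustrative Python helper, not a processor function:

```python
import re

# Hypothetical sketch: map any three-digit HTTP status to its class
# bucket ("2xx", "4xx", ...); non-matching values pass through unchanged.
def status_class(code: str) -> str:
    return re.sub(r"^([1-5])[0-9]{2}$", r"\1xx", code)

print(status_class("404"))  # 4xx
```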
## Strategy 3: Use aggregation processors
Aggregation naturally reduces cardinality by grouping metrics. The Aggregate Metric processor and Rollup Metric processor provide different levels of reduction.
### Aggregate Metric processor
The Aggregate Metric processor uses `group_by` to specify which attributes to preserve; all others are dropped:

```yaml
- name: aggregate_metrics
  type: aggregate_metric
  data_types:
    - metric
  aggregation_type: sum
  interval: 60s
  group_by:
    - service.name
    - http.method
    - http.status_code
  # Drop every attribute that is not a group_by key
  keep_only_group_by_keys: true
```
**Before aggregation:** metrics carry `pod_id`, `container_id`, and `request_id`, plus the `group_by` keys.

**After aggregation:** only `service.name`, `http.method`, and `http.status_code` remain.
### Rollup Metric processor
For maximum reduction, the Rollup Metric processor creates a single aggregated value without any grouping:
```yaml
- name: rollup_metrics
  type: rollup_metric
  data_types:
    - metric
  aggregation_type: sum
  interval: 60s
```
This produces one value per metric name per interval: the lowest possible cardinality.
## Strategy 4: Filter metrics by name
Some metrics are not worth the cardinality cost. The Filter processor drops them entirely:
```yaml
- name: filter_noisy_metrics
  type: filter
  data_types:
    - metric
  condition: 'not (name matches "debug\\..*" or name matches "internal\\..*")'
```
This keeps only metrics whose names do not start with `debug.` or `internal.`.
## Strategy 5: Conditional reduction by environment
Apply aggressive reduction in development and staging while preserving detail in production. Use the Route processor to direct metrics to different aggregation paths based on environment.
**Route processor:** separate metrics by environment using path conditions:

```yaml
paths:
  - path: production
    condition: resource["deployment.environment"] == "production"
  - path: staging
    condition: resource["deployment.environment"] == "staging"
# Unmatched items (dev) go to the default "unmatched" path
```
**Production Aggregate Metric:** high resolution with 30-second intervals; preserves detailed dimensions for troubleshooting:

```yaml
aggregation_type: sum
interval: 30s
group_by: [service.name, http.method, http.status_code, http.route]
keep_only_group_by_keys: true
```
**Staging Aggregate Metric:** moderate resolution with 60-second intervals; keeps essential dimensions for validation:

```yaml
aggregation_type: sum
interval: 60s
group_by: [service.name, http.method]
keep_only_group_by_keys: true
```
**Dev Aggregate Metric:** aggressive reduction with 5-minute intervals; minimizes costs while maintaining basic visibility:

```yaml
aggregation_type: sum
interval: 300s
group_by: [service.name]
keep_only_group_by_keys: true
```
## Example: Complete cardinality reduction pipeline
Combine strategies for comprehensive cardinality control.
**Drop high-cardinality attributes:** use the Custom processor with OTTL statements to remove attributes that generate excessive unique values:

```yaml
statements:
  - delete_key(attributes, "pod_id")
  - delete_key(attributes, "container_id")
  - delete_key(attributes, "request_id")
  - delete_key(attributes, "trace_id")
```
**Normalize dynamic values:** the Custom processor uses `replace_pattern` to convert dynamic URL segments to placeholders (UUIDs before numeric IDs, so the numeric rule cannot clip a digit-leading UUID):

```yaml
statements:
  - replace_pattern(attributes["url.path"], "/[a-f0-9-]{36}", "/{uuid}")
  - replace_pattern(attributes["url.path"], "/[0-9]+", "/{id}")
```
**Aggregate metrics:** use the Aggregate Metric processor with `keep_only_group_by_keys: true` to preserve only the dimensions you need:

```yaml
aggregation_type: sum
interval: 60s
group_by: [service.name, http.method, http.status_code, url.path]
keep_only_group_by_keys: true
```
## Measuring reduction effectiveness
Track cardinality before and after your pipeline to measure effectiveness:
- Count distinct fingerprints at pipeline input
- Count distinct fingerprints at pipeline output
- Calculate the reduction percentage: `(before - after) / before × 100`
Use the Pipelines Dashboard to monitor input and output rates.
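The percentage calculation is trivial, but a small hypothetical helper makes the guard for an empty input explicit:

```python
# Hypothetical sketch: compute the reduction percentage from distinct
# series fingerprints counted at pipeline input and output.
def reduction_pct(before: int, after: int) -> float:
    """(before - after) / before * 100, guarding against empty input."""
    if before == 0:
        return 0.0
    return (before - after) / before * 100

print(reduction_pct(50_000, 5_000))  # 90.0
```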
## Best practices
When reducing metric cardinality:
- Start with the highest-cardinality attributes first
- Test reduction in non-production environments
- Preserve attributes you need for alerting and dashboards
- Document which attributes are dropped and why
- Monitor for unexpected cardinality growth from new attributes
## See also
- Metric Cardinality - Understand cardinality concepts
- Aggregate Metric Processor - Group and summarize metrics
- Rollup Metric Processor - Create single aggregated values
- Custom Processor - OTTL transformations for metrics
- Delete Field Processor - Remove specific fields
- Filter Processor - Drop items by condition
- Data Reduction - General data reduction strategies