Ingest AWS RDS Metrics from CloudWatch

Configure Edge Delta to ingest AWS RDS metrics from CloudWatch via S3 for cost-effective monitoring and correlation.

Overview

Amazon Relational Database Service (RDS) is a managed AWS service that simplifies deployment and scaling of relational databases like PostgreSQL and MariaDB. RDS automatically sends database metrics to AWS CloudWatch, including CPU utilization, replication status, and read/write IOPS.

Edge Delta’s Telemetry Pipelines enable teams to:

  • Extract RDS metrics from CloudWatch via S3
  • Standardize metrics using OpenTelemetry formats
  • Correlate database metrics with external telemetry data pre-index
  • Route metrics to cost-effective downstream destinations

Architecture

The ingestion flow consists of:

  1. CloudWatch Metric Streams send RDS metrics to Kinesis Data Firehose
  2. Kinesis Data Firehose batches and delivers metrics to an S3 bucket
  3. S3 Event Notifications send a message to an SQS queue when new objects arrive
  4. The Edge Delta pipeline's S3 source polls SQS and ingests metrics from S3
  5. Telemetry Pipeline processes and routes metrics to destinations
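
For context on what lands in S3: with JSON output format, each metric record is a small JSON object in the documented CloudWatch Metric Streams format. A representative RDS record (values illustrative):

{
    "metric_stream_name": "rds-metric-stream",
    "account_id": "123456789012",
    "region": "us-west-2",
    "namespace": "AWS/RDS",
    "metric_name": "CPUUtilization",
    "dimensions": {
        "DBInstanceIdentifier": "prod-mysql-01"
    },
    "timestamp": 1700000000000,
    "value": {
        "max": 42.5,
        "min": 12.1,
        "sum": 81.3,
        "count": 3.0
    },
    "unit": "Percent"
}

The value object carries min, max, sum, and count statistics for the export interval; the processors later in this guide read value.sum and the DBInstanceIdentifier dimension.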

Prerequisites

  • AWS account with RDS instances
  • IAM permissions to create:
    • CloudWatch Metric Streams
    • S3 buckets and event notifications
    • SQS queues
  • Edge Delta account with cloud pipeline access

Configure AWS Components

Create an SQS Queue

  1. Open the Amazon SQS console
  2. Create a Standard queue with a descriptive name (e.g., rds-metrics-queue)
  3. Configure the access policy to allow S3 to send messages:
{
    "Sid": "s3_send_statement",
    "Effect": "Allow",
    "Principal": {
        "Service": "s3.amazonaws.com"
    },
    "Action": [
        "SQS:SendMessage"
    ],
    "Resource": "arn:aws:sqs:AWS_REGION:AWS_ACCOUNT_ID:QUEUE_NAME",
    "Condition": {
        "ArnLike": {
            "aws:SourceArn": "arn:aws:s3:::BUCKET_NAME"
        },
        "StringEquals": {
            "aws:SourceAccount": "AWS_ACCOUNT_ID"
        }
    }
}
  4. Save the queue URL for Edge Delta configuration

Create a CloudWatch Metric Stream

  1. Navigate to CloudWatch > Metrics > Streams
  2. Click Create metric stream
  3. Select metrics to include:
    • For comprehensive monitoring: Choose AWS/RDS: All metric names
    • For specific metrics: Select individual RDS metrics
  4. Configure the destination:
    • Choose Amazon Kinesis Data Firehose
    • Create or select a Firehose delivery stream with:
      • Destination: Amazon S3
      • Output format: JSON (recommended) or Parquet
      • Compression: GZIP (recommended for cost savings)
      • Buffer interval: 60 seconds (for near real-time delivery)
  5. Note the metric stream name and S3 bucket path
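
If you prefer to script this step, the console flow above corresponds to a cloudwatch:PutMetricStream API call. A minimal request body, with placeholder names and ARNs (the RoleArn here is the role CloudWatch assumes to write into Firehose, not the Edge Delta role):

{
    "Name": "rds-metric-stream",
    "FirehoseArn": "arn:aws:firehose:AWS_REGION:AWS_ACCOUNT_ID:deliverystream/FIREHOSE_NAME",
    "RoleArn": "arn:aws:iam::AWS_ACCOUNT_ID:role/METRIC_STREAM_ROLE",
    "OutputFormat": "json",
    "IncludeFilters": [
        {
            "Namespace": "AWS/RDS"
        }
    ]
}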

Configure S3 Event Notifications

  1. Navigate to the S3 bucket receiving metric stream data
  2. Go to Properties > Event notifications
  3. Click Create event notification:
    • Event name: rds-metrics-notification
    • Event types: Select All object create events
    • Destination: Choose SQS queue
    • Select the SQS queue created earlier
  4. Save the configuration
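
For reference, the saved configuration is equivalent to the following bucket notification document (placeholder values), as applied by s3:PutBucketNotificationConfiguration:

{
    "QueueConfigurations": [
        {
            "Id": "rds-metrics-notification",
            "QueueArn": "arn:aws:sqs:AWS_REGION:AWS_ACCOUNT_ID:rds-metrics-queue",
            "Events": [
                "s3:ObjectCreated:*"
            ]
        }
    ]
}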

Configure IAM Permissions

Create an IAM policy for Edge Delta to access AWS resources:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "EdgeDeltaRDSMetricsAccess",
            "Effect": "Allow",
            "Action": [
                "sqs:DeleteMessage",
                "sqs:DeleteMessageBatch",
                "sqs:ReceiveMessage",
                "s3:GetObject"
            ],
            "Resource": [
                "arn:aws:s3:::BUCKET_NAME/*",
                "arn:aws:sqs:REGION:ACCOUNT_ID:QUEUE_NAME"
            ]
        }
    ]
}

Attach this policy to an IAM user or role for Edge Delta authentication.
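
If you use a role, it also needs a trust policy that lets Edge Delta assume it. A minimal sketch, where EDGE_DELTA_ACCOUNT_ID is a placeholder for the principal Edge Delta assumes the role from, and the external ID matches the external_id used in the pipeline configuration below:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::EDGE_DELTA_ACCOUNT_ID:root"
            },
            "Action": "sts:AssumeRole",
            "Condition": {
                "StringEquals": {
                    "sts:ExternalId": "unique-external-id-12345"
                }
            }
        }
    ]
}

The sts:ExternalId condition is what protects cross-account role assumption against confused deputy attacks.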

Configure Edge Delta Pipeline

Create a Cloud Pipeline

  1. Log into the Edge Delta web app
  2. Navigate to Pipelines > Cloud Pipelines
  3. Click Create Cloud Pipeline
  4. Provide a name (e.g., rds-metrics-pipeline)

Add S3 Source Node

In the pipeline editor, click Add Node and select S3 from the source nodes. The S3 source node has two mandatory parameters: the SQS queue URL and the AWS region. For detailed parameter descriptions, refer to the S3 Source Node documentation.

Because the recommended Firehose configuration compresses objects with GZIP, set the compression parameter to gzip. For authentication, you can use either IAM role assumption or AWS access keys. IAM roles are recommended for production environments because they rely on temporary credentials and avoid storing long-lived access keys.

When using IAM role authentication, the role_arn parameter specifies which role to assume. The external_id parameter is optional but strongly recommended when Edge Delta assumes roles across AWS accounts, as it provides an additional security layer to prevent confused deputy attacks. If you choose access key authentication instead, both aws_key_id and aws_sec_key must be provided together.

nodes:
- name: rds_metrics_s3_input
  type: s3_input
  # SQS queue that receives the S3 "object created" notifications
  sqs_url: https://sqs.us-west-2.amazonaws.com/123456789012/rds-metrics-queue
  region: us-west-2
  # Matches the GZIP compression configured on the Firehose delivery stream
  compression: gzip
  # Role from the trust policy above; external_id must match its sts:ExternalId condition
  role_arn: arn:aws:iam::123456789012:role/edge-delta-rds-metrics
  external_id: unique-external-id-12345

Process RDS Metrics

Configure processors to parse and transform CloudWatch metrics into OpenTelemetry format. The processing pipeline handles the Kinesis Firehose JSON structure and extracts meaningful metrics using Edge Delta’s transform processors.

Start with the JSON Unroll processor to extract individual metric records from the Firehose batch. CloudWatch Metric Streams via Kinesis Firehose wrap multiple metrics in a records array when using JSON output format. The unroll processor creates separate telemetry items for each array element, preserving all resource and attribute information. If using Parquet output format in Kinesis Firehose, adjust the processor configuration accordingly as the data structure will differ. For JSON format, ‘records’ is the standard field path.
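
As a schematic example of the unroll (shape per the description above, values illustrative): an incoming body like

{
    "records": [
        { "metric_name": "CPUUtilization", "value": { "sum": 81.3 } },
        { "metric_name": "ReadIOPS", "value": { "sum": 120.0 } }
    ]
}

becomes two separate items, the first with attributes["metric_record"] set to the CPUUtilization object and the second with the ReadIOPS object.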

Use the Copy Field processor to map CloudWatch metric fields to standardized attributes. The processor uses OTTL statements like set(attributes["metric_name"], attributes["metric_record"]["metric_name"]) to copy values between fields. When copying numeric values for metrics, apply type conversion using OTTL functions like Double() to ensure proper data types.

The Extract Metric processor generates proper metric items from the parsed data. Configure extraction rules for each RDS metric type with appropriate units and metric kinds. Gauges represent instantaneous values like CPU utilization, while sums work better for cumulative metrics like IOPS.

processors:
- name: rds_metrics_pipeline
  type: sequence
  processors:
  - type: json_unroll
    metadata: '{"name":"Unroll Firehose Records"}'
    data_types:
    - log
    field_path: body
    json_field_path: records
    new_field_name: metric_record
    
  - type: ottl_transform
    metadata: '{"id":"map_cloudwatch","type":"copy-field","name":"Map CloudWatch Fields"}'
    data_types:
    - log
    statements: set(attributes["metric_name"], attributes["metric_record"]["metric_name"])
    
  - type: ottl_transform
    metadata: '{"id":"extract_value","type":"copy-field","name":"Extract Metric Value"}'
    data_types:
    - log
    statements: set(attributes["metric_value"], Double(attributes["metric_record"]["value"]["sum"]))
    
  - type: ottl_transform
    metadata: '{"id":"map_instance","type":"copy-field","name":"Map DB Instance"}'
    data_types:
    - log
    statements: set(resource["db.instance"], attributes["metric_record"]["dimensions"]["DBInstanceIdentifier"])
      
  - type: extract_metric
    metadata: '{"name":"Generate RDS Metrics"}'
    extract_metric_rules:
    - name: rds_cpu_utilization
      description: RDS instance CPU utilization percentage
      unit: "%"
      gauge:
        value: attributes["metric_value"]
      condition: attributes["metric_name"] == "CPUUtilization"
      
    - name: rds_database_connections
      description: Number of database connections in use
      unit: "1"
      gauge:
        value: attributes["metric_value"]
      condition: attributes["metric_name"] == "DatabaseConnections"
      
    - name: rds_read_iops
      description: Average number of disk read I/O operations per second
      unit: "1/s"
      sum:
        value: attributes["metric_value"]
      condition: attributes["metric_name"] == "ReadIOPS"
      
    - name: rds_write_iops
      description: Average number of disk write I/O operations per second
      unit: "1/s"
      sum:
        value: attributes["metric_value"]
      condition: attributes["metric_name"] == "WriteIOPS"

You can extend these extraction rules to include additional RDS metrics such as FreeStorageSpace (gauge type for available storage), ReplicaLag (gauge type for replication delay in seconds), SwapUsage (gauge type for swap space utilization), and BinLogDiskUsage (gauge type for binary log storage). Each metric should be configured with the appropriate unit and type based on whether it represents an instantaneous value or a cumulative counter.
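
For example, FreeStorageSpace and ReplicaLag rules could be appended to the extract_metric_rules list above like this (a sketch following the same pattern; units are UCUM codes for bytes and seconds):

    - name: rds_free_storage_space
      description: Available storage space on the instance
      unit: "By"
      gauge:
        value: attributes["metric_value"]
      condition: attributes["metric_name"] == "FreeStorageSpace"

    - name: rds_replica_lag
      description: Replication delay of a read replica in seconds
      unit: "s"
      gauge:
        value: attributes["metric_value"]
      condition: attributes["metric_name"] == "ReplicaLag"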

For high-volume metrics, consider adding an Aggregate Metric processor to reduce data before sending to destinations. This can significantly reduce costs while maintaining metric fidelity through statistical aggregation.

Route to Destinations

Configure destination nodes to send processed metrics:

  • Edge Delta Observability Platform: For monitoring and dashboards
  • OpenTelemetry Collector: For further processing
  • Third-party platforms: Datadog, Splunk, New Relic, etc.

Monitoring and Validation

Enrich RDS Metrics with Context

Use the Lookup processor to enrich RDS metrics with additional context from lookup tables. This processor can add metadata like database environment tags, cost center information, or alert thresholds based on instance identifiers.

Create a CSV lookup table mapping RDS instance names to metadata:

instance_id,environment,team,cost_center,cpu_threshold
prod-mysql-01,production,platform,eng-001,80
staging-postgres-02,staging,platform,eng-002,90

Configure the lookup processor to match on resource["db.instance"] and add enrichment fields:

- type: lookup
  metadata: '{"name":"Enrich RDS Metrics"}'
  location_path: ed://rds_metadata.csv
  reload_period: 10m0s
  match_mode: exact
  key_fields:
  - event_field: resource["db.instance"]
    lookup_field: instance_id
  out_fields:
  - event_field: attributes["environment"]
    lookup_field: environment
  - event_field: attributes["team"]
    lookup_field: team
  - event_field: attributes["cost_center"]
    lookup_field: cost_center

Use the Add Field processor to tag metrics based on conditions. For example, mark high CPU usage:

- type: ottl_transform
  metadata: '{"type":"add-field","name":"Tag High CPU"}'
  condition: attributes["metric_name"] == "CPUUtilization" and attributes["metric_value"] > 80
  statements: set(attributes["alert_severity"], "high")

The Aggregate Metric processor can then group enriched metrics by these new fields for better analysis and routing decisions.

Verify Data Flow

  1. Check SQS queue metrics for message activity
  2. Monitor pipeline logs for ingestion status
  3. Validate metrics appear in destination platforms
  4. Confirm correlation rules are matching expected patterns

Create Dashboards

Configure dashboards in Edge Delta to visualize RDS metrics. See Create a Dashboard for detailed instructions.

Navigate to Dashboards and click New Dashboard to start building. Add Dashboard Variables to make your dashboard interactive:

  • Use Facet Option Variables to filter by resource["db.instance"] for specific database selection
  • Add Metric Name Variables to switch between different RDS metrics dynamically
  • Configure String Variables for environment selection (production, staging, development)

Drag widgets from the toolbox to visualize RDS metrics. Configure time series widgets to display:

  • CPU utilization trends using the rds_cpu_utilization metric
  • IOPS patterns by combining rds_read_iops and rds_write_iops metrics
  • Connection count using rds_database_connections with appropriate thresholds
  • Storage capacity trends with percentage calculations

Group metrics by the enriched attributes from the lookup processor (environment, team, cost_center) to create filtered views. Reference variables in widget configurations using the $variable_key syntax to make dashboards respond to user selections.

Save custom views for specific database instances or environments using the Save View feature. This allows quick access to frequently monitored database configurations without reconfiguring variables each time.

Configure Monitors

Create monitors in Edge Delta’s Observability Platform to track RDS metrics and generate alerts. Configure different monitor types based on your alerting requirements.

Use Metric Threshold Monitors for RDS performance metrics. Set thresholds for CPU utilization, IOPS, and connection counts with appropriate evaluation windows:

  • Database CPU utilization: Alert when above 80% for 5-minute evaluation window
  • Read/Write IOPS: Warn when exceeding baseline by 50% using 15-minute rollup
  • Database connections: Alert when approaching connection limit (e.g., above 90% of max_connections)
  • Storage capacity: Warn at 85% full, alert at 95% full

Configure Pattern Anomaly Monitors to detect unusual database behavior patterns. These monitors use sensitivity settings to identify spikes in error patterns or unusual query patterns in RDS logs when processed alongside metrics.

For complex scenarios involving multiple metrics, create Composite Monitors that evaluate conditions across multiple monitors:

  • Combine high CPU AND high connection count monitors to detect resource exhaustion
  • Use OR logic to alert when either replication lag exceeds threshold OR primary instance shows errors
  • Configure AND logic for correlated issues like high IOPS with increased error rates

Set appropriate aggregation methods (sum, average, max) and rollup windows based on metric characteristics. Use grouping by resource["db.instance"] to receive per-database alerts rather than aggregate notifications.

Best Practices

  • Start with essential metrics: Begin with CPU, IOPS, and connection metrics, then expand coverage
  • Configure retention wisely: Set appropriate S3 retention periods based on compliance requirements
  • Optimize costs: Implement S3 lifecycle policies to transition old metrics to cheaper storage tiers
  • Leverage pre-index processing: Because metrics are standardized and enriched pre-index, you can reduce downstream ingestion and storage costs while still retaining full context in Edge Delta
  • Enrich with correlation: Connect RDS metrics with application traces and logs for full context
  • Apply OpenTelemetry standards: Use semantic conventions for consistent metric naming and attributes

Troubleshooting

No Metrics Appearing

  1. Verify CloudWatch Metric Stream shows “Active” status
  2. Check S3 bucket contains metric files in the configured path
  3. Confirm SQS queue shows message activity
  4. Review Edge Delta pipeline logs for ingestion errors

Authentication Issues

  • Verify IAM policy contains all required S3 and SQS permissions
  • Validate AWS credentials are correctly configured in Edge Delta
  • Confirm region matches where resources are deployed

Data Format Problems

  • Confirm Kinesis Firehose output format matches processor expectations
  • Check whether objects are gzip-compressed or base64-encoded and decode them before JSON parsing
  • Verify field mappings align with CloudWatch metric structure