# Flow Control
## Overview
Edge Delta provides intelligent flow control that balances full-fidelity data routing during incidents with cost-effective sampling during normal operations. Rather than applying static sampling rates, flow control adjusts data volume dynamically based on real-time conditions, enabling teams to meet both cost objectives and troubleshooting requirements.
Flow control enables teams to automatically send 100% of data during alerts or incidents while maintaining aggressive sampling (1-10%) during steady-state operations. This approach reduces observability costs by up to 95% while preserving complete context when troubleshooting matters most.
## How Dynamic Rate Sampling Works
Flow control operates through a four-stage pipeline that separates configuration from execution:
1. **Initial Tagging:** Incoming telemetry is tagged with a default sampling rate in its attributes using OTTL transformation statements.
2. **Lookup Consultation:** The pipeline checks a lookup table (typically a CSV file) for the current flow rate (0-100%) and an expiration timestamp, keyed on service name or other attributes.
3. **Conditional Logic:** The pipeline checks whether the expiration timestamp has passed and applies either the lookup value or, once it has expired, the default rate. This time-based expiration ensures temporary rate changes revert to normal automatically.
4. **Probabilistic Sampling:** The Sample Processor makes the final sampling decision using the dynamically assigned rate.
This architecture decouples sampling configuration from pipeline deployment, enabling operational changes without code modifications or pipeline restarts.
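The four stages above can be sketched in Python. Service names, the table layout, and timestamps here are illustrative; in the actual pipeline this logic is expressed with OTTL statements, a lookup processor, and the Sample Processor rather than application code.

```python
import random
from datetime import datetime, timezone

# Hypothetical lookup table: service name -> (flow rate %, expiration timestamp).
# In Edge Delta this lives in a CSV lookup file.
LOOKUP = {
    "payment-service": (100, datetime(2025, 6, 1, 14, 0, tzinfo=timezone.utc)),
}

DEFAULT_RATE = 10  # default sampling rate tagged onto incoming telemetry


def effective_rate(service: str, now: datetime) -> int:
    """Stages 1-3: default tag, lookup consultation, expiration check."""
    entry = LOOKUP.get(service)
    if entry is None:
        return DEFAULT_RATE
    rate, expires = entry
    # Once the override expires, revert to the default rate.
    return DEFAULT_RATE if now >= expires else rate


def should_keep(service: str, now: datetime) -> bool:
    """Stage 4: probabilistic sampling at the effective rate."""
    return random.random() * 100 < effective_rate(service, now)
```

Because the lookup table is consulted on every record, editing the table changes behavior immediately; nothing in the pipeline itself needs to be redeployed.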
## Management and Automation
Teams manage flow control by updating a simple CSV lookup file containing service names, flow rates, and expiration dates. Changes can be made:
- Manually via the Edge Delta UI
- Programmatically via API
- Automatically through monitor-triggered actions
Edge Delta monitors can automatically trigger flow rate adjustments when specific conditions occur:
- High error rates can switch to 100% sampling
- SLA violations can increase sampling for affected services
- Budget alerts can reduce sampling for non-critical services
- Incident creation can enable full-fidelity for troubleshooting
This tight integration eliminates the need for external workflow orchestration and ensures sampling rates respond instantly to changing conditions.
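A monitor-triggered action amounts to appending or replacing a row in the lookup file with a fresh expiration. The sketch below shows the file manipulation only; the file path, column layout, and function name are hypothetical, and a real integration would go through the Edge Delta UI or API rather than editing the file directly.

```python
import csv
from datetime import datetime, timedelta, timezone
from pathlib import Path


def boost_sampling(lookup_path: Path, service: str, rate: int, hours: float) -> None:
    """Set `service` to sample at `rate`% until `hours` from now.

    After the expiration passes, the pipeline's expiration check reverts
    the service to its default rate with no further action needed.
    """
    expires = (datetime.now(timezone.utc) + timedelta(hours=hours)).strftime(
        "%Y-%m-%dT%H:%M:%SZ"
    )
    rows = []
    if lookup_path.exists():
        with lookup_path.open(newline="") as f:
            # Drop any existing row for this service; keep header and others.
            rows = [r for r in csv.reader(f) if r and r[0] != service]
    if not rows:
        rows = [["service_name", "flow_rate", "expiration_date"]]
    rows.append([service, str(rate), expires])
    with lookup_path.open("w", newline="") as f:
        csv.writer(f).writerows(rows)
```

For example, a high-error-rate monitor could call `boost_sampling(path, "payment-service", 100, 2)` to enable two hours of full-fidelity capture.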
## Use Cases

### Incident Response
Consider a payment service running at 10% sampling ($100/month cost) during normal operations. When an alert triggers, flow control automatically switches to 100% sampling for 2 hours, providing full-fidelity data for troubleshooting at an incremental cost of just $6.67. After the expiration period, sampling automatically reverts to 10%.
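The incremental cost of such a boost can be estimated with simple arithmetic, assuming cost scales linearly with sampled volume. The function below only parameterizes that calculation; the exact dollar figures in any scenario depend on the pricing model and billing period assumed.

```python
def incremental_boost_cost(monthly_cost_at_base: float, base_rate: float,
                           boost_rate: float, boost_hours: float,
                           hours_per_month: float = 730.0) -> float:
    """Extra spend from raising sampling from base_rate to boost_rate
    (both as fractions, e.g. 0.1 for 10%) for boost_hours, assuming
    cost is proportional to sampled data volume."""
    full_fidelity_monthly = monthly_cost_at_base / base_rate
    extra_fraction = boost_rate - base_rate
    return full_fidelity_monthly * extra_fraction * boost_hours / hours_per_month
```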
### Service Criticality Tiers
Assign different sampling rates based on service criticality:
| Service Tier | Sampling Rate |
|---|---|
| Critical (payment, auth) | 100% |
| Standard services | 20% |
| Batch jobs | 5% |
| Development services | 1% |
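Expressed as lookup rows, tier defaults could be permanent entries with no expiration. Service names and the column layout below are illustrative:

```csv
service_name,flow_rate,expiration_date
payment-service,100,
auth-service,100,
catalog-service,20,
nightly-batch,5,
dev-sandbox,1,
```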
### Time-Based Adjustment
Sampling rates can adjust based on time of day or business cycles:
- During peak hours with high traffic, 5% sampling helps manage volume
- Off-hours and weekends can use 20% sampling for better visibility
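A minimal sketch of that schedule, with illustrative thresholds (a real deployment would encode this in the lookup table or an automated action rather than code):

```python
from datetime import datetime


def rate_for_time(now: datetime) -> int:
    """Sample less during peak traffic to control volume, more during
    quiet hours for better visibility. Hours and rates are examples."""
    peak = now.weekday() < 5 and 9 <= now.hour < 18  # weekday business hours
    return 5 if peak else 20
```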
### Cost Management
When approaching budget limits (such as 96% of monthly budget consumed), flow control can automatically reduce sampling for non-critical services while maintaining 100% for business-critical workloads.
## Benefits
- **Cost reduction:** Aggressive sampling (1-10%) during stable periods can reduce observability costs by 90-95% compared to full-fidelity ingestion while maintaining continuous visibility.
- **Automatic incident response:** Full-fidelity routing during alerts ensures rich context for incident investigation without manual intervention.
- **Closed-loop automation:** Integration with Edge Delta monitors enables the observability system itself to manage data volume based on operational conditions.
- **Simplified management:** Point-and-click CSV management eliminates complex engineering workflows and decouples data volume configuration from pipeline deployment cycles.
## Related Documentation
- Sample Processor - Configure probabilistic sampling
- Lookup Processor - Use lookup tables for dynamic configuration
- Use Lookup Tables - Guide to managing lookup tables
- Circuit Breaker - Protect destinations from overload
- Monitors - Configure automated alerting and actions
For step-by-step instructions on implementing flow control with dynamic sampling, including complete configuration examples and lookup table setup, see How to Implement Flow Control with Dynamic Sampling.