HTTP Pull Connector
Configure the HTTP Pull connector to periodically fetch data from REST APIs and HTTP endpoints with support for dynamic parameters, pagination, and authentication.
Overview
The HTTP Pull connector periodically fetches data from HTTP and HTTPS REST API endpoints. It polls external APIs at regular intervals to collect metrics, status information, configuration data, and telemetry. Content streams into Edge Delta Pipelines for analysis by AI teammates through the Edge Delta MCP connector.
The connector supports dynamic configuration through OTTL expressions, automatic pagination handling, flexible authentication, and configurable retry logic, enabling integration with virtually any REST API.
When you add this streaming connector, it appears as an HTTP Pull source in your selected pipeline. AI teammates access this data by querying the Edge Delta backend with the Edge Delta MCP connector.
Add the HTTP Pull Connector
To add the HTTP Pull connector, you specify the endpoint URL, HTTP method, and polling interval, then deploy to an environment.
Prerequisites
Before configuring the connector, ensure you have:
- Edge Delta agent deployed with outbound network access to target APIs
- Target API endpoint accessible from the Edge Delta agent's network location
- Authentication credentials (API keys, bearer tokens, basic auth) if required
- Identified endpoint URL and required headers/parameters
Configuration Steps
- Navigate to AI Team > Connectors in the Edge Delta application
- Find the HTTP Pull connector in Streaming Connectors
- Click the connector card
- Configure the Endpoint URL
- Set the Method (GET, POST, etc.)
- Add Headers for authentication if needed
- Optionally add Query Parameters
- Configure Advanced Settings for dynamic expressions, pagination, or TLS
- Select a target environment
- Click Save
The connector now polls the endpoint at the specified interval and streams content.
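For orientation, the saved connector corresponds to an HTTP Pull source node in the selected pipeline. The sketch below shows roughly how such a node could look in pipeline YAML, using the same <node name> and <node type> placeholders as the examples later on this page; the field keys (endpoint, method, headers, pull_interval) are illustrative assumptions, so treat the HTTP Pull Input Node documentation as the authoritative schema.
nodes:
  - name: <node name>
    type: <node type> # HTTP Pull input node
    endpoint: https://api.example.com/v1/metrics # illustrative key name
    method: GET
    headers: # illustrative key name
      Accept: application/json
    pull_interval: 60000 # 1 minute, in milliseconds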

Configuration Options
Connector Name
Name to identify this HTTP Pull connector instance.
Endpoint
The URL of the endpoint to send HTTP requests to for pulling data. Must be a valid HTTP or HTTPS URL.
Format: Complete URL with protocol
Examples:
- https://api.example.com/v1/metrics
- http://service.local:8080/health
- https://status.service.com/api/v2/status.json
Note: Use Endpoint for static URLs, or Endpoint Expression (Advanced Settings) for dynamic URLs.
Method
The HTTP method to use for requests to the endpoint.
Values: GET, POST, PUT, DELETE
Default: GET
Headers
Headers attached to HTTP requests. Specified as key-value pairs.
Format: Header name and value
Examples:
- Authorization: Bearer YOUR_API_TOKEN
- X-API-Key: your-api-key-here
- Accept: application/json
- Content-Type: application/json
Query Parameters
Query parameters attached to HTTP requests. Automatically URL-encoded and appended to the endpoint URL.
Format: Parameter name and value
Examples:
- format: json
- limit: 100
- page: 1
- include_metadata: true
Advanced Settings
Endpoint Expression
OTTL expression for dynamic endpoint evaluation. Takes precedence over the static Endpoint field.
Format: OTTL expression (single-line)
Examples:
Concat(["https://api.example.com/data?since=", UnixSeconds(Now() - Duration("1h"))], "")Concat(["https://", EDXEnv("API_HOST", "api.example.com"), "/v1/logs"], "")
Use Cases:
- Environment-specific endpoints (dev/staging/prod)
- Time-based query parameters
- Dynamic API versioning
Header Expressions
OTTL expressions for dynamic header values. These override static headers with the same name.
Format: Header name and OTTL value expression
Examples:
- Authorization: Concat(["Bearer ", EDXEnv("API_TOKEN", "")], "")
- X-Request-ID: Concat(["req-", String(UnixMilli(Now()))], "")
Parameter Expressions
OTTL expressions for dynamic query parameter values. These override static parameters with the same name.
Format: Parameter name and OTTL value expression
Examples:
- start_time: FormatTime(Now() - Duration("1h"), "%Y-%m-%dT%H:%M:%SZ")
- end_time: FormatTime(Now(), "%Y-%m-%dT%H:%M:%SZ")
- since: String(UnixSeconds(Now() - Duration("24h")))
Common OTTL Functions:
- Now() - Current timestamp
- Duration("1h") - Parse duration
- FormatTime() - Format timestamp
- UnixSeconds() - Unix timestamp in seconds
- UnixMilli() - Unix timestamp in milliseconds
- EDXEnv() - Get environment variable
- Concat() - Concatenate strings
For complete OTTL reference, see HTTP Pull Input Node documentation.
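To see how these pieces fit together, the sketch below combines a static endpoint with dynamic header and parameter expressions in one node. The expression values are taken from the examples above; the header_expressions and parameter_expressions keys are illustrative assumptions, so check the HTTP Pull Input Node documentation for the exact field names.
nodes:
  - name: <node name>
    type: <node type>
    endpoint: https://api.example.com/v1/logs # static endpoint
    header_expressions: # illustrative key name
      Authorization: 'Concat(["Bearer ", EDXEnv("API_TOKEN", "")], "")'
    parameter_expressions: # illustrative key name
      start_time: 'FormatTime(Now() - Duration("1h"), "%Y-%m-%dT%H:%M:%SZ")'
      end_time: 'FormatTime(Now(), "%Y-%m-%dT%H:%M:%SZ")'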
Pull Interval
The interval at which the agent sends HTTP requests to pull data from the endpoint.
Format: Duration in milliseconds
Default: 1 minute (60000 ms)
Examples:
- 30000 - 30 seconds (high-frequency monitoring)
- 60000 - 1 minute (default)
- 300000 - 5 minutes (standard logging)
- 3600000 - 1 hour (batch processing)
Request Timeout
Maximum duration to wait for a request to complete.
Format: Duration in milliseconds
Default: No timeout
Example: 30000 - 30 second timeout
Retry HTTP Code
Additional HTTP status codes that trigger request retry.
Format: Array of HTTP status codes
Examples:
- 409 - Conflict
- 429 - Too Many Requests
- 503 - Service Unavailable
- 502 - Bad Gateway
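As a rough sketch, the timeout and retry options described above might sit alongside each other like this; the timeout and retry_http_codes keys are illustrative assumptions rather than the confirmed schema.
nodes:
  - name: <node name>
    type: <node type>
    timeout: 30000 # 30 second request timeout
    retry_http_codes: # retried in addition to the default retryable statuses
      - 429
      - 503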
Pagination
Configuration for pagination and URL following. When set, enables extraction and parallel fetching of URLs from API responses.
URL JSON Path: JSON path expression to extract URLs from JSON responses (e.g., $.links[*].href). Only used when response_format is json.
Response Format: Format for parsing followed URL responses: json, text, or binary.
Max Parallel Requests: Maximum number of parallel requests when following URLs.
Inherit Authentication: Whether to use same authentication for followed URLs.
Error Strategy: How to handle errors when following URLs. continue skips failed URLs, stop halts processing.
Follow Link Header: Whether to follow URLs from Link headers in HTTP responses.
Link Relation: Link relation type to follow when using Link headers (e.g., next, prev).
Example JSON Pagination:
pagination:
  url_json_path: "next_page_url"
  response_format: json
  max_parallel: 5
  inherit_auth: true
  error_strategy: continue
Example Link Header Pagination:
pagination:
  follow_link_header: true
  link_relation: next
  max_parallel: 3
For detailed pagination configuration and examples, see HTTP Pull Input Node Pagination.
TLS
Optional TLS/SSL configuration for secure connections.
Configuration Options:
- Enable TLS: Enables SSL/TLS connection
- Ignore Certificate Check: Disables SSL/TLS certificate verification. Use with caution in testing environments only.
- CA File: Absolute file path to the CA certificate for SSL/TLS connections
- CA Path: Absolute path where CA certificate files are located
- CRT File: Absolute path to the SSL/TLS certificate file
- Key File: Absolute path to the private key file
- Key Password: Optional password for the key file
- Client Auth Type: Client authentication type. Default is noclientcert.
- Minimum Version: Minimum TLS version. Default is TLSv1_2.
- Maximum Version: Maximum TLS version allowed for connections
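Putting the TLS options together, a minimal mutual-TLS sketch could look like the following; the key names simply mirror the option labels above and the certificate paths are hypothetical.
tls:
  enabled: true
  ca_file: /etc/edgedelta/certs/ca.pem # CA used to verify the API server's certificate
  crt_file: /etc/edgedelta/certs/client.pem # client certificate presented to the server
  key_file: /etc/edgedelta/certs/client.key # private key for the client certificate
  min_version: TLSv1_2 # reject connections below TLS 1.2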
Metadata Level
This option is used to define which detected resources and attributes to add to each data item as it is ingested by Edge Delta. You can select:
- Required Only: This option includes the minimum required resources and attributes for Edge Delta to operate.
- Default: This option includes the required resources and attributes plus those selected by Edge Delta.
- High: This option includes the required resources and attributes along with a larger selection of common optional fields.
- Custom: With this option selected, you can choose which attributes and resources to include. The required fields are selected by default and can’t be unchecked.
Based on your selection in the GUI, the source_metadata YAML is populated as two dictionaries (resource_attributes and attributes) with Boolean values.
See Choose Data Item Metadata for more information on selecting metadata.
HTTP Pull-specific metadata included:
- http.url - Request URL
- http.response.status_code - HTTP response status
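As a sketch of the generated YAML, a Custom selection that keeps the HTTP Pull-specific attributes above might render roughly as follows; the host.name entry is only an illustrative example of an additional resource attribute.
source_metadata:
  resource_attributes:
    host.name: true # illustrative additional resource attribute
  attributes:
    http.url: true # request URL
    http.response.status_code: true # HTTP response status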
Rate Limit
The rate_limit parameter enables you to control data ingestion based on system resource usage. This advanced setting helps prevent source nodes from overwhelming the agent by automatically throttling or stopping data collection when CPU or memory thresholds are exceeded.
Use rate limiting to prevent runaway log collection from overwhelming the agent in high-volume sources, protect agent stability in resource-constrained environments with limited CPU/memory, automatically throttle during bursty traffic patterns, and ensure fair resource allocation across source nodes in multi-tenant deployments.
When rate limiting triggers, pull-based sources (File, S3, HTTP Pull) stop fetching new data, push-based sources (HTTP, TCP, UDP, OTLP) reject incoming data, and stream-based sources (Kafka, Pub/Sub) pause consumption. Rate limiting operates at the source node level, where each source with rate limiting enabled independently monitors and enforces its own thresholds.
Configuration Steps:
- Click Add New in the Rate Limit section
- Click Add New for Evaluation Policy
- Select Policy Type:
- CPU Usage: Monitors CPU consumption and rate limits when usage exceeds defined thresholds. Use for CPU-intensive sources like file parsing or complex transformations.
- Memory Usage: Monitors memory consumption and rate limits when usage exceeds defined thresholds. Use for memory-intensive sources like large message buffers or caching.
- AND (composite): Combines multiple sub-policies with AND logic. All sub-policies must be true simultaneously to trigger rate limiting. Use when you want conservative rate limiting (both CPU and memory must be high).
- OR (composite): Combines multiple sub-policies with OR logic. Any sub-policy can trigger rate limiting. Use when you want aggressive rate limiting (either CPU or memory being high triggers).
- Select Evaluation Mode. Choose how the policy behaves when thresholds are exceeded:
- Enforce (default): Actively applies rate limiting when thresholds are met. Pull-based sources (File, S3, HTTP Pull) stop fetching new data, push-based sources (HTTP, TCP, UDP, OTLP) reject incoming data, and stream-based sources (Kafka, Pub/Sub) pause consumption. Use in production to protect agent resources.
- Monitor: Logs when rate limiting would occur without actually limiting data flow. Use for testing thresholds before enforcing them in production.
- Passthrough: Disables rate limiting entirely while keeping the configuration in place. Use to temporarily disable rate limiting without removing configuration.
- Set Absolute Limits and Relative Limits (for CPU Usage and Memory Usage policies)
Note: If you specify both absolute and relative limits, the system evaluates both conditions and rate limiting triggers when either condition is met (OR logic). For example, if you set the absolute limit to 1.0 CPU cores and the relative limit to 50%, rate limiting triggers when the source uses either 1 full core OR 50% of available CPU, whichever happens first.
For CPU Absolute Limits: Enter value in full core units:
- 0.1 = one-tenth of a CPU core
- 0.5 = half a CPU core
- 1.0 = one full CPU core
- 2.0 = two full CPU cores
For CPU Relative Limits: Enter percentage of total available CPU (0-100):
- 50 = 50% of available CPU
- 75 = 75% of available CPU
- 85 = 85% of available CPU
For Memory Absolute Limits: Enter value in bytes:
- 104857600 = 100Mi (100 × 1024 × 1024)
- 536870912 = 512Mi (512 × 1024 × 1024)
- 1073741824 = 1Gi (1 × 1024 × 1024 × 1024)
For Memory Relative Limits: Enter percentage of total available memory (0-100):
- 60 = 60% of available memory
- 75 = 75% of available memory
- 80 = 80% of available memory
- Set Refresh Interval (for CPU Usage and Memory Usage policies). Specify how frequently the system checks resource usage:
- Recommended Values:
  - 10s to 30s for most use cases
  - 5s to 10s for high-volume sources requiring quick response
  - 1m or higher for stable, low-volume sources
The system fetches current CPU/memory usage at the specified refresh interval and uses that value for evaluation until the next refresh. Shorter intervals provide more responsive rate limiting but incur slightly higher overhead, while longer intervals are more efficient but slower to react to sudden resource spikes.
The GUI generates YAML as follows:
# Simple CPU-based rate limiting
nodes:
  - name: <node name>
    type: <node type>
    rate_limit:
      evaluation_policy:
        policy_type: cpu_usage
        evaluation_mode: enforce
        absolute_limit: 0.5 # Limit to half a CPU core
        refresh_interval: 10s

# Simple memory-based rate limiting
nodes:
  - name: <node name>
    type: <node type>
    rate_limit:
      evaluation_policy:
        policy_type: memory_usage
        evaluation_mode: enforce
        absolute_limit: 536870912 # 512Mi in bytes
        refresh_interval: 30s
Composite Policies (AND / OR)
When using AND or OR policy types, you define sub-policies instead of limits. Sub-policies must be siblings (at the same level)—do not nest sub-policies within other sub-policies. Each sub-policy is independently evaluated, and the parent policy’s evaluation mode applies to the composite result.
- AND Logic: All sub-policies must evaluate to true at the same time to trigger rate limiting. Use when you want conservative rate limiting (limit only when CPU AND memory are both high).
- OR Logic: Any sub-policy evaluating to true triggers rate limiting. Use when you want aggressive protection (limit when either CPU OR memory is high).
Configuration Steps:
- Select AND (composite) or OR (composite) as the Policy Type
- Choose the Evaluation Mode (typically Enforce)
- Click Add New under Sub-Policies to add the first condition
- Configure the first sub-policy by selecting policy type (CPU Usage or Memory Usage), selecting evaluation mode, setting absolute and/or relative limits, and setting refresh interval
- In the parent policy (not within the child), click Add New again to add a sibling sub-policy
- Configure additional sub-policies following the same pattern
The GUI generates YAML as follows:
# AND composite policy - both CPU AND memory must exceed limits
nodes:
  - name: <node name>
    type: <node type>
    rate_limit:
      evaluation_policy:
        policy_type: and
        evaluation_mode: enforce
        sub_policies:
          # First sub-policy (sibling)
          - policy_type: cpu_usage
            evaluation_mode: enforce
            absolute_limit: 0.75 # Limit to 75% of one core
            refresh_interval: 15s
          # Second sub-policy (sibling)
          - policy_type: memory_usage
            evaluation_mode: enforce
            absolute_limit: 1073741824 # 1Gi in bytes
            refresh_interval: 15s

# OR composite policy - either CPU OR memory can trigger
nodes:
  - name: <node name>
    type: <node type>
    rate_limit:
      evaluation_policy:
        policy_type: or
        evaluation_mode: enforce
        sub_policies:
          - policy_type: cpu_usage
            evaluation_mode: enforce
            relative_limit: 85 # 85% of available CPU
            refresh_interval: 20s
          - policy_type: memory_usage
            evaluation_mode: enforce
            relative_limit: 80 # 80% of available memory
            refresh_interval: 20s

# Monitor mode for testing thresholds
nodes:
  - name: <node name>
    type: <node type>
    rate_limit:
      evaluation_policy:
        policy_type: memory_usage
        evaluation_mode: monitor # Only logs, doesn't limit
        relative_limit: 70 # Test at 70% before enforcing
        refresh_interval: 30s
Target Environments
Select the Edge Delta pipeline (environment) where you want to deploy this connector.
How to Use the HTTP Pull Connector
The HTTP Pull connector integrates seamlessly with AI Team, enabling analysis of data from external APIs. AI teammates automatically leverage the ingested data based on the queries they receive and the context of the conversation.
Use Case: Third-Party Service Status Monitoring
Poll public status APIs to track the health and availability of external dependencies like payment processors or cloud providers. AI teammates analyze trends and detect status changes, component degradations, and service outages as soon as they appear. When combined with PagerDuty alerts, teammates automatically query recent status data during incident investigation to determine whether external service issues are contributing to problems.
Configuration: Configure endpoint to status API (e.g., https://status.stripe.com/api/v2/status.json), add Accept header, set appropriate pull interval.
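A sketch of this setup, reusing the placeholder node convention from earlier examples; the field keys are illustrative, and the 5 minute interval is just a reasonable starting point for a public status API.
nodes:
  - name: <node name>
    type: <node type>
    endpoint: https://status.stripe.com/api/v2/status.json
    method: GET
    headers:
      Accept: application/json
    pull_interval: 300000 # poll every 5 minutes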
Use Case: Microservice Health Check Polling
Monitor internal microservices by polling health check endpoints to track service health, database connectivity, cache status, and resource availability. AI teammates detect unhealthy components and provide diagnostic recommendations. This is valuable when investigating performance issues—teammates can correlate health check failures with error spikes in logs and identify which dependencies are causing problems.
Configuration: Configure endpoint to internal health endpoint with authentication header, set shorter pull interval for critical services (30s-1m).
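A comparable sketch for an internal health check, with a hypothetical service URL and a shorter interval; the field keys remain illustrative.
nodes:
  - name: <node name>
    type: <node type>
    endpoint: https://orders.internal.example.com/healthz # hypothetical internal service
    headers:
      Authorization: Bearer YOUR_API_TOKEN # replace with a real credential
    pull_interval: 30000 # 30 seconds for a critical service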
Use Case: Prometheus Metrics Collection
Query specific metrics from Prometheus via HTTP API and ingest into Edge Delta for AI-powered analysis. AI teammates correlate Prometheus metrics with log data and other telemetry for comprehensive observability. When combined with Jira integration, teammates can automatically document capacity issues by querying metric trends and creating detailed tickets with historical data.
Configuration: Use endpoint http://prometheus:9090/api/v1/query with query parameter for PromQL expression, configure pull interval based on metric granularity needs.
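A sketch of the Prometheus example, with a hypothetical PromQL expression and illustrative field keys.
nodes:
  - name: <node name>
    type: <node type>
    endpoint: http://prometheus:9090/api/v1/query
    query_parameters: # illustrative key name
      query: 'up{job="api-server"}' # hypothetical PromQL expression
    pull_interval: 60000 # match the granularity you need from the metric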
Troubleshooting
Connection timeout errors: Verify endpoint URL is correct and reachable. Test connection using curl from agent host. Check firewall rules allow outbound traffic.
401/403 authentication errors: Verify authentication headers are correctly formatted. Check API token is valid and not expired. Ensure token has necessary permissions for the endpoint.
404 Not Found errors: Verify endpoint URL matches API documentation exactly, including API version. Check query parameters are properly formatted and match expected names.
429 Too Many Requests: Reduce polling frequency by increasing pull interval. Check API provider’s rate limits and adjust accordingly. Implement exponential backoff if needed.
Invalid JSON response errors: Verify endpoint returns JSON by checking Content-Type header. Review API documentation for expected response format. Use debugging tools to inspect raw response.
Slow response times: Use query parameters to filter data at source. Implement pagination if API supports it. Check timeout settings provide enough time for legitimate responses.
Pagination not working: Confirm the API returns pagination information (Link header or JSON field). Verify the JSON path or link relation configuration matches the API response structure. Check debug logs for pagination attempts.
OTTL expression errors: Ensure expressions are single-line (multi-line fails validation). Test expressions in development before production deployment. Provide fallback values in EDXEnv() calls.
Next Steps
- Learn about HTTP Pull Input Node for detailed OTTL examples and pagination strategies
- Explore vendor-specific integrations for GitHub, Microsoft Graph, and other APIs
- Learn about creating custom teammates that can use HTTP Pull data
For additional help, visit AI Team Support.