Edge Delta HTTP Pull Source

Configure HTTP Pull sources to periodically fetch data from HTTP endpoints with support for OTTL expressions, pagination, and dynamic parameters.

Overview

The HTTP Pull node enables the Edge Delta agent to periodically send HTTP requests to an endpoint to pull data. This source is ideal for retrieving logs or telemetry data from HTTP-based APIs, with support for dynamic configuration through OTTL expressions and automatic pagination handling.

Key Features:

  • Static and dynamic endpoint configuration
  • OTTL expression support for runtime values
  • Automatic pagination for multi-page responses
  • Flexible authentication and header management
  • Configurable retry logic
  • Outgoing data types: log

Figure: HTTP Pull Input Source configuration form showing the Endpoint and Method fields.

Critical Requirements

Quick Reference

| Parameter | Type | Required? | Brief Description |
|---|---|---|---|
| name | string | Required | Unique identifier for the HTTP pull node |
| type | string | Required | Must be set to http_pull_input |
| endpoint | string | Required¹ | Target URL for HTTP requests |
| method | string | Required | HTTP method (GET or POST) |
| pull_interval | duration | Optional | Frequency of requests (default: 1m) |
| headers | array | Optional | Static HTTP headers as key-value pairs |
| parameters | array | Optional | Static query parameters |
| endpoint_expression | string | Optional¹ | Dynamic endpoint using OTTL expressions |
| header_expressions | array[object] | Optional | Dynamic headers using OTTL expressions |
| parameter_expressions | array[object] | Optional | Dynamic query parameters using OTTL |
| retry_http_code | array[int] | Optional | HTTP codes that trigger retry |
| authentication | object | Optional | Basic authentication configuration |
| authentication.strategy | string | Optional | Authentication strategy |
| authentication.username | string | Optional | Username for basic auth |
| authentication.password | string | Optional | Password for basic auth |
| authorization | object | Optional | OAuth/Bearer token configuration |
| authorization.strategy | string | Optional | Auth strategy (e.g., oauth_client_credentials) |
| authorization.client_credentials | object | Optional | OAuth client credentials config |
| tls | object | Optional | TLS/SSL configuration |
| pagination | object | Optional | Automatic pagination configuration |
| pagination.url_json_path | string | Optional² | Dot notation path for next page URL |
| pagination.response_format | string | Optional | Response format: json, text, binary |
| pagination.max_parallel | int | Optional | Max concurrent pagination requests (default: 5) |
| pagination.inherit_auth | boolean | Optional | Use same auth for followed URLs |
| pagination.error_strategy | string | Optional | Error handling: continue or stop |
| pagination.follow_link_header | boolean | Optional² | Follow RFC 5988 Link headers |
| pagination.link_relation | string | Optional² | Link header relation to follow (default: next) |
| pagination.inherit_headers | string/array | Optional | Header inheritance configuration |
| pagination.same_origin_only | boolean | Optional | Enforce same-origin for inheritance |
| source_metadata | object | Optional | Additional source metadata configuration |

Notes:

  • ¹ Either endpoint or endpoint_expression is required, not both
  • ² Use either url_json_path or link_relation for pagination, not both
  • ³ For url_json_path, use dot notation; escape a literal dot inside a field name as "/." (for example, "@odata/.nextLink")

Complete Configuration Example

nodes:
- name: http_pull_example
  type: http_pull_input

  # Required Parameters
  endpoint: "https://api.example.com/v1/data"  # OR use endpoint_expression below
  method: GET

  # Optional: Dynamic endpoint (alternative to static endpoint)
  # endpoint_expression: Concat(["https://", EDXEnv("API_HOST", "api.example.com"), "/v1/data"], "")
  # WARNING: OTTL expressions MUST be single-line (multi-line will fail validation)

  # Optional: Request frequency (default: 1m)
  pull_interval: 30s

  # Optional: Static headers
  headers:
    - header: Accept
      value: application/json
    - header: User-Agent
      value: EdgeDelta/1.0

  # Optional: Dynamic headers using OTTL expressions
  header_expressions:
    - header: Authorization
      value_expression: Concat(["Bearer ", EDXEnv("API_TOKEN", "")], "")
    - header: X-Request-ID
      value_expression: Concat(["req-", String(UnixMilli(Now()))], "")

  # Optional: Static query parameters
  parameters:
    - name: format
      value: json
    - name: limit
      value: "100"

  # Optional: Dynamic query parameters using OTTL expressions
  parameter_expressions:
    - name: since
      value_expression: FormatTime(Now() - Duration("1h"), "%Y-%m-%dT%H:%M:%SZ")
    - name: until
      value_expression: FormatTime(Now(), "%Y-%m-%dT%H:%M:%SZ")

  # Optional: Retry configuration
  retry_http_code:
    - 409  # Conflict
    - 429  # Too Many Requests
    - 503  # Service Unavailable

  # Optional: Pagination configuration
  pagination:
    # For JSON-based pagination (dot notation, NOT JSONPath):
    url_json_path: "next"  # Use "data/.next_page/.url" if field name contains dots
    # OR for Link header pagination:
    # link_relation: "next"
    max_parallel: 3

  # Optional: Source metadata
  source_metadata:
    tags:
      environment: production
      source: api

Basic Setup

Essential parameters to get started with HTTP Pull sources.

name

Type: string | Required: Yes

Unique identifier for this HTTP pull node within your Edge Delta configuration.

nodes:
- name: my_api_pull
  type: http_pull_input

type

Type: string | Required: Yes | Value: http_pull_input

Specifies the node type. Must always be set to http_pull_input for HTTP pull sources.

nodes:
- name: my_api_pull
  type: http_pull_input

endpoint

Type: string | Required: Yes (unless using endpoint_expression)

The URL to which HTTP requests are sent. Must be a valid URL with protocol (http/https).

nodes:
- name: my_api_pull
  type: http_pull_input
  endpoint: https://api.example.com/logs
  method: GET

Common Use Cases:

  • Static API endpoints that don’t change
  • Development/test environments with fixed URLs
  • Simple data retrieval scenarios

method

Type: string | Required: Yes | Values: GET, POST

The HTTP method used for requests. Currently supports GET and POST methods.

nodes:
- name: my_api_pull
  type: http_pull_input
  endpoint: https://api.example.com/logs
  method: GET

pull_interval

Type: duration | Required: No | Default: 1m

Frequency at which HTTP requests are sent to the endpoint. Accepts duration strings like 30s, 5m, 1h.

nodes:
- name: my_api_pull
  type: http_pull_input
  endpoint: https://api.example.com/logs
  method: GET
  pull_interval: 30s  # Poll every 30 seconds

Common Use Cases:

  • High-frequency monitoring: 10s to 30s
  • Standard logging: 1m to 5m
  • Batch processing: 15m to 1h

headers

Type: array | Required: No

Static HTTP headers to include with each request. Specified as an array of header-value pairs.

nodes:
- name: my_api_pull
  type: http_pull_input
  endpoint: https://api.example.com/logs
  method: GET
  headers:
    - header: Accept
      value: application/json
    - header: X-API-Version
      value: "2.0"
    - header: User-Agent
      value: EdgeDelta/1.0

Figure: HTTP Pull Headers Configuration. Static headers configuration for HTTP requests.

Common Use Cases:

  • Content type specification
  • API versioning headers
  • Custom client identification

parameters

Type: array | Required: No

Static query parameters to append to the request URL. Specified as name-value pairs.

nodes:
- name: my_api_pull
  type: http_pull_input
  endpoint: https://api.example.com/logs
  method: GET
  parameters:
    - name: format
      value: json
    - name: page_size
      value: "100"
    - name: include_metadata
      value: "true"

Figure: HTTP Pull Parameters Configuration. Static query parameters configuration for API requests.

Common Use Cases:

  • Result formatting options
  • Page size configuration
  • Filter specifications
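
As a rough illustration of how static parameters end up in the request URL, the Python sketch below appends name-value pairs to an endpoint's query string. The helper name build_url is hypothetical, and the exact encoding the agent applies is an assumption; this is not the agent's implementation.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def build_url(endpoint, parameters):
    """Append name-value parameters to the endpoint's query string (hypothetical helper)."""
    parts = urlsplit(endpoint)
    # Keep any query string that is already present on the endpoint.
    query = parse_qsl(parts.query, keep_blank_values=True)
    query += [(p["name"], p["value"]) for p in parameters]
    return urlunsplit(parts._replace(query=urlencode(query)))


# Same name-value pairs as the YAML example above.
params = [
    {"name": "format", "value": "json"},
    {"name": "page_size", "value": "100"},
    {"name": "include_metadata", "value": "true"},
]
print(build_url("https://api.example.com/logs", params))
```

Running the equivalent curl request against the resulting URL is a quick way to confirm that the API accepts the parameter names before adding them to the node.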

Dynamic Configuration (OTTL)

Configure runtime-evaluated parameters using OpenTelemetry Transformation Language (OTTL) expressions. The HTTP Pull input supports OTTL expressions through dedicated configuration fields, allowing you to use environment variables, timestamps, and other dynamic data without hardcoding values in your configuration.

OTTL Best Practices

  • Single-Line Format: All OTTL expressions must be written on a single line
  • Secure Credentials: Always use EDXEnv() for API tokens and sensitive values instead of hardcoding them
  • Use Dedicated Fields: Always use *_expression fields for OTTL expressions, not the regular fields
  • Time Windows: Use Duration() with Now() for relative time queries instead of absolute timestamps
  • Fallback Values: Provide meaningful fallback values in EDXEnv() calls for better error handling
  • Expression Testing: Test OTTL expressions in a development environment before deploying to production
  • Mix Static and Dynamic: You can use both static fields and expression fields in the same configuration

Common OTTL Converter Functions

Edge Delta supports all OpenTelemetry converter functions plus custom EDX extensions. Converters return values without modifying data:

Time Functions

| Function | Description | Example |
|---|---|---|
| Now() | Current timestamp | Now() |
| Time() | Parse time strings | Time("2024-01-01", "%Y-%m-%d") |
| FormatTime() | Format timestamp | FormatTime(Now(), "%Y-%m-%d") |
| Duration() | Parse duration | Duration("10m") |
| UnixSeconds() | Unix timestamp (seconds) | UnixSeconds(Now()) |
| UnixMilli() | Unix timestamp (milliseconds) | UnixMilli(Now()) |
| UnixMicro() | Unix timestamp (microseconds) | UnixMicro(Now()) |
| UnixNano() | Unix timestamp (nanoseconds) | UnixNano(Now()) |

String Functions

| Function | Description | Example |
|---|---|---|
| Concat() | Concatenate strings | Concat(["Bearer ", token], "") |
| Format() | Printf-style formatting | Format("%s-%d", ["prefix", 123]) |
| String() | Convert to string | String(123) |
| Substring() | Extract substring | Substring("hello", 0, 2) |
| Split() | Split string | Split("a,b,c", ",") |
| Trim() | Trim whitespace | Trim(" text ") |
| ToLowerCase() | Convert to lowercase | ToLowerCase("HELLO") |
| ToUpperCase() | Convert to uppercase | ToUpperCase("hello") |
| ConvertCase() | Convert string case | ConvertCase("hello_world", "camel") |

Type Conversion

| Function | Description | Example |
|---|---|---|
| Int() | Convert to integer | Int("123") |
| Double() | Convert to double | Double("123.45") |
| ParseInt() | Parse integer with base | ParseInt("FF", 16) |
| Hex() | Convert to hex string | Hex(255) |
| Base64Decode() | Decode base64 | Base64Decode("aGVsbG8=") |

EDX Custom Functions

| Function | Description | Example |
|---|---|---|
| EDXEnv() | Get environment variable | EDXEnv("API_KEY", "default") |
| EDXEncode() | Encode data | EDXEncode("data", "base64", false) |
| EDXDecode() | Decode data | EDXDecode("aGVsbG8=", "base64") |
| EDXHmac() | Generate HMAC | EDXHmac(data, "secret", "sha256", "hex") |
| EDXIfElse() | Conditional expression | EDXIfElse(condition, "true_val", "false_val") |
| EDXRedis() | Redis state management | EDXRedis("GET", "last_pull_timestamp") |

For the complete list of 100+ OTTL functions, see the OpenTelemetry documentation. For EDX-specific OTTL functions, see the EDX OTTL Functions Reference.

endpoint_expression

Type: string | Required: No (alternative to endpoint)

Dynamically construct the endpoint URL using OTTL expressions. Useful for environment-specific configurations.

nodes:
- name: dynamic_api_pull
  type: http_pull_input
  # Dynamic endpoint based on environment
  endpoint_expression: Concat(["https://", EDXEnv("API_HOST", "api.example.com"), "/v1/logs"], "")
  method: GET

Common Use Cases:

  • Environment-specific endpoints (dev/staging/prod)
  • Dynamic API versioning
  • Multi-tenant configurations

header_expressions

Type: array[object] | Required: No

Define HTTP headers using OTTL expressions that are evaluated at runtime.

nodes:
- name: secure_api_pull
  type: http_pull_input
  endpoint: https://api.example.com/logs
  method: GET
  header_expressions:
    # Bearer token from environment
    - header: Authorization
      value_expression: Concat(["Bearer ", EDXEnv("API_TOKEN", "")], "")
    # Unique request ID with timestamp
    - header: X-Request-ID
      value_expression: Concat(["req-", String(UnixMilli(Now()))], "")
    # Dynamic version from environment
    - header: X-API-Version
      value_expression: EDXEnv("API_VERSION", "2.0")

Figure: HTTP Pull Header Expressions Configuration. Dynamic header expressions using OTTL for runtime evaluation.

Common Use Cases:

  • Secure credential management
  • Request tracking and correlation
  • Dynamic authentication tokens
  • Time-based headers

parameter_expressions

Type: array[object] | Required: No

Define query parameters using OTTL expressions for dynamic values.

nodes:
- name: time_window_api
  type: http_pull_input
  endpoint: https://api.example.com/events
  method: GET
  parameter_expressions:
    # ISO 8601 timestamps for time windows
    - name: start_time
      value_expression: FormatTime(Now() - Duration("1h"), "%Y-%m-%dT%H:%M:%SZ")
    - name: end_time
      value_expression: FormatTime(Now(), "%Y-%m-%dT%H:%M:%SZ")
    # Unix timestamps in seconds (common for many APIs)
    - name: from_unix_sec
      value_expression: String(UnixSeconds(Now() - Duration("24h")))
    - name: to_unix_sec
      value_expression: String(UnixSeconds(Now()))
    # Unix timestamps in milliseconds (JavaScript-style)
    - name: from_unix_ms
      value_expression: String(UnixMilli(Now() - Duration("24h")))
    - name: to_unix_ms
      value_expression: String(UnixMilli(Now()))
    # Dynamic limit from environment
    - name: limit
      value_expression: EDXEnv("PULL_LIMIT", "100")

Figure: HTTP Pull Parameter Expressions Configuration. Dynamic parameter expressions using OTTL for runtime evaluation.

Common Use Cases:

  • Time-windowed queries
  • Pagination tokens
  • Dynamic filtering
  • Environment-based configuration

Unix Timestamp Parameters

Many APIs require Unix timestamps for time-based queries. The following configuration generates them with both second and millisecond precision:

nodes:
- name: api_with_unix_time
  type: http_pull_input
  endpoint: https://api.example.com/logs
  method: GET
  parameter_expressions:
    # Unix timestamp in seconds - last 24 hours
    - name: start_time
      value_expression: String(UnixSeconds(Now() - Duration("24h")))
    # Unix timestamp in milliseconds - current time
    - name: end_time
      value_expression: String(UnixMilli(Now()))
    # Alternative: specific date in Unix seconds
    - name: since_date
      value_expression: String(UnixSeconds(Time("2024-01-01", "%Y-%m-%d")))
  pull_interval: 1h

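With pull_interval set to 1h, each pull requests a 24-hour window that ends at the current time, so consecutive pulls overlap by about 23 hours. This avoids gaps but returns mostly duplicate data. To limit duplication, match the lookback to the pull interval (for example, Duration("1h")) or deduplicate downstream. The example mixes seconds and milliseconds only to show both forms; use a single precision for both bounds when querying a real API.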
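
To preview the kind of values such expressions produce before deploying them, the following Python sketch computes the same window bounds from a fixed "now". The function name window_params and the output key names are hypothetical; the sketch only mirrors the arithmetic and does not evaluate OTTL.

```python
from datetime import datetime, timedelta, timezone


def window_params(now, lookback):
    """Return window bounds as ISO 8601, Unix seconds, and Unix milliseconds."""
    start = now - lookback
    iso = "%Y-%m-%dT%H:%M:%SZ"  # same layout as the FormatTime() examples
    return {
        "start_iso": start.strftime(iso),
        "end_iso": now.strftime(iso),
        "start_unix_sec": str(int(start.timestamp())),
        "end_unix_sec": str(int(now.timestamp())),
        "start_unix_ms": str(int(start.timestamp() * 1000)),
        "end_unix_ms": str(int(now.timestamp() * 1000)),
    }


# A fixed UTC instant keeps the output reproducible.
now = datetime(2024, 1, 2, tzinfo=timezone.utc)
for name, value in window_params(now, timedelta(hours=24)).items():
    print(f"{name}={value}")
```

Comparing these values with what the target API expects helps catch a seconds-versus-milliseconds mismatch early.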

Migration from v2 Syntax

When migrating from v2 configurations that use the {{ Env }} template syntax, move each templated value into the corresponding expression field:

Old v2 Syntax:

endpoint: "https://{{ Env \"API_HOST\" \"api.example.com\" }}/data"
headers:
  - header: Authorization
    value: "Bearer {{ Env \"TOKEN\" \"default\" }}"

New OTTL Syntax:

endpoint_expression: Concat(["https://", EDXEnv("API_HOST", "api.example.com"), "/data"], "")
header_expressions:
  - header: Authorization
    value_expression: Concat(["Bearer ", EDXEnv("TOKEN", "default")], "")
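
For configurations with many templated values, the rewrite can be scripted. The Python sketch below converts a v2 template string into the equivalent OTTL expression. It assumes the two-argument {{ Env "NAME" "default" }} form shown above is the only template in use; the helper names are hypothetical, and the result still has to be placed in the matching *_expression field by hand.

```python
import re

# Matches the two-argument form shown above: {{ Env "NAME" "default" }}
_ENV = re.compile(r'\{\{\s*Env\s+"([^"]*)"\s+"([^"]*)"\s*\}\}')


def _quote(text):
    """Render a literal as a double-quoted OTTL string."""
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'


def to_ottl(template):
    """Convert a v2 template string to a single-line OTTL expression."""
    parts, pos = [], 0
    for match in _ENV.finditer(template):
        if match.start() > pos:
            parts.append(_quote(template[pos:match.start()]))
        parts.append(f'EDXEnv("{match.group(1)}", "{match.group(2)}")')
        pos = match.end()
    if pos < len(template):
        parts.append(_quote(template[pos:]))
    if not parts:
        return '""'
    if len(parts) == 1:
        return parts[0]  # no concatenation needed
    return "Concat([" + ", ".join(parts) + '], "")'


print(to_ottl('https://{{ Env "API_HOST" "api.example.com" }}/data'))
print(to_ottl('Bearer {{ Env "TOKEN" "default" }}'))
```

The input is the YAML value after unescaping, so the backslash-escaped quotes in the v2 example become plain double quotes.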

Advanced Features

Configure pagination, retry logic, and other advanced capabilities.

retry_http_code

Type: array[int] | Required: No

Additional HTTP status codes that should trigger request retry. The agent has default retry logic for common errors, but you can extend it with custom codes.

nodes:
- name: resilient_api_pull
  type: http_pull_input
  endpoint: https://api.example.com/logs
  method: GET
  retry_http_code:
    - 409  # Conflict
    - 429  # Too Many Requests
    - 502  # Bad Gateway
    - 503  # Service Unavailable

Common Use Cases:

  • Rate limit handling (429)
  • Temporary server issues (503)
  • Gateway timeouts (504)
  • Custom application errors
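
The effect of listing a status code can be pictured with a small Python sketch: a request is repeated while the response status is in the configured set, with a growing delay between attempts. The attempt limit, the doubling delay, and the function name send_with_retry are assumptions made for illustration; this page does not document the agent's default retry codes or backoff.

```python
import time


def send_with_retry(send, retry_codes, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call send() until its status is not in retry_codes or attempts run out.

    send is any callable returning an HTTP status code (assumed interface).
    """
    status = None
    for attempt in range(max_attempts):
        status = send()
        if status not in retry_codes:
            return status
        if attempt < max_attempts - 1:
            # Assumed exponential backoff: base_delay, then 2x, then 4x, ...
            sleep(base_delay * 2 ** attempt)
    return status


# Simulated endpoint: rate-limited, then unavailable, then successful.
responses = iter([429, 503, 200])
delays = []
final = send_with_retry(lambda: next(responses), {409, 429, 502, 503}, sleep=delays.append)
print(final, delays)
```

Injecting the sleep function keeps the sketch fast to run and makes the delay schedule visible.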

pagination

Type: object | Required: No

Configure automatic pagination for APIs that return data across multiple pages. The HTTP Pull input provides intelligent pagination support that automatically discovers and fetches all available data.

How Pagination Works:

  1. Initial Request: The first request is made to the configured endpoint
  2. Page Discovery: The response is checked for pagination information (Link header or JSON field)
  3. Concurrent Fetching: Additional pages are fetched concurrently (up to max_parallel)
  4. Completion: Pagination stops when no more pages are found or on error
  5. Data Processing: All retrieved data is processed and forwarded as logs

The agent can fetch multiple pages simultaneously for faster data retrieval while respecting API rate limits through the max_parallel parameter.
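
The discovery loop described in the steps above can be sketched in Python as follows. This is a simplified, sequential illustration under stated assumptions: fetch is a hypothetical callable returning a page body and the next page URL (or None), the max_pages cap is invented for safety, and the concurrency controlled by max_parallel is not modeled.

```python
def iter_pages(start_url, fetch, max_pages=1000):
    """Yield page bodies, following next-page URLs until none remain.

    fetch(url) must return (body, next_url_or_None); it is an assumed interface.
    """
    seen = set()
    url = start_url
    # Stop on a missing next URL, a URL already visited (cycle), or the page cap.
    while url is not None and url not in seen and len(seen) < max_pages:
        seen.add(url)
        body, url = fetch(url)
        yield body


# Simulated API: page 3 points back to page 1, which would loop forever
# without the visited-URL check.
fake_api = {
    "https://api.example.com/logs": ({"items": [1, 2]}, "https://api.example.com/logs?page=2"),
    "https://api.example.com/logs?page=2": ({"items": [3, 4]}, "https://api.example.com/logs?page=3"),
    "https://api.example.com/logs?page=3": ({"items": [5]}, "https://api.example.com/logs"),
}
pages = list(iter_pages("https://api.example.com/logs", fake_api.__getitem__))
print(len(pages))
```

Tracking visited URLs is one straightforward way to stop on circular pagination; how the agent detects it internally is not described here.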

nodes:
- name: paginated_api_pull
  type: http_pull_input
  endpoint: https://api.example.com/logs
  method: GET
  pagination:
    # Choose one pagination method:
    url_json_path: "next_page_url"  # For JSON-based (dot notation)
    # OR
    # link_relation: "next"  # For Link header
    max_parallel: 5

Figure: HTTP Pull Pagination Configuration. Pagination settings for automatic page discovery and concurrent fetching.

pagination.url_json_path

Type: string | Required: No

Dot notation path to extract the next page URL from the response body. Important: This uses dot notation, NOT JSONPath syntax (no $ prefix).

pagination:
  url_json_path: "next"  # Simple next URL field
  # Common patterns:
  # Microsoft Graph: "@odata/.nextLink"  # Escapes dot in @odata.nextLink
  # Nested object: "meta.pagination.next_url"
  # Field with dots: "data/.next_page/.url"  # Escapes dots in field names

Dot Notation Examples:

  • Simple field: "next_url"
  • Nested object: "pagination.next"
  • Field with dot in name: "response/.data" (for field response.data)
  • Deep nested with dots: "api/.response/.next/.page" (for api.response.next.page)

Real-World API Examples:

// API Response with field names containing dots
{
  "data.next_page.url": "https://api.example.com/page2",
  "meta": {
    "page.info": {
      "next.link": "https://api.example.com/page2"
    }
  }
}

# Correct Edge Delta configuration
pagination:
  # For "data.next_page.url" field (top level)
  url_json_path: "data/.next_page/.url"
  # For nested "meta.page.info.next.link" field
  # url_json_path: "meta.page/.info.next/.link"
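
To check a path against a sample response before deploying, the escaping rule can be modeled in a few lines of Python. The sketch below treats "/." as a literal dot and an unescaped "." as a separator, as in the examples above. The function names are hypothetical and this models the documented notation only; it is not the agent's parser.

```python
def split_path(path):
    """Split a dot-notation path into keys, treating "/." as a literal dot."""
    segments, current, i = [], [], 0
    while i < len(path):
        if path.startswith("/.", i):
            current.append(".")  # escaped dot stays inside the key
            i += 2
        elif path[i] == ".":
            segments.append("".join(current))  # unescaped dot ends the key
            current = []
            i += 1
        else:
            current.append(path[i])
            i += 1
    segments.append("".join(current))
    return segments


def resolve(document, path):
    """Walk nested dicts along the path; return None if any key is missing."""
    node = document
    for key in split_path(path):
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


# The sample response from above.
response = {
    "data.next_page.url": "https://api.example.com/page2",
    "meta": {"page.info": {"next.link": "https://api.example.com/page2"}},
}
print(split_path("meta.page/.info.next/.link"))
print(resolve(response, "data/.next_page/.url"))
```

If resolve returns None for a saved sample response, the path most likely needs an escape or has a misspelled key.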

Common Use Cases:

  • REST APIs with JSON pagination
  • Microsoft Graph API
  • Custom pagination schemes

pagination.link_relation

Type: string | Required: No | Default: next

Link relation to follow from RFC 5988 Link headers.

pagination:
  link_relation: "next"

Example Link Header:

Link: <https://api.example.com/logs?page=2>; rel="next",
      <https://api.example.com/logs?page=10>; rel="last"
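
As an illustration of what following a relation involves, the Python sketch below extracts the URL for a given rel value from a Link header like the one above. It is a simplified parser written for this example (the function name find_link is hypothetical) and does not cover every case in the Web Linking specification; it is not the agent's implementation.

```python
import re


def find_link(header, relation="next"):
    """Return the URL whose rel parameter contains the relation, else None."""
    # Each link-value looks like: <url>; param=value; param="value"
    for match in re.finditer(r"<([^>]*)>([^<]*)", header):
        url, params = match.groups()
        rel = re.search(r'\brel\s*=\s*(?:"([^"]*)"|([^\s;,]+))', params, re.IGNORECASE)
        if rel is None:
            continue
        # A rel value may hold several space-separated relation types.
        relations = (rel.group(1) or rel.group(2) or "").lower().split()
        if relation.lower() in relations:
            return url
    return None


header = (
    '<https://api.example.com/logs?page=2>; rel="next",\n'
    '      <https://api.example.com/logs?page=10>; rel="last"'
)
print(find_link(header))
print(find_link(header, "last"))
```

When no link carries the requested relation the function returns None, which corresponds to the last page.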

Common Use Cases:

  • GitHub API
  • GitLab API
  • Standards-compliant REST APIs

pagination.max_parallel

Type: int | Required: No | Default: 5

Maximum number of concurrent requests when fetching additional pages.

pagination:
  link_relation: "next"
  max_parallel: 3  # Reduce for rate-limited APIs

Common Use Cases:

  • Rate-limited APIs: Use 1-3
  • High-performance APIs: Use 5-10
  • Internal APIs: Use 10+

source_metadata

Type: object | Required: No

Additional metadata configuration for the source. Refer to Edge Delta documentation for specific metadata options.

nodes:
- name: tagged_api_pull
  type: http_pull_input
  endpoint: https://api.example.com/logs
  method: GET
  source_metadata:
    tags:
      environment: production
      service: api-gateway
      region: us-west-2

Vendor-Specific Examples

Ready to integrate with specific platforms? The guides listed under Custom API Integration Examples provide tailored configurations and best practices for popular APIs.

Troubleshooting

Common Issues

No data retrieved:

  • Verify endpoint URL is correct and accessible
  • Check authentication credentials in environment variables
  • Ensure proper headers are set (Accept, Content-Type)
  • Test the endpoint with curl first (see Testing an Endpoint section below)

Pagination not working:

  • Confirm the API returns pagination information
  • Verify the url_json_path (dot notation, not JSONPath) or Link header configuration
  • Check debug logs for pagination attempts
  • Look for messages like “Following pagination URL” in logs
  • Verify the API response contains expected Link header or JSON field

Rate limiting errors:

  • Reduce max_parallel for pagination
  • Increase pull_interval to reduce request frequency
  • Add rate limit status codes to retry_http_code
  • Consider implementing exponential backoff

Authentication failures:

  • Verify environment variables are set correctly
  • Check token expiration and refresh logic
  • Ensure proper header format (Bearer, Basic, etc.)
  • Verify tokens aren’t request-specific

Infinite loops:

  • The agent automatically detects circular pagination
  • Check API documentation for proper pagination handling
  • Verify next page URLs are different from current page

Debug Logging

Enable debug logging to troubleshoot HTTP Pull issues:

# In your agent configuration
log:
  level: debug

Debug logs will show:

  • Request URLs and headers (sensitive values masked)
  • Response status codes and headers
  • Pagination URLs being followed: "Following pagination URL: <url> (page N)"
  • Total pages retrieved: "Total pages retrieved: X"
  • Pagination errors: "Pagination error: <details>"
  • Retry attempts and reasons

Testing an Endpoint

Before configuring Edge Delta, test your endpoint to verify connectivity and response format.

Basic Testing

Shell (Linux/Mac):

# GET request with headers
curl -X GET "https://api.example.com/data?limit=10" \
  -H "Accept: application/json" \
  -H "Authorization: Bearer YOUR_TOKEN"

# POST request with body
curl -X POST "https://api.example.com/data" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -d '{"key": "value"}'

# Test with query parameters
curl -X GET "https://api.example.com/logs?since=2024-01-01&limit=100" \
  -H "Authorization: Bearer YOUR_TOKEN"

PowerShell (Windows):

# GET request with headers
$headers = @{
    'Accept' = 'application/json'
    'Authorization' = 'Bearer YOUR_TOKEN'
}
Invoke-RestMethod -Uri 'https://api.example.com/data?limit=10' `
                  -Method GET `
                  -Headers $headers

# POST request with body
$body = @{
    key = 'value'
} | ConvertTo-Json

Invoke-RestMethod -Uri 'https://api.example.com/data' `
                  -Method POST `
                  -Headers $headers `
                  -Body $body `
                  -ContentType 'application/json'

# Test with query parameters
Invoke-RestMethod -Uri 'https://api.example.com/logs?since=2024-01-01&limit=100' `
                  -Method GET `
                  -Headers $headers

Testing Authentication

Shell (Linux/Mac):

# Bearer token
curl -H "Authorization: Bearer YOUR_TOKEN" \
     https://api.example.com/data

# Basic auth
curl -u username:password \
     https://api.example.com/data

# API key in header
curl -H "X-API-Key: YOUR_KEY" \
     https://api.example.com/data

PowerShell (Windows):

# Bearer token
$headers = @{
    'Authorization' = 'Bearer YOUR_TOKEN'
}
Invoke-RestMethod -Uri 'https://api.example.com/data' `
                  -Headers $headers

# Basic auth
$secpasswd = ConvertTo-SecureString "password" -AsPlainText -Force
$cred = New-Object System.Management.Automation.PSCredential("username", $secpasswd)
Invoke-RestMethod -Uri 'https://api.example.com/data' `
                  -Credential $cred

# API key in header
$headers = @{
    'X-API-Key' = 'YOUR_KEY'
}
Invoke-RestMethod -Uri 'https://api.example.com/data' `
                  -Headers $headers

Verifying Pagination

Shell (Linux/Mac):

# Check for Link headers (GitHub-style)
curl -I "https://api.github.com/orgs/edgedelta/repos?per_page=5" \
  -H "Authorization: Bearer YOUR_TOKEN"
# Look for: Link: <url>; rel="next"

# Check JSON pagination (look for next URL in response)
curl "https://api.example.com/data?page=1" | jq '.next_page_url'
# Note: Use dot notation in Edge Delta config ("next_page_url")

PowerShell (Windows):

# Check for Link headers (GitHub-style)
$response = Invoke-WebRequest -Uri 'https://api.github.com/orgs/edgedelta/repos?per_page=5' `
                             -Headers @{'Authorization'='Bearer YOUR_TOKEN'} `
                             -Method Head
$response.Headers['Link']
# Look for: Link: <url>; rel="next"

# Check JSON pagination (look for next URL in response)
$result = Invoke-RestMethod -Uri 'https://api.example.com/data?page=1'
$result.next_page_url
# Note: Use dot notation in Edge Delta config ("next_page_url")

Custom API Integration Examples

Configure HTTP Pull to retrieve data from various custom and third-party APIs demonstrating different authentication methods, pagination styles, and data formats.

Duo Admin API Integration

Configure HTTP Pull to retrieve audit logs from all Duo Security Admin API endpoints including authentication, administrator, telephony, activity, and offline enrollment logs.

GitHub API Integration

Configure HTTP Pull to retrieve data from GitHub API endpoints with authentication, pagination, and time-based filtering.

Microsoft Graph API Integration

Configure HTTP Pull to retrieve audit logs from all Microsoft Graph API endpoints including directory audits, sign-in logs, provisioning logs, and security alerts using OAuth2 client credentials authentication.

Netskope

Configure HTTP Pull to collect security events from Netskope SASE platform using REST API v2.