Edge Delta HTTP Source
Overview
The HTTP source node directly receives log data from applications that transmit logs over HTTP, which is a common method for centralized log collection, especially in microservice architectures and event-driven architectures.
Note: Customers with HTTP source, TCP source or UDP source nodes should not use or update to Agent version v0.1.97.
AI Team: Configure this source using the HTTPS connector for streamlined setup in AI Team.
- outgoing_data_types: log

Example Configuration
nodes:
  - name: my_http_input
    type: http_input
    port: 3421
    parse_mode: auto
    read_timeout: 10s
    included_paths:
      - /v1/.*
    authentication:
      strategy: Bearer
      secret: "testXYZ"
Required Parameters
name
A descriptive name for the node. This name appears in the pipeline builder, and you can use it to reference the node in the YAML. It must be unique across all nodes. Because it is a YAML list element, it begins with a - and a space, followed by the string. It is a required parameter for all nodes.
nodes:
  - name: <node name>
    type: <node type>
type: http_input
The type parameter specifies the type of node being configured. It is specified as a string from a closed list of node types. It is a required parameter.
nodes:
  - name: <node name>
    type: <node type>
port
Enter the port number that the http_input type node should listen on. It is specified as an integer and is a required parameter.
nodes:
  - name: <node name>
    type: http_input
    port: <port number>
Optional Parameters
included_paths
The included_paths parameter filters incoming traffic so that the node accepts only requests sent to the specified paths. This helps reduce input noise and enables routing to specific HTTP input nodes. It is specified as one or more regex patterns and is an optional parameter. When not specified, the HTTP input accepts requests on all paths.
nodes:
  - name: http_input
    type: http_input
    port: 3421
    read_timeout: 10s
    included_paths:
      - /v1/.*
Cloud Fleet Configuration
When using HTTP input nodes in Cloud Fleets, the included_paths parameter is critical for configuring multiple HTTP inputs:
Important Cloud Fleet Constraints:
- Cloud Fleets support port 80 for HTTP traffic and port 443 for HTTPS/TLS traffic
- Custom ports (e.g., 3421, 8080) cannot be used
- When configuring multiple HTTP inputs on the same port, you must use path filtering with included_paths to separate traffic
- Without included_paths filters, all HTTP input nodes will receive all incoming data, causing duplication
- If you only have one HTTP input node, included_paths is optional and the node will accept all paths
Multiple HTTP Inputs Example:
In this example, three HTTP input nodes all use the same port (80 for HTTP, but could be 443 for HTTPS/TLS) and each accepts different paths. The first node only accepts paths starting with /service-a/, the second only accepts /service-b/ paths, and the third handles /metrics/ paths.
nodes:
  - name: service_a_input
    type: http_input
    port: 80
    included_paths:
      - /service-a/.*
  - name: service_b_input
    type: http_input
    port: 80
    included_paths:
      - /service-b/.*
  - name: metrics_input
    type: http_input
    port: 80
    included_paths:
      - /metrics/.*
Path Filtering Notes:
- Patterns are regular expressions (e.g., /v1/.* matches /v1/logs, /v1/metrics)
- Each HTTP input node should have unique, non-overlapping path patterns
- Requests not matching any included_paths pattern will be rejected
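The routing behavior in the notes above can be illustrated with a short Python sketch. It is not Edge Delta's implementation: the node names and patterns come from the three-node example, while the `route` helper and the use of whole-path matching (`re.fullmatch`) are assumptions made for illustration, since the agent's exact matching semantics are not stated here.

```python
import re

# Hypothetical mirror of the three-node example: node name -> included_paths patterns.
INCLUDED_PATHS = {
    "service_a_input": [r"/service-a/.*"],
    "service_b_input": [r"/service-b/.*"],
    "metrics_input": [r"/metrics/.*"],
}


def route(path: str) -> list:
    """Return the names of the nodes whose patterns match the request path.

    Assumption: a pattern has to match the whole path (re.fullmatch).
    """
    return [
        name
        for name, patterns in INCLUDED_PATHS.items()
        if any(re.fullmatch(pattern, path) for pattern in patterns)
    ]


print(route("/service-a/logs"))  # exactly one node accepts the request
print(route("/unknown/logs"))    # empty list: no node matches, request rejected
```

Because the patterns do not overlap, each path maps to at most one node, which is what prevents duplicated data.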
For troubleshooting Cloud Fleet HTTP input issues, see the Cloud Pipelines Troubleshooting Guide.
parse_mode
The parse_mode parameter controls how incoming HTTP request bodies are processed and split into individual log entries. This is particularly important when receiving JSON payloads that contain embedded newlines, which could previously cause a single JSON object to be incorrectly split into multiple log entries.
Available options:
- auto (default) - Automatically detects the format based on the Content-Type header and payload structure
- json - Explicitly parse as JSON objects, preserving embedded newlines within JSON strings
- lines - Process as line-delimited logs (original behavior, splits on newlines)
When to use each mode:
auto mode (recommended):
- Default behavior that handles most use cases automatically
- Detects JSON based on Content-Type header (application/json)
- Falls back to line processing for non-JSON content
- Best choice when receiving mixed content types
json mode:
- Use when you know all incoming data will be JSON
- Prevents JSON objects with embedded newlines from being split incorrectly
- Handles single JSON objects, JSON arrays, and NDJSON (newline-delimited JSON)
- Falls back to line processing if JSON parsing fails
lines mode:
- Use when receiving plain text logs with one log entry per line
- Each newline character creates a separate log entry
- Original HTTP input behavior before parse_mode was introduced
nodes:
  - name: http_input_json
    type: http_input
    port: 8080
    parse_mode: json # Parse as JSON, preserving embedded newlines
Example - Problem Solved by JSON Mode:
Before parse_mode: json, a JSON payload with embedded newlines would be incorrectly split:
// Input JSON with embedded newlines:
{"id": "123", "message": "Line 1\nLine 2\nLine 3", "level": "INFO"}
// Without parse_mode: json - INCORRECT (3 broken log entries):
{"id": "123", "message": "Line 1
Line 2
Line 3", "level": "INFO"}
// With parse_mode: json - CORRECT (1 complete log entry):
{"id": "123", "message": "Line 1\nLine 2\nLine 3", "level": "INFO"}
Content-Type Header Detection:
When using auto mode, the HTTP input detects JSON based on the Content-Type header:
# This request will be processed as JSON (auto-detected):
curl -X POST \
-H "Content-Type: application/json" \
-d '{"id": "123", "msg": "Log with\nnewlines"}' \
http://localhost:8080/logs
# This request will be processed as lines:
curl -X POST \
-H "Content-Type: text/plain" \
-d 'Line 1
Line 2
Line 3' \
http://localhost:8080/logs
Backward Compatibility:
The default auto mode maintains backward compatibility with existing configurations. No changes are required for deployments that don’t send JSON with embedded newlines.
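The three modes described in this section can be sketched in Python as follows. This is an illustration of the documented behavior, not the agent's code: the function names are hypothetical, the auto branch looks only at the Content-Type header (the payload-structure check is left out), and parsed JSON is re-serialized with `json.dumps`, so whitespace may differ from the input.

```python
import json


def split_lines(body: str) -> list:
    """lines mode: every non-empty line becomes its own log entry."""
    return [line for line in body.splitlines() if line.strip()]


def parse_json(body: str) -> list:
    """json mode: one entry per JSON object, with a fallback to line splitting."""
    try:
        doc = json.loads(body)
    except json.JSONDecodeError:
        try:
            # NDJSON: each line is its own JSON document.
            return [json.dumps(json.loads(line)) for line in split_lines(body)]
        except json.JSONDecodeError:
            return split_lines(body)  # not JSON at all
    items = doc if isinstance(doc, list) else [doc]
    return [json.dumps(item) for item in items]


def parse_body(body: str, content_type: str, parse_mode: str = "auto") -> list:
    if parse_mode == "json":
        return parse_json(body)
    if parse_mode == "lines":
        return split_lines(body)
    # auto: decide from the Content-Type header (assumption: header only).
    if content_type.split(";")[0].strip().lower() == "application/json":
        return parse_json(body)
    return split_lines(body)


# A pretty-printed JSON object spans five physical lines.
pretty = '{\n  "id": "123",\n  "message": "Line 1\\nLine 2",\n  "level": "INFO"\n}'
print(len(parse_body(pretty, "application/json")))  # 1 entry
print(len(parse_body(pretty, "text/plain")))        # 5 fragments
```

The same payload yields one entry or five fragments depending only on the header, which is why clients sending JSON should set `Content-Type: application/json` or the node should be pinned to json mode.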
authentication
The authentication parameter defines the type of authentication. Its strategy is specified as a string; Basic and Bearer are supported. For Bearer authentication you specify a secret token. For Basic, you specify the username and password. Authentication is an optional parameter.
Bearer:
nodes:
  - name: http_input
    type: http_input
    port: 3421
    read_timeout: 10s
    included_paths:
      - /v1/.*
    authentication:
      strategy: Bearer
      secret: "<your_bearer_token>"
Basic:
nodes:
  - name: my_http_input
    type: http_input
    port: 8080
    included_paths:
      - /v1/.*
    authentication:
      strategy: Basic
      username: <username>
      password: <password>
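On the client side, a request for either strategy might be built as in the sketch below. It assumes the node reads credentials from the standard HTTP `Authorization` header (the Bearer and Basic schemes as commonly defined), which this page does not state explicitly. The `build_request` helper, the host, and the path are hypothetical; the port and token come from the example configuration.

```python
import base64
import urllib.request


def build_request(url, body, bearer_token=None, username=None, password=None):
    """Build (but do not send) a POST request carrying an Authorization header."""
    req = urllib.request.Request(url, data=body, method="POST")
    req.add_header("Content-Type", "application/json")
    if bearer_token is not None:
        req.add_header("Authorization", f"Bearer {bearer_token}")
    elif username is not None and password is not None:
        # Basic scheme: base64 of "username:password".
        creds = base64.b64encode(f"{username}:{password}".encode()).decode()
        req.add_header("Authorization", f"Basic {creds}")
    return req


req = build_request(
    "http://localhost:3421/v1/logs",  # hypothetical host and path
    b'{"msg": "hello"}',
    bearer_token="testXYZ",
)
print(req.get_header("Authorization"))
```

Passing `req` to `urllib.request.urlopen` would send it once an agent is listening on that address.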
listen
The listen parameter specifies the address on which the node listens for incoming traffic. It is specified as a string and it is optional.
nodes:
  - name: <node name>
    type: http_input
    port: <port number>
    listen: <host>
rate_limit
The rate_limit parameter enables you to control data ingestion based on system resource usage. This advanced setting helps prevent source nodes from overwhelming the agent by automatically throttling or stopping data collection when CPU or memory thresholds are exceeded.
Use rate limiting to prevent runaway log collection from overwhelming the agent in high-volume sources, protect agent stability in resource-constrained environments with limited CPU/memory, automatically throttle during bursty traffic patterns, and ensure fair resource allocation across source nodes in multi-tenant deployments.
When rate limiting triggers, pull-based sources (File, S3, HTTP Pull) stop fetching new data, push-based sources (HTTP, TCP, UDP, OTLP) reject incoming data, and stream-based sources (Kafka, Pub/Sub) pause consumption. Rate limiting operates at the source node level, where each source with rate limiting enabled independently monitors and enforces its own thresholds.
Configuration Steps:
- Click Add New in the Rate Limit section
- Click Add New for Evaluation Policy
- Select Policy Type:
  - CPU Usage: Monitors CPU consumption and rate limits when usage exceeds defined thresholds. Use for CPU-intensive sources like file parsing or complex transformations.
  - Memory Usage: Monitors memory consumption and rate limits when usage exceeds defined thresholds. Use for memory-intensive sources like large message buffers or caching.
  - AND (composite): Combines multiple sub-policies with AND logic. All sub-policies must be true simultaneously to trigger rate limiting. Use when you want conservative rate limiting (both CPU and memory must be high).
  - OR (composite): Combines multiple sub-policies with OR logic. Any sub-policy can trigger rate limiting. Use when you want aggressive rate limiting (either CPU or memory being high triggers).
- Select Evaluation Mode. Choose how the policy behaves when thresholds are exceeded:
  - Enforce (default): Actively applies rate limiting when thresholds are met. Pull-based sources (File, S3, HTTP Pull) stop fetching new data, push-based sources (HTTP, TCP, UDP, OTLP) reject incoming data, and stream-based sources (Kafka, Pub/Sub) pause consumption. Use in production to protect agent resources.
  - Monitor: Logs when rate limiting would occur without actually limiting data flow. Use for testing thresholds before enforcing them in production.
  - Passthrough: Disables rate limiting entirely while keeping the configuration in place. Use to temporarily disable rate limiting without removing configuration.
- Set Absolute Limits and Relative Limits (for CPU Usage and Memory Usage policies)
Note: If you specify both absolute and relative limits, the system evaluates both conditions and rate limiting triggers when either condition is met (OR logic). For example, if you set the absolute limit to 1.0 CPU cores and the relative limit to 50%, rate limiting triggers when the source uses either 1 full core OR 50% of available CPU, whichever happens first.
For CPU Absolute Limits: Enter value in full core units:
  - 0.1 = one-tenth of a CPU core
  - 0.5 = half a CPU core
  - 1.0 = one full CPU core
  - 2.0 = two full CPU cores
For CPU Relative Limits: Enter percentage of total available CPU (0-100):
  - 50 = 50% of available CPU
  - 75 = 75% of available CPU
  - 85 = 85% of available CPU
For Memory Absolute Limits: Enter value in bytes:
  - 104857600 = 100Mi (100 × 1024 × 1024)
  - 536870912 = 512Mi (512 × 1024 × 1024)
  - 1073741824 = 1Gi (1 × 1024 × 1024 × 1024)
For Memory Relative Limits: Enter percentage of total available memory (0-100):
  - 60 = 60% of available memory
  - 75 = 75% of available memory
  - 80 = 80% of available memory
- Set Refresh Interval (for CPU Usage and Memory Usage policies). Specify how frequently the system checks resource usage:
  - Recommended Values:
    - 10s to 30s for most use cases
    - 5s to 10s for high-volume sources requiring quick response
    - 1m or higher for stable, low-volume sources
The system fetches current CPU/memory usage at the specified refresh interval and uses that value for evaluation until the next refresh. Shorter intervals provide more responsive rate limiting but incur slightly higher overhead, while longer intervals are more efficient but slower to react to sudden resource spikes.
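The way absolute and relative limits combine can be sketched in a few lines of Python. This is an illustration only: the `exceeds_limits` helper is hypothetical, and treating a reading equal to the limit as a trigger (`>=`) is an assumption, since the exact comparison is not specified here.

```python
MIB = 1024 * 1024  # bytes per Mi, for converting memory limits


def exceeds_limits(usage, total, absolute_limit=None, relative_limit=None):
    """Return True when either configured limit is reached (OR logic).

    usage and total share a unit: CPU cores, or bytes for memory.
    relative_limit is a percentage (0-100) of total.
    """
    over_absolute = absolute_limit is not None and usage >= absolute_limit
    over_relative = (
        relative_limit is not None and usage >= total * relative_limit / 100
    )
    return over_absolute or over_relative


# 4-core host with an absolute limit of 1.0 core and a relative limit of 50%:
print(exceeds_limits(1.2, 4, absolute_limit=1.0, relative_limit=50))  # True
print(exceeds_limits(0.8, 4, absolute_limit=1.0, relative_limit=50))  # False
print(512 * MIB)  # 536870912, the byte value for a 512Mi memory limit
```

On the 4-core host the absolute limit (1.0 core) is reached before the relative one (2.0 cores), so it is the condition that triggers first.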
The GUI generates YAML as follows:
# Simple CPU-based rate limiting
nodes:
  - name: <node name>
    type: <node type>
    rate_limit:
      evaluation_policy:
        policy_type: cpu_usage
        evaluation_mode: enforce
        absolute_limit: 0.5 # Limit to half a CPU core
        refresh_interval: 10s
# Simple memory-based rate limiting
nodes:
  - name: <node name>
    type: <node type>
    rate_limit:
      evaluation_policy:
        policy_type: memory_usage
        evaluation_mode: enforce
        absolute_limit: 536870912 # 512Mi in bytes
        refresh_interval: 30s
Composite Policies (AND / OR)
When using AND or OR policy types, you define sub-policies instead of limits. Sub-policies must be siblings (at the same level)—do not nest sub-policies within other sub-policies. Each sub-policy is independently evaluated, and the parent policy’s evaluation mode applies to the composite result.
- AND Logic: All sub-policies must evaluate to true at the same time to trigger rate limiting. Use when you want conservative rate limiting (limit only when CPU AND memory are both high).
- OR Logic: Any sub-policy evaluating to true triggers rate limiting. Use when you want aggressive protection (limit when either CPU OR memory is high).
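The difference between the two composite types can be seen in a small Python sketch. It models only the AND/OR combination of flat sibling sub-policies. The dictionary shape loosely follows the YAML keys, but the `readings` input and both helper functions are hypothetical and not part of Edge Delta.

```python
def leaf_triggered(policy, readings):
    """readings maps policy_type -> current usage as a percentage (hypothetical)."""
    return readings[policy["policy_type"]] >= policy["relative_limit"]


def composite_triggered(policy, readings):
    """Combine flat sibling sub-policies with AND or OR logic."""
    results = [leaf_triggered(sub, readings) for sub in policy["sub_policies"]]
    if policy["policy_type"] == "and":
        return all(results)
    if policy["policy_type"] == "or":
        return any(results)
    raise ValueError("composite policy_type must be 'and' or 'or'")


subs = [
    {"policy_type": "cpu_usage", "relative_limit": 85},
    {"policy_type": "memory_usage", "relative_limit": 80},
]
readings = {"cpu_usage": 90, "memory_usage": 40}  # CPU high, memory low

print(composite_triggered({"policy_type": "and", "sub_policies": subs}, readings))  # False
print(composite_triggered({"policy_type": "or", "sub_policies": subs}, readings))   # True
```

With CPU high and memory low, only the OR policy triggers, which is why OR is the more aggressive choice.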
Configuration Steps:
- Select AND (composite) or OR (composite) as the Policy Type
- Choose the Evaluation Mode (typically Enforce)
- Click Add New under Sub-Policies to add the first condition
- Configure the first sub-policy by selecting policy type (CPU Usage or Memory Usage), selecting evaluation mode, setting absolute and/or relative limits, and setting refresh interval
- In the parent policy (not within the child), click Add New again to add a sibling sub-policy
- Configure additional sub-policies following the same pattern
The GUI generates YAML as follows:
# AND composite policy - both CPU AND memory must exceed limits
nodes:
  - name: <node name>
    type: <node type>
    rate_limit:
      evaluation_policy:
        policy_type: and
        evaluation_mode: enforce
        sub_policies:
          # First sub-policy (sibling)
          - policy_type: cpu_usage
            evaluation_mode: enforce
            absolute_limit: 0.75 # Limit to 75% of one core
            refresh_interval: 15s
          # Second sub-policy (sibling)
          - policy_type: memory_usage
            evaluation_mode: enforce
            absolute_limit: 1073741824 # 1Gi in bytes
            refresh_interval: 15s
# OR composite policy - either CPU OR memory can trigger
nodes:
  - name: <node name>
    type: <node type>
    rate_limit:
      evaluation_policy:
        policy_type: or
        evaluation_mode: enforce
        sub_policies:
          - policy_type: cpu_usage
            evaluation_mode: enforce
            relative_limit: 85 # 85% of available CPU
            refresh_interval: 20s
          - policy_type: memory_usage
            evaluation_mode: enforce
            relative_limit: 80 # 80% of available memory
            refresh_interval: 20s
# Monitor mode for testing thresholds
nodes:
  - name: <node name>
    type: <node type>
    rate_limit:
      evaluation_policy:
        policy_type: memory_usage
        evaluation_mode: monitor # Only logs, doesn't limit
        relative_limit: 70 # Test at 70% before enforcing
        refresh_interval: 30s
read_timeout
The read_timeout parameter specifies how long to wait for incoming data. The default value is 0, which means no timeout. It is specified as a duration and it is optional.
nodes:
  - name: <node name>
    type: http_input
    port: <port number>
    read_timeout: 10s
source_metadata
This option is used to define which detected resources and attributes to add to each data item as it is ingested by Edge Delta. You can select:
- Required Only: This option includes the minimum required resources and attributes for Edge Delta to operate.
- Default: This option includes the required resources and attributes plus those selected by Edge Delta.
- High: This option includes the required resources and attributes along with a larger selection of common optional fields.
- Custom: With this option selected, you can choose which attributes and resources to include. The required fields are selected by default and can’t be unchecked.
Based on your selection in the GUI, the source_metadata YAML is populated as two dictionaries (resource_attributes and attributes) with Boolean values.
See Choose Data Item Metadata for more information on selecting metadata.
tls
The tls parameter is a dictionary type that enables a number of options to be set using sub-parameters.
nodes:
  - name: <node name>
    type: http_input
    port: <port number>
    tls:
      <tls options>:
ca_file
The ca_file parameter is a child of the tls parameter. It specifies the CA certificate file. It is specified as a string and is optional.
nodes:
  - name: <node name>
    type: http_input
    port: <port number>
    tls:
      ca_file: /certs/ca.pem
ca_path
The ca_path parameter is a child of the tls parameter. It specifies the location of the CA certificate files. It is specified as a string and is optional.
nodes:
  - name: <node name>
    type: http_input
    port: <port number>
    tls:
      ca_path: /var/etc/kafka
client_auth_type
The client_auth_type parameter is a child of the tls parameter. It specifies the authentication type to use for the connection. It is specified as a string from a closed list and is optional.
The following authentication methods are available:
- noclientcert indicates that no client certificate should be requested during the handshake, and if any certificates are sent they will not be verified.
- requestclientcert indicates that a client certificate should be requested during the handshake, but does not require that the client send any certificates.
- requireanyclientcert indicates that a client certificate should be requested during the handshake, and that at least one certificate is required from the client, but that certificate is not required to be valid.
- verifyclientcertifgiven indicates that a client certificate should be requested during the handshake, but does not require that the client sends a certificate. If the client does send a certificate it is required to be valid.
- requireandverifyclientcert indicates that a client certificate should be requested during the handshake, and that at least one valid certificate is required to be sent by the client.
nodes:
  - name: <node name>
    type: http_input
    port: <port number>
    tls:
      client_auth_type: <auth type>
crt_file
The crt_file parameter is a child of the tls parameter. It specifies the certificate file. It is specified as a string and is optional.
nodes:
  - name: <node name>
    type: http_input
    port: <port number>
    tls:
      crt_file: /certs/server-cert.pem
ignore_certificate_check
The ignore_certificate_check parameter is a child of the tls parameter. When set to true, it ignores certificate checks for the remote endpoint. It is specified as a Boolean value and the default is false, indicating that TLS verification will be performed. This is an optional parameter.
nodes:
  - name: <node name>
    type: http_input
    port: <port number>
    tls:
      ignore_certificate_check: true
key_file
The key_file parameter is a child of the tls parameter. It specifies the key file. It is specified as a string and is optional.
nodes:
  - name: <node name>
    type: http_input
    port: <port number>
    tls:
      key_file: /certs/server-key.pem
key_password
The key_password parameter is a child of the tls parameter. It specifies the key password. When a private key_file location is provided, key_password can also be provided to supply the password of the private key. It is specified as a string and is optional.
nodes:
  - name: <node name>
    type: http_input
    port: <port number>
    tls:
      key_password: <password>
max_version
The max_version parameter is a child of the tls parameter. It specifies the maximum version of TLS to accept. It is specified as a string and is optional.
You can select one of the following options:
TLSv1_0
TLSv1_1
TLSv1_2
TLSv1_3
nodes:
  - name: <node name>
    type: http_input
    port: <port number>
    tls:
      max_version: <TLS version>
min_version
The min_version parameter is a child of the tls parameter. It specifies the minimum version of TLS to accept. It is specified as a string and is optional. The default is TLSv1_2.
You can select one of the following options:
TLSv1_0
TLSv1_1
TLSv1_2
TLSv1_3
nodes:
  - name: <node name>
    type: http_input
    port: <port number>
    tls:
      min_version: <TLS version>
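To check the version bounds from the client side, a test client can be restricted to the same TLS range. The sketch below uses Python's standard `ssl` module. The mapping from the option strings above to `ssl.TLSVersion` members and the `client_context` helper are assumptions for illustration, not part of Edge Delta.

```python
import ssl

# Option strings from this page mapped to Python's ssl.TLSVersion members.
TLS_VERSIONS = {
    "TLSv1_0": ssl.TLSVersion.TLSv1,
    "TLSv1_1": ssl.TLSVersion.TLSv1_1,
    "TLSv1_2": ssl.TLSVersion.TLSv1_2,
    "TLSv1_3": ssl.TLSVersion.TLSv1_3,
}


def client_context(min_version="TLSv1_2", max_version="TLSv1_3"):
    """Build a client-side SSLContext limited to the given TLS version range."""
    ctx = ssl.create_default_context()
    ctx.minimum_version = TLS_VERSIONS[min_version]
    ctx.maximum_version = TLS_VERSIONS[max_version]
    return ctx


ctx = client_context()
print(ctx.minimum_version.name, ctx.maximum_version.name)
```

A context built this way can be passed as the `context` argument of `urllib.request.urlopen` when posting to an HTTPS endpoint. A handshake failure then suggests that the client range and the node's min_version/max_version do not overlap.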
Testing an Endpoint
The following command can be used to test the input:
curl -X POST -d '{"json":"my log"}' <host>:<port>/<path>
The path can be any valid URL path. If included_paths is configured, the path must match one of the specified regex patterns. If included_paths is not configured, any path will be accepted.