Edge Delta Elastic Destination

Configure the Edge Delta Elastic Destination to send logs to Elastic using the elastic_output node.


Overview

The Elastic destination node sends items to an Elastic destination. It sends the raw bytes that are generated by marshaling items as JSON. Before marshaling, the _type field is renamed to __type and _timestamp is renamed to @timestamp.
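The renaming step can be illustrated with a short sketch. This is a simplified model of the behavior, not the agent's actual implementation, and the item shape shown is hypothetical:

```python
import json

def marshal_item(item: dict) -> bytes:
    """Rename reserved fields, then marshal the item as JSON bytes."""
    out = dict(item)
    if "_type" in out:
        out["__type"] = out.pop("_type")
    if "_timestamp" in out:
        out["@timestamp"] = out.pop("_timestamp")
    return json.dumps(out).encode("utf-8")

raw = marshal_item({"_type": "log", "_timestamp": "2024-01-01T00:00:00Z", "body": "hello"})
print(raw)
```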

This node requires Edge Delta agent version v0.1.53 or higher.

Configuring Elastic

You need to configure Elastic to use it as a data destination in Edge Delta. To do this, create a lifecycle policy and an index template. Then update the Edge Delta Pipeline configuration to send data to Elastic.
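For example, a minimal lifecycle policy and a matching index template can be created in Kibana Dev Tools. This is an illustrative sketch: the names edgedelta-policy and edgedelta-template, the index pattern, and the rollover/retention values are placeholders to adapt to your environment:

```
PUT _ilm/policy/edgedelta-policy
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": { "max_size": "50gb", "max_age": "7d" }
        }
      },
      "delete": {
        "min_age": "30d",
        "actions": { "delete": {} }
      }
    }
  }
}

PUT _index_template/edgedelta-template
{
  "index_patterns": ["edgedelta-*"],
  "template": {
    "settings": { "index.lifecycle.name": "edgedelta-policy" }
  }
}
```

With the template in place, any index matching the pattern picks up the lifecycle policy automatically, and the node's index parameter can point at an index covered by that pattern.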

Configure the Edge Delta Agent

Use pipeline builder or the agent YAML to configure the Elastic destination node.

Example Configuration

nodes:
  - name: my_elastic
    type: elastic_output
    index: <REDACTED>
    user: elastic
    password: <REDACTED>
    address:
      - <REDACTED>

Required Parameters

name

A descriptive name for the node. This is the name that appears in pipeline builder, and you can use it to reference this node in the YAML. It must be unique across all nodes. It is a YAML list element, so it begins with a - and a space followed by the string. It is a required parameter for all nodes.

nodes:
  - name: <node name>
    type: <node type>

type: elastic_output

The type parameter specifies the type of node being configured. It is specified as a string from a closed list of node types. It is a required parameter.

nodes:
  - name: <node name>
    type: <node type>

Optional Parameters

address

The address parameter specifies the list of addresses for the Elastic backend. It is specified as a list of strings. It is required unless cloud_id is specified.

nodes:
  - name: <node name>
    type: elastic_output
    address:
      - <address>
    token: <token>

cloud_id

The cloud_id parameter specifies the Cloud ID of the hosted Elastic deployment. It is specified as a string. It is required unless address is specified.

nodes:
  - name: <node name>
    type: elastic_output
    cloud_id: <Cloud ID>
    token: <token>

external_id

The external_id parameter is a unique identifier to avoid a confused deputy attack. It is specified as a string and is optional.

nodes:
  - name: <node name>
    type: elastic_output
    cloud_id: <Cloud ID>
    token: <token>
    external_id: <ID>

index

The index parameter defines which index the node should flush data into. It is specified as a string and is optional.

nodes:
  - name: <node name>
    type: elastic_output
    cloud_id: <Cloud ID>
    token: <token>
    index: <index>

index_expression

The index_expression parameter allows you to dynamically compute the Elastic index name based on data item attributes using an OTTL expression. When specified, this expression is evaluated for each data item and the result is used as the destination index, overriding the static index parameter. This enables flexible index routing based on log content, source, or any other attribute.

Dynamic index support requires Edge Delta agent version v2.8.0 or higher.

nodes:
  - name: <node name>
    type: elastic_output
    cloud_id: <Cloud ID>
    token: <token>
    index: default-logs
    index_expression: attributes["elastic_index"]

Example: Route logs to different indexes based on severity:

nodes:
  - name: elastic_dynamic
    type: elastic_output
    cloud_id: <Cloud ID>
    token: <token>
    index: logs-default
    index_expression: attributes["log_level"]

In this example, logs with attributes["log_level"] = "error" are routed to an index named error, while logs with attributes["log_level"] = "info" go to info. Logs without a log_level attribute fall back to the static index, logs-default.
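The routing logic can be sketched as follows. This is a simplified model mirroring the example above, not the agent's actual code; the log_level attribute and the logs-default fallback are taken from that example:

```python
def resolve_index(attributes: dict, static_index: str = "logs-default") -> str:
    """Return the dynamically computed index, falling back to the static one."""
    value = attributes.get("log_level")
    return value if value else static_index

print(resolve_index({"log_level": "error"}))  # error
print(resolve_index({}))                      # logs-default
```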

keep_overridden_index

The keep_overridden_index parameter specifies whether to retain the original index value in the data item after applying the index_expression. When set to true, the attribute used in the expression remains in the data. When set to false (default), the attribute is removed after being used for routing. It is specified as a Boolean and is optional.

nodes:
  - name: <node name>
    type: elastic_output
    cloud_id: <Cloud ID>
    token: <token>
    index: default-logs
    index_expression: attributes["elastic_index"]
    keep_overridden_index: true

parallel_worker_count

The parallel_worker_count parameter specifies the number of workers that run in parallel for sending data to Elastic. Increasing this value can improve throughput for high-volume data streams. It is specified as an integer, has a default of 5, and is optional.

nodes:
  - name: <node name>
    type: elastic_output
    cloud_id: <Cloud ID>
    token: <token>
    parallel_worker_count: 10

password

The password parameter specifies the password for authentication if user has been specified instead of a token. It is specified as a string and is optional.

This field supports secret references for secure credential management. Instead of hardcoding sensitive values, you can reference a secret configured in your pipeline.

To use a secret in the GUI:

  1. Create a secret in your pipeline’s Settings > Secrets section (see Using Secrets)
  2. In this field, select the secret name from the dropdown list that appears

To use a secret in YAML: Reference it using the syntax: '{{ SECRET secret-name }}'

Example:

field_name: '{{ SECRET my-credential }}'

Note: The secret reference must be enclosed in single quotes when using YAML. Secret values are encrypted at rest and resolved at runtime, ensuring no plaintext credentials appear in logs or API responses.

nodes:
  - name: <node name>
    type: elastic_output
    cloud_id: <Cloud ID>
    user: <username>
    password: <password>

region

The region parameter specifies the region where the OpenSearch cluster is found. It is used with user and password. It is specified as a string and is optional.

nodes:
  - name: <node name>
    type: elastic_output
    user: <username>
    password: <password>
    region: <region>

role_arn

The role_arn parameter is used if authentication and authorization are performed using an assumed AWS IAM role. It should consist of the account ID and role name. A role_arn is optional for a data destination, depending on the access configuration.

nodes:
  - name: <node name>
    type: elastic_output
    cloud_id: <Cloud ID>
    user: <username>
    password: <password>
    region: <region>
    role_arn: <role ARN>

tls

Configure TLS settings for secure connections to this destination. TLS is optional and typically used when connecting to endpoints that require encrypted transport (HTTPS) or mutual TLS.

YAML Configuration Example:

nodes:
  - name: <node name>
    type: <destination type>
    tls:
      <tls options>

Enable TLS

Enables TLS encryption for outbound connections to the destination endpoint. When enabled, all communication with the destination will be encrypted using TLS/SSL. This should be enabled when connecting to HTTPS endpoints or any service that requires encrypted transport. (YAML parameter: enabled)

Default: false

When to use: Enable when the destination requires HTTPS or secure connections. Always enable for production systems handling sensitive data, connections over untrusted networks, or when compliance requirements mandate encryption in transit.

YAML Configuration Example:

nodes:
  - name: <node name>
    type: <destination type>
    tls:
      enabled: true

Ignore Certificate Check

Disables TLS certificate verification, allowing connections to servers with self-signed, expired, or invalid certificates. This bypasses security checks that verify the server’s identity and certificate validity. (YAML parameter: ignore_certificate_check)

Default: false

When to use: Only use in development or testing environments with self-signed certificates. NEVER enable in production—this makes your connection vulnerable to man-in-the-middle attacks. For production with self-signed certificates, use ca_file or ca_path to explicitly trust specific certificates instead.

YAML Configuration Example:

nodes:
  - name: <node name>
    type: <destination type>
    tls:
      ignore_certificate_check: true  # Only for testing!

CA Certificate File

Specifies the absolute path to a CA (Certificate Authority) certificate file used to verify the destination server’s certificate. This allows you to trust specific CAs beyond the system’s default trusted CAs, which is essential when connecting to servers using self-signed certificates or private CAs. (YAML parameter: ca_file)

When to use: Required when connecting to servers with certificates signed by a private/internal CA, or when you want to restrict trust to specific CAs only. Choose either ca_file (single CA certificate) or ca_path (directory of CA certificates), not both.

YAML Configuration Example:

nodes:
  - name: <node name>
    type: <destination type>
    tls:
      ca_file: /certs/ca.pem

CA Certificate Path

Specifies a directory path containing one or more CA certificate files for verifying the destination server’s certificate. Use this when you need to trust multiple CAs or when managing CA certificates across multiple files. All certificate files in the directory will be loaded. (YAML parameter: ca_path)

When to use: Alternative to ca_file when you have multiple CA certificates to trust. Useful for environments with multiple private CAs or when you need to rotate CA certificates without modifying configuration. Choose either ca_file or ca_path, not both.

YAML Configuration Example:

nodes:
  - name: <node name>
    type: <destination type>
    tls:
      ca_path: /certs/ca-certificates/

Certificate File

Path to the client certificate file (public key) used for mutual TLS (mTLS) authentication with the destination server. This certificate identifies the client to the server and must match the private key. The certificate should be in PEM format. (YAML parameter: crt_file)

When to use: Required only when the destination server requires mutual TLS authentication, where both client and server present certificates. Must be used together with key_file. Not needed for standard client TLS connections where only the server presents a certificate.

YAML Configuration Example:

nodes:
  - name: <node name>
    type: <destination type>
    tls:
      crt_file: /certs/client-cert.pem
      key_file: /certs/client-key.pem

Private Key File

Path to the private key file corresponding to the client certificate. This key must match the public key in the certificate file and is used during the TLS handshake to prove ownership of the certificate. Keep this file secure with restricted permissions. (YAML parameter: key_file)

When to use: Required for mutual TLS authentication. Must be used together with crt_file. If the key file is encrypted with a password, also specify key_password. Only needed when the destination server requires client certificate authentication.

YAML Configuration Example:

nodes:
  - name: <node name>
    type: <destination type>
    tls:
      crt_file: /certs/client-cert.pem
      key_file: /certs/client-key.pem
      key_password: <password>  # Only if key is encrypted

Private Key Password

Password (passphrase) used to decrypt an encrypted private key file. Only needed if your private key file is password-protected. If your key file is unencrypted, omit this parameter. (YAML parameter: key_password)

When to use: Optional. Only required if key_file is encrypted/password-protected. For enhanced security, use encrypted keys in production environments. If you receive “bad decrypt” or “incorrect password” errors, verify the password matches the key file encryption.

YAML Configuration Example:

nodes:
  - name: <node name>
    type: <destination type>
    tls:
      crt_file: /certs/client-cert.pem
      key_file: /certs/encrypted-client-key.pem
      key_password: mySecurePassword123

Minimum TLS Version

Minimum TLS protocol version to use when connecting to the destination server. This enforces a baseline security level by refusing to connect if the server doesn’t support this version or higher. (YAML parameter: min_version)

Available versions:

  • TLSv1_0 - Deprecated, not recommended (security vulnerabilities)
  • TLSv1_1 - Deprecated, not recommended (security vulnerabilities)
  • TLSv1_2 - Recommended minimum for production (default)
  • TLSv1_3 - Most secure, use when destination supports it

Default: TLSv1_2

When to use: Set to TLSv1_2 or higher for production deployments. Only use TLSv1_0 or TLSv1_1 if connecting to legacy servers that don’t support newer versions, and be aware of the security risks. TLS 1.0 and 1.1 are officially deprecated.

YAML Configuration Example:

nodes:
  - name: <node name>
    type: <destination type>
    tls:
      min_version: TLSv1_2

Maximum TLS Version

Maximum TLS protocol version to use when connecting to the destination server. This is typically used to restrict newer TLS versions if compatibility issues arise with specific server implementations. (YAML parameter: max_version)

Available versions:

  • TLSv1_0
  • TLSv1_1
  • TLSv1_2
  • TLSv1_3

When to use: Usually left unset to allow the most secure version available. Only set this if you encounter specific compatibility issues with TLS 1.3 on the destination server, or for testing purposes. In most cases, you should allow the latest TLS version.

YAML Configuration Example:

nodes:
  - name: <node name>
    type: <destination type>
    tls:
      max_version: TLSv1_3

token

The token parameter provides authentication for hosted Elastic instances. It is used with the cloud_id parameter. It is specified as a string. A token is optional for a data destination.

This field supports secret references for secure credential management. Instead of hardcoding sensitive values, you can reference a secret configured in your pipeline.

To use a secret in the GUI:

  1. Create a secret in your pipeline’s Settings > Secrets section (see Using Secrets)
  2. In this field, select the secret name from the dropdown list that appears

To use a secret in YAML: Reference it using the syntax: '{{ SECRET secret-name }}'

Example:

field_name: '{{ SECRET my-credential }}'

Note: The secret reference must be enclosed in single quotes when using YAML. Secret values are encrypted at rest and resolved at runtime, ensuring no plaintext credentials appear in logs or API responses.

nodes:
  - name: <node name>
    type: elastic_output
    cloud_id: <Cloud ID>
    token: <token>

user

The user parameter specifies the username for authentication if password has been specified instead of a token. It is specified as a string and is optional.

nodes:
  - name: <node name>
    type: elastic_output
    cloud_id: <Cloud ID>
    user: <username>
    password: <password>

persistent_queue

The persistent_queue configuration enables disk-based buffering to prevent data loss during destination failures or slowdowns. When enabled, the agent stores data on disk and automatically retries delivery when the destination recovers.

Complete example:

persistent_queue:
  path: /var/lib/edgedelta/outputbuffer
  mode: error
  max_byte_size: 1GB
  drain_rate_limit: 1000
  strict_ordering: true

How it works:

  1. Normal operation: Data flows directly to the destination (for error and backpressure modes) or through the buffer (for always mode)
  2. Destination failure: Data is written to disk at the configured path
  3. Recovery: When the destination becomes available, buffered data drains at the configured drain_rate_limit while new data continues flowing
  4. Completion: Buffer clears and normal operation resumes

Key benefits:

  • No data loss: Logs are preserved during destination outages
  • Automatic recovery: No manual intervention required
  • Configurable behavior: Choose when and how buffering occurs based on your needs

path

The path parameter specifies the directory where buffered data is stored on disk.

Example:

persistent_queue:
  path: /var/lib/edgedelta/outputbuffer

Default value: /var/lib/edgedelta/outputbuffer

Requirements:

  • The directory must have sufficient disk space for the configured max_byte_size
  • The agent process must have read/write permissions to this location
  • The path should be on a persistent volume (not tmpfs or memory-backed filesystem)

Best practices:

  • Use dedicated storage for buffer data separate from logs
  • Monitor disk usage to prevent buffer from filling available space
  • Ensure the path persists across agent restarts to maintain buffered data

max_byte_size

The max_byte_size parameter sets the maximum disk space allocated for the persistent buffer. When the buffer reaches this limit, behavior depends on your configuration.

Example:

persistent_queue:
  path: /var/lib/edgedelta/outputbuffer
  max_byte_size: 1GB

Sizing guidance:

  • Small deployments (1-10 logs/sec): 100MB - 500MB
  • Medium deployments (10-100 logs/sec): 500MB - 2GB
  • Large deployments (100+ logs/sec): 2GB - 10GB

Calculation example:

Average log size: 1KB
Expected outage duration: 1 hour
Log rate: 100 logs/sec

Buffer size = 1KB × 100 logs/sec × 3600 sec = 360MB
Recommended: 500MB - 1GB (with safety margin)
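The arithmetic above can be checked in a few lines (using 1 KB = 1000 bytes, as in the example):

```python
avg_log_bytes = 1000      # 1 KB per log
rate = 100                # logs per second
outage_seconds = 3600     # 1 hour expected outage

buffer_bytes = avg_log_bytes * rate * outage_seconds
print(buffer_bytes / (1000 ** 2), "MB")  # 360.0 MB
```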

Important: Set this value based on your disk space availability and expected outage duration. The buffer will accumulate data during destination failures and drain when the destination recovers.

mode

The mode parameter determines when data is buffered to disk. Three modes are available:

  • error (default) - Buffers data only when the destination returns errors (connection failures, HTTP 5xx errors, timeouts). During healthy operation, data flows directly to the destination without buffering.

  • backpressure - Buffers data when the in-memory queue reaches 80% capacity OR when destination errors occur. This mode helps handle slow destinations that respond successfully but take longer than usual to process requests.

  • always - Uses write-ahead-log behavior where all data is written to disk before being sent to the destination. This provides maximum durability but adds disk I/O overhead to every operation.

Example:

persistent_queue:
  path: /var/lib/edgedelta/outputbuffer
  mode: error
  max_byte_size: 1GB

When to use each mode:

  • Use error for most production deployments with reliable destinations
  • Use backpressure when destinations occasionally experience slowdowns but remain healthy
  • Use always for mission-critical data that must survive agent crashes or restarts

strict_ordering

The strict_ordering parameter ensures that buffered data is delivered in the exact order it was generated.

Example:

persistent_queue:
  path: /var/lib/edgedelta/outputbuffer
  strict_ordering: true

Default value: true

Behavior:

  • true - Logs delivered in exact chronological order. Single-threaded processing (slower drain).
  • false - Logs may arrive out of order. Multi-threaded processing enabled (faster drain).

Important: When strict_ordering is true, the parallel_worker_count must be set to 1. Setting it to a higher value will cause configuration validation to fail.
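For example, a configuration sketch that satisfies this constraint (assuming persistent_queue is set on the same destination node, with placeholder credentials):

```
nodes:
  - name: <node name>
    type: elastic_output
    cloud_id: <Cloud ID>
    token: <token>
    parallel_worker_count: 1
    persistent_queue:
      path: /var/lib/edgedelta/outputbuffer
      strict_ordering: true
```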

When to use:

  • Use true (strict ordering) when:

    • Log sequence is critical (debugging, troubleshooting, audit trails)
    • Applications rely on temporal order of events
    • Compliance requirements mandate chronological delivery
  • Use false (no strict ordering) when:

    • Faster buffer drain is more important than order
    • Destination can handle out-of-order data
    • You need parallel processing for high-volume recovery

Performance impact: Disabling strict ordering can significantly speed up buffer drain by enabling parallel workers, but may result in logs arriving out of their original sequence.

drain_rate_limit

The drain_rate_limit parameter controls the maximum events per second (EPS) when draining the persistent buffer after a destination recovers from a failure.

Example:

persistent_queue:
  path: /var/lib/edgedelta/outputbuffer
  drain_rate_limit: 1000

Default value: 1000 EPS

Choosing the right rate:

  • Fast drain (1000-10000 EPS): Minimizes recovery time but may overwhelm the destination
  • Moderate drain (500-1000 EPS): Balanced approach for most use cases
  • Slow drain (100-500 EPS): Gentle recovery for sensitive destinations

Impact on recovery time:

Buffer size: 1GB
Average log size: 1KB
Total items: ~1,000,000 logs

At 1000 EPS: ~17 minutes to drain
At 5000 EPS: ~3.5 minutes to drain
At 100 EPS: ~2.8 hours to drain
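These drain times follow directly from the item count and the rate. The sketch below uses the ~1,000,000-log figure from the example, so the results approximate the times above (the example's exact figures depend on whether 1 GB is counted in decimal or binary units):

```python
total_logs = 1_000_000  # ~1 GB buffer of ~1 KB logs

for eps in (1000, 5000, 100):
    seconds = total_logs / eps
    print(f"{eps} EPS: {seconds / 60:.1f} minutes to drain")
```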

Note: During drain, both current data and buffered data flow to the destination simultaneously. Set this value based on your destination’s capacity to handle additional load during recovery.

Troubleshooting Elastic

Time Format

Check that the correct time format has been configured or the timestamp will not be parsed correctly:

bulk add custom entry operation failed, error type: mapper_parsing_exception, reason: failed to parse field [timestamp] of type [date] in document

See the Elastic documentation for configurable time formats.

The template example Edge Delta provides uses the strict format. If you use the basic format, change the date format as follows:

"mappings": {
      "_meta": {},
      "_routing": {
        "required": false
      },
      "dynamic": true,
      "numeric_detection": false,
      "date_detection": true,
      "dynamic_date_formats": [
        "basic_date_time",
        "yyyyMMdd'T'HHmmss.SSSZ"
      ]
}

If your logs contain multiple timestamp formats, adding basic_date_time may break records that were previously accepted with the strict format. In that case, combine the formats with the OR operator (||) so Elastic accepts multiple formats:

"mappings": {
      "_meta": {},
      "_routing": {
        "required": false
      },
      "dynamic": true,
      "numeric_detection": false,
      "date_detection": true,
      "dynamic_date_formats": [
        "basic_date_time",
        "yyyy/MM/dd HH:mm:ss Z||yyyy/MM/dd Z||yyyyMMdd'T'HHmmss.SSSZ"
      ]
}