OTTL Cookbook

Practical OTTL recipes for common telemetry transformation tasks in Edge Delta, including data extraction, masking, enrichment, and reduction patterns.

Overview

This cookbook provides ready-to-use OTTL recipes for common telemetry transformation tasks. Each recipe includes complete input/output examples that you can test in the Edge Delta Visual Pipeline Builder.

When to Use OTTL vs. Built-in Processors

Edge Delta provides dedicated processors for many common tasks. Use OTTL when you need:

  • Complex multi-step transformations that combine parsing, conditional logic, and field manipulation
  • Custom regex patterns beyond what built-in processors offer
  • Cache-based workflows where intermediate values are needed
  • Chained transformations where output of one step feeds into another

For simpler single-purpose tasks, consider these built-in processors:

Task                     Built-in Processor
Parse JSON               Parse JSON Processor
Parse key=value          Parse Key-Value Processor
Extract with regex       Parse Regex Processor or Parse Grok Processor
Mask sensitive data      Mask Processor
Enrich from tables       Lookup Processor
Add static fields        Add Field Processor
Remove fields            Delete Field Processor
Filter/drop logs         Filter Processor

All recipes follow the standard OTTL pattern:

  1. Parse or access input data
  2. Transform using editor and converter functions
  3. Store results in attributes or update the body
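
Putting these steps together, a minimal sketch (assuming a hypothetical JSON body that contains a user field) looks like this:

// 1. Parse or access input data
set(cache["parsed"], ParseJSON(Decode(body, "utf-8")))
// 2. Transform using converter functions
set(cache["user_upper"], ConvertCase(cache["parsed"]["user"], "upper")) where cache["parsed"]["user"] != nil
// 3. Store results in attributes
set(attributes["user"], cache["user_upper"]) where cache["user_upper"] != nil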

Data Extraction Recipes

Extract IP Addresses from Log Body

Extract IPv4 addresses from unstructured log text using regex patterns.

Tip: For simpler regex extraction, consider the Parse Regex Processor or Parse Grok Processor, both of which provide a visual interface.

Input

{
  "_type": "log",
  "body": "Connection from 192.168.1.45 to server 10.0.0.1:8080 established",
  "resource": {},
  "attributes": {},
  "timestamp": 1735833045000
}

Statement

set(attributes, ExtractPatterns(Decode(body, "utf-8"), "(?P<source_ip>\\b\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}\\.\\d{1,3})\\b.*(?P<dest_ip>\\b\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}\\.\\d{1,3})\\b"))

Output

{
  "_type": "log",
  "body": "Connection from 192.168.1.45 to server 10.0.0.1:8080 established",
  "resource": {},
  "attributes": {
    "source_ip": "192.168.1.45",
    "dest_ip": "10.0.0.1"
  },
  "timestamp": 1735833045000
}

The regex captures two named groups that become attribute fields.
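
When some lines contain no IP address at all, guarding the extraction (a defensive sketch of the same recipe, not a required change) prevents the statement from replacing attributes with an empty result:

set(cache["body"], Decode(body, "utf-8"))
// Only run the extraction when at least one IPv4 address is present
set(attributes, ExtractPatterns(cache["body"], "(?P<source_ip>\\b\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}\\.\\d{1,3})\\b.*(?P<dest_ip>\\b\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}\\.\\d{1,3})\\b")) where IsMatch(cache["body"], "\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}")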

Parse Key-Value Pairs from Logs

Extract structured data from key=value formatted logs.

Tip: The Parse Key-Value Processor handles this with configurable delimiters and no code required.

Input

{
  "_type": "log",
  "body": "user=admin action=login status=success duration=45ms",
  "resource": {},
  "attributes": {},
  "timestamp": 1735833045000
}

Statement

set(cache["body_str"], Decode(body, "utf-8"))
set(cache["parsed"], ParseKeyValue(cache["body_str"], "=", " "))
merge_maps(attributes, cache["parsed"], "insert")

Output

{
  "_type": "log",
  "body": "user=admin action=login status=success duration=45ms",
  "resource": {},
  "attributes": {
    "user": "admin",
    "action": "login",
    "status": "success",
    "duration": "45ms"
  },
  "timestamp": 1735833045000
}

ParseKeyValue splits the log into a map using the specified delimiters.
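
As an optional follow-up (a sketch using the field names from the example above), the parsed duration can be turned into a numeric attribute by stripping the unit suffix:

// Strip the "ms" suffix, convert to an integer, then drop the string field
replace_pattern(attributes["duration"], "ms$", "") where attributes["duration"] != nil
set(attributes["duration_ms"], Int(attributes["duration"])) where attributes["duration"] != nil
delete_key(attributes, "duration")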

Parse Apache/Nginx Access Logs

Extract fields from standard Apache Combined Log Format.

Tip: The Parse Grok Processor includes built-in patterns for Apache and Nginx logs. See also the Syslog Pack for syslog-wrapped access logs.

Input

{
  "_type": "log",
  "body": "192.168.1.100 - john [02/Jan/2025:15:30:45 +0000] \"GET /api/users HTTP/1.1\" 200 1234 \"https://example.com\" \"Mozilla/5.0\"",
  "resource": {},
  "attributes": {},
  "timestamp": 1735833045000
}

Statement

set(cache["body"], Decode(body, "utf-8"))
set(attributes, ExtractPatterns(cache["body"], "(?P<client_ip>\\d+\\.\\d+\\.\\d+\\.\\d+)\\s+-\\s+(?P<user>\\S+)\\s+\\[(?P<timestamp>[^\\]]+)\\]\\s+\"(?P<method>\\w+)\\s+(?P<path>\\S+)\\s+(?P<protocol>[^\"]+)\"\\s+(?P<status>\\d+)\\s+(?P<bytes>\\d+)"))
set(attributes["status_code"], Int(attributes["status"]))
set(attributes["response_bytes"], Int(attributes["bytes"]))
delete_key(attributes, "status")
delete_key(attributes, "bytes")

Output

{
  "_type": "log",
  "body": "192.168.1.100 - john [02/Jan/2025:15:30:45 +0000] \"GET /api/users HTTP/1.1\" 200 1234 \"https://example.com\" \"Mozilla/5.0\"",
  "resource": {},
  "attributes": {
    "client_ip": "192.168.1.100",
    "user": "john",
    "timestamp": "02/Jan/2025:15:30:45 +0000",
    "method": "GET",
    "path": "/api/users",
    "protocol": "HTTP/1.1",
    "status_code": 200,
    "response_bytes": 1234
  },
  "timestamp": 1735833045000
}

The regex extracts each field from the Apache format, then converts numeric fields to integers.
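
With the numeric fields in place, an optional follow-up (a sketch using the attribute names set above) flags error responses for alerting:

set(attributes["is_error"], true) where attributes["status_code"] >= 400
set(attributes["is_error"], false) where attributes["status_code"] != nil and attributes["status_code"] < 400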

Extract JSON Fields from Body

Parse embedded JSON from the log body and promote fields to attributes.

Tip: The Parse JSON Processor parses JSON automatically with options to merge or replace attributes.

Input

{
  "_type": "log",
  "body": "{\"level\":\"error\",\"message\":\"Connection failed\",\"code\":500,\"trace_id\":\"abc123\"}",
  "resource": {},
  "attributes": {},
  "timestamp": 1735833045000
}

Statement

set(cache["json"], ParseJSON(Decode(body, "utf-8")))
set(attributes["level"], cache["json"]["level"])
set(attributes["message"], cache["json"]["message"])
set(attributes["error_code"], cache["json"]["code"])
set(attributes["trace_id"], cache["json"]["trace_id"])
set(severity_text, ConvertCase(attributes["level"], "upper"))

Output

{
  "_type": "log",
  "body": "{\"level\":\"error\",\"message\":\"Connection failed\",\"code\":500,\"trace_id\":\"abc123\"}",
  "resource": {},
  "attributes": {
    "level": "error",
    "message": "Connection failed",
    "error_code": 500,
    "trace_id": "abc123"
  },
  "severity_text": "ERROR",
  "timestamp": 1735833045000
}

ParseJSON converts the body to a map, and fields are extracted to attributes.
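
If every field in the payload should be promoted rather than a hand-picked subset, a shorter variant (a sketch reusing the same cached parse) merges the whole map into attributes:

set(cache["json"], ParseJSON(Decode(body, "utf-8")))
merge_maps(attributes, cache["json"], "upsert")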

Data Masking Recipes

Alternative: The Mask Processor provides built-in patterns for credit cards, emails, IP addresses, and custom regex without writing OTTL code.

Mask Credit Card Numbers

Replace credit card numbers with masked values while preserving the last 4 digits.

Input

{
  "_type": "log",
  "body": "Payment processed for card 4532015112830366",
  "resource": {},
  "attributes": {},
  "timestamp": 1735833045000
}

Statement

set(cache["body"], Decode(body, "utf-8"))
set(cache["cc_match"], ExtractPatterns(cache["body"], "\\b\\d{12}(?P<last4>\\d{4})\\b"))
set(cache["masked"], Concat(["****-****-****-", cache["cc_match"]["last4"]], "")) where cache["cc_match"]["last4"] != nil
replace_pattern(cache["body"], "\\b\\d{16}\\b", cache["masked"]) where cache["masked"] != nil
set(body, EDXEncode(cache["body"], "utf-8"))

Output

{
  "_type": "log",
  "body": "Payment processed for card ****-****-****-0366",
  "resource": {},
  "attributes": {},
  "timestamp": 1735833045000
}

This approach first extracts the last 4 digits using a named capture group, then constructs the masked value and replaces the original number.
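
The same idea applies to structured fields. A sketch, assuming a hypothetical attributes["card_number"] field that holds a 16-digit value:

// Mask all but the last 4 digits of a card number stored in an attribute
replace_pattern(attributes["card_number"], "^\\d{12}", "****-****-****-") where attributes["card_number"] != nil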

Redact Email Addresses

Mask email addresses while preserving the domain for debugging.

Input

{
  "_type": "log",
  "body": "User john.doe@example.com logged in from 192.168.1.1",
  "resource": {},
  "attributes": {},
  "timestamp": 1735833045000
}

Statement

set(cache["body"], Decode(body, "utf-8"))
set(cache["email_match"], ExtractPatterns(cache["body"], "[a-zA-Z0-9._%+-]+@(?P<domain>[a-zA-Z0-9.-]+\\.[a-zA-Z]{2,})"))
set(cache["redacted"], Concat(["[REDACTED]@", cache["email_match"]["domain"]], "")) where cache["email_match"]["domain"] != nil
replace_pattern(cache["body"], "[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\\.[a-zA-Z]{2,}", cache["redacted"]) where cache["redacted"] != nil
set(body, EDXEncode(cache["body"], "utf-8"))

Output

{
  "_type": "log",
  "body": "User [REDACTED]@example.com logged in from 192.168.1.1",
  "resource": {},
  "attributes": {},
  "timestamp": 1735833045000
}

The email local part is masked while keeping the domain visible.
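
If the domain is not needed either, a single replacement (a simpler sketch of the same approach) redacts the whole address:

set(cache["body"], Decode(body, "utf-8"))
replace_pattern(cache["body"], "[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\\.[a-zA-Z]{2,}", "[REDACTED_EMAIL]")
set(body, EDXEncode(cache["body"], "utf-8"))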

Hash Sensitive User IDs

Replace user IDs with hashed values for privacy compliance.

Input

{
  "_type": "log",
  "body": "Order created",
  "resource": {},
  "attributes": {
    "user_id": "user-12345",
    "order_id": "ORD-98765"
  },
  "timestamp": 1735833045000
}

Statement

set(attributes["user_id_hash"], SHA256(attributes["user_id"]))
set(attributes["user_id"], Substring(attributes["user_id_hash"], 0, 16))
delete_key(attributes, "user_id_hash")

Output

{
  "_type": "log",
  "body": "Order created",
  "resource": {},
  "attributes": {
    "user_id": "a1b2c3d4e5f67890",
    "order_id": "ORD-98765"
  },
  "timestamp": 1735833045000
}

The user ID is replaced with a truncated SHA256 hash.
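
If some logs arrive without a user_id, guarding each statement (a defensive sketch of the same recipe) keeps the transformation from adding empty fields:

set(attributes["user_id_hash"], SHA256(attributes["user_id"])) where attributes["user_id"] != nil
set(attributes["user_id"], Substring(attributes["user_id_hash"], 0, 16)) where attributes["user_id_hash"] != nil
delete_key(attributes, "user_id_hash")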

Mask IP Addresses

Anonymize IP addresses while preserving network segments for analysis.

Input

{
  "_type": "log",
  "body": "Request from 192.168.100.55 to 10.0.50.25",
  "resource": {},
  "attributes": {},
  "timestamp": 1735833045000
}

Statement

set(cache["body"], Decode(body, "utf-8"))
set(cache["ip_match"], ExtractPatterns(cache["body"], "(?P<prefix>\\d{1,3}\\.\\d{1,3}\\.\\d{1,3})\\.\\d{1,3}"))
set(cache["masked_ip"], Concat([cache["ip_match"]["prefix"], ".xxx"], "")) where cache["ip_match"]["prefix"] != nil
replace_pattern(cache["body"], "\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}", cache["masked_ip"]) where cache["masked_ip"] != nil
set(body, EDXEncode(cache["body"], "utf-8"))

Output

{
  "_type": "log",
  "body": "Request from 192.168.100.xxx to 10.0.50.xxx",
  "resource": {},
  "attributes": {},
  "timestamp": 1735833045000
}

The last octet is masked while keeping the network prefix for subnet analysis.

Data Enrichment Recipes

Alternative: The Lookup Processor enriches data using lookup tables with exact, regex, prefix, and suffix matching. The Add Field Processor adds static values. For dynamic lookups, see the Redis Enrichment Processor.

Add Environment Metadata

Enrich logs with environment and service context.

Tip: For mapping namespaces to environments, a Lookup Processor with a lookup table is often simpler than conditional OTTL statements.

Input

{
  "_type": "log",
  "body": "Application started",
  "resource": {
    "k8s.namespace.name": "production",
    "k8s.deployment.name": "api-server"
  },
  "attributes": {},
  "timestamp": 1735833045000
}

Statement

set(attributes["env"], "production") where resource["k8s.namespace.name"] == "production"
set(attributes["env"], "staging") where resource["k8s.namespace.name"] == "staging"
set(attributes["env"], "development") where attributes["env"] == nil
set(attributes["service"], resource["k8s.deployment.name"])
set(attributes["team"], "platform") where IsMatch(attributes["service"], "^(api|auth|gateway).*")
set(attributes["team"], "data") where IsMatch(attributes["service"], "^(etl|pipeline|processor).*")

Output

{
  "_type": "log",
  "body": "Application started",
  "resource": {
    "k8s.namespace.name": "production",
    "k8s.deployment.name": "api-server"
  },
  "attributes": {
    "env": "production",
    "service": "api-server",
    "team": "platform"
  },
  "timestamp": 1735833045000
}

Conditional logic maps namespace and service names to business metadata.
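
An optional addition (a sketch; service_key is a hypothetical attribute name) combines the enriched values into one identifier that is convenient for grouping:

set(attributes["service_key"], Concat([attributes["env"], attributes["service"]], "/")) where attributes["env"] != nil and attributes["service"] != nil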

Add Request Classification

Classify requests based on path and method patterns.

Tip: A Lookup Processor with regex matching can classify paths without complex OTTL conditions.

Input

{
  "_type": "log",
  "body": "Request processed",
  "resource": {},
  "attributes": {
    "http.method": "GET",
    "http.route": "/api/v2/users/12345"
  },
  "timestamp": 1735833045000
}

Statement

set(attributes["api_version"], "v2") where IsMatch(attributes["http.route"], "^/api/v2/")
set(attributes["api_version"], "v1") where IsMatch(attributes["http.route"], "^/api/v1/")
set(attributes["resource_type"], "users") where IsMatch(attributes["http.route"], ".*/users.*")
set(attributes["resource_type"], "orders") where IsMatch(attributes["http.route"], ".*/orders.*")
set(attributes["operation"], "read") where attributes["http.method"] == "GET"
set(attributes["operation"], "write") where attributes["http.method"] == "POST" or attributes["http.method"] == "PUT"
set(attributes["operation"], "delete") where attributes["http.method"] == "DELETE"

Output

{
  "_type": "log",
  "body": "Request processed",
  "resource": {},
  "attributes": {
    "http.method": "GET",
    "http.route": "/api/v2/users/12345",
    "api_version": "v2",
    "resource_type": "users",
    "operation": "read"
  },
  "timestamp": 1735833045000
}

Request attributes are analyzed to add classification fields.
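
If the numeric ID in the route is also useful, one more extraction (a sketch; user_id is a hypothetical attribute name) pulls it out of the path:

set(cache["route"], ExtractPatterns(attributes["http.route"], "/users/(?P<user_id>\\d+)")) where attributes["http.route"] != nil
set(attributes["user_id"], cache["route"]["user_id"]) where cache["route"]["user_id"] != nil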

Calculate Response Time Buckets

Categorize response times for latency analysis.

Input

{
  "_type": "log",
  "body": "Request completed",
  "resource": {},
  "attributes": {
    "duration_ms": 250
  },
  "timestamp": 1735833045000
}

Statement

set(attributes["latency_bucket"], "fast") where attributes["duration_ms"] < 100
set(attributes["latency_bucket"], "normal") where attributes["duration_ms"] >= 100 and attributes["duration_ms"] < 500
set(attributes["latency_bucket"], "slow") where attributes["duration_ms"] >= 500 and attributes["duration_ms"] < 2000
set(attributes["latency_bucket"], "critical") where attributes["duration_ms"] >= 2000
set(attributes["slo_breach"], true) where attributes["duration_ms"] >= 1000
set(attributes["slo_breach"], false) where attributes["duration_ms"] < 1000

Output

{
  "_type": "log",
  "body": "Request completed",
  "resource": {},
  "attributes": {
    "duration_ms": 250,
    "latency_bucket": "normal",
    "slo_breach": false
  },
  "timestamp": 1735833045000
}

Duration values are categorized into buckets for dashboards and alerts.
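
If downstream dashboards expect seconds, a small conversion can be added (a sketch, assuming your agent's OTTL build supports arithmetic expressions in set):

set(attributes["duration_s"], Double(attributes["duration_ms"]) / 1000.0) where attributes["duration_ms"] != nil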

Data Reduction Recipes

Alternative: The Delete Field Processor removes specific fields, and the Delete Empty Values Processor removes nulls and empty strings.

Compact Apache Logs

Reduce Apache log size by removing unnecessary fields and shortening values.

Input

{
  "_type": "log",
  "body": "{\"host\":\"192.168.1.45\",\"user\":\"anonymous\",\"method\":\"GET\",\"request\":\"/api/v2/products\",\"protocol\":\"HTTP/2.0\",\"status\":200,\"bytes\":4523,\"referrer\":\"https://example.com/shop/category/electronics\",\"agent\":\"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36\"}",
  "resource": {},
  "attributes": {},
  "timestamp": 1735833045000
}

Statement

set(cache["log"], ParseJSON(Decode(body, "utf-8")))

// Map user agent to short code
set(cache["ua"], "CHR") where IsMatch(String(cache["log"]["agent"]), ".*Chrome.*")
set(cache["ua"], "FFX") where IsMatch(String(cache["log"]["agent"]), ".*Firefox.*")
set(cache["ua"], "SAF") where IsMatch(String(cache["log"]["agent"]), ".*Safari.*") and not IsMatch(String(cache["log"]["agent"]), ".*Chrome.*")
set(cache["ua"], "OTH") where cache["ua"] == nil

// Build compact output
set(body, EDXEncode(Format("{\"h\":\"%s\",\"m\":\"%s\",\"p\":\"%s\",\"s\":%d,\"b\":%d,\"u\":\"%s\"}",
  [String(cache["log"]["host"]), String(cache["log"]["method"]), String(cache["log"]["request"]),
   Int(cache["log"]["status"]), Int(cache["log"]["bytes"]), cache["ua"]]), "utf-8"))

Output

{
  "_type": "log",
  "body": "{\"h\":\"192.168.1.45\",\"m\":\"GET\",\"p\":\"/api/v2/products\",\"s\":200,\"b\":4523,\"u\":\"CHR\"}",
  "resource": {},
  "attributes": {},
  "timestamp": 1735833045000
}

The log is reduced from ~350 bytes to ~85 bytes (75% reduction) by removing non-essential fields and shortening keys.

Drop Debug Fields

Remove verbose debugging information from production logs.

Tip: For removing specific known fields, the Delete Field Processor is simpler. Use OTTL delete_matching_keys when you need regex pattern matching.

Input

{
  "_type": "log",
  "body": "Processing request",
  "resource": {},
  "attributes": {
    "request_id": "req-12345",
    "user_id": "user-789",
    "debug_stack_trace": "at com.example.Service.process(Service.java:123)...",
    "debug_memory_usage": "512MB",
    "debug_thread_info": "pool-1-thread-5",
    "internal_routing_key": "shard-3"
  },
  "timestamp": 1735833045000
}

Statement

delete_matching_keys(attributes, "^debug_.*")
delete_matching_keys(attributes, "^internal_.*")

Output

{
  "_type": "log",
  "body": "Processing request",
  "resource": {},
  "attributes": {
    "request_id": "req-12345",
    "user_id": "user-789"
  },
  "timestamp": 1735833045000
}

All fields matching the debug and internal patterns are removed.
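
The inverse approach, keeping only an allowlist of fields, is possible with the keep_matching_keys editor where the agent's OTTL version includes it (a sketch using the fields from this example):

keep_matching_keys(attributes, "^(request_id|user_id)$")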

Truncate Long Values

Limit field lengths to prevent oversized logs.

Input

{
  "_type": "log",
  "body": "Error occurred",
  "resource": {},
  "attributes": {
    "error_message": "Connection refused: Unable to establish connection to database server at host db.example.com:5432 after multiple retry attempts with exponential backoff strategy. The connection pool is exhausted and all available connections are in use by other requests.",
    "stack_trace": "java.sql.SQLException: Connection refused\n\tat com.example.db.ConnectionPool.getConnection(ConnectionPool.java:234)\n\tat com.example.service.UserService.findUser(UserService.java:89)\n\tat com.example.controller.UserController.getUser(UserController.java:45)"
  },
  "timestamp": 1735833045000
}

Statement

set(attributes["error_message"], Concat([Substring(attributes["error_message"], 0, 100), "..."], "")) where Len(attributes["error_message"]) > 100
set(attributes["stack_trace"], Concat([Substring(attributes["stack_trace"], 0, 200), "...[truncated]"], "")) where Len(attributes["stack_trace"]) > 200

Output

{
  "_type": "log",
  "body": "Error occurred",
  "resource": {},
  "attributes": {
    "error_message": "Connection refused: Unable to establish connection to database server at host db.example.com:5432...",
    "stack_trace": "java.sql.SQLException: Connection refused\n\tat com.example.db.ConnectionPool.getConnection(ConnectionPool.java:234)\n\tat com.example.service.UserService.findUser(UserService.java:89)\n\tat...[truncated]"
  },
  "timestamp": 1735833045000
}

Long values are truncated with indicators to show data was shortened.
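
When every string attribute should respect the same length cap, the truncate_all editor applies one limit across the whole map, though it truncates silently without an indicator (a sketch):

truncate_all(attributes, 200)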

Conditional Processing Recipes

Alternative: Use a Route Node to send data to different destinations based on conditions. The Filter Processor drops data items based on include/exclude conditions.

Route by Log Level

Apply different transformations based on severity.

Input

{
  "_type": "log",
  "body": "{\"level\":\"error\",\"message\":\"Database connection failed\",\"code\":\"DB_ERR_001\"}",
  "resource": {},
  "attributes": {},
  "timestamp": 1735833045000
}

Statement

set(cache["log"], ParseJSON(Decode(body, "utf-8")))
set(attributes["level"], cache["log"]["level"])
set(attributes["message"], cache["log"]["message"])

// Error logs get full context
set(attributes["error_code"], cache["log"]["code"]) where attributes["level"] == "error"
set(attributes["alert_priority"], "high") where attributes["level"] == "error"
set(attributes["retention_days"], 90) where attributes["level"] == "error"

// Info logs get minimal context
set(attributes["retention_days"], 7) where attributes["level"] == "info"

// Debug logs are marked for potential dropping
set(attributes["can_drop"], true) where attributes["level"] == "debug"

Output

{
  "_type": "log",
  "body": "{\"level\":\"error\",\"message\":\"Database connection failed\",\"code\":\"DB_ERR_001\"}",
  "resource": {},
  "attributes": {
    "level": "error",
    "message": "Database connection failed",
    "error_code": "DB_ERR_001",
    "alert_priority": "high",
    "retention_days": 90
  },
  "timestamp": 1735833045000
}

Conditional processing adds different fields based on log severity.

Normalize Status Codes

Convert various status representations to a standard format.

Tip: A Lookup Processor with a status mapping table simplifies code-to-category conversions. See also the Parse Severity Processor for log level normalization.

Input

{
  "_type": "log",
  "body": "Request processed",
  "resource": {},
  "attributes": {
    "status": "OK",
    "code": 200
  },
  "timestamp": 1735833045000
}

Statement

// Normalize text status to numeric
set(attributes["status_code"], 200) where attributes["status"] == "OK" or attributes["status"] == "success"
set(attributes["status_code"], 400) where attributes["status"] == "bad_request" or attributes["status"] == "invalid"
set(attributes["status_code"], 500) where attributes["status"] == "error" or attributes["status"] == "failure"

// Use existing numeric code if present
set(attributes["status_code"], attributes["code"]) where attributes["code"] != nil

// Add status category
set(attributes["status_category"], "success") where attributes["status_code"] >= 200 and attributes["status_code"] < 300
set(attributes["status_category"], "redirect") where attributes["status_code"] >= 300 and attributes["status_code"] < 400
set(attributes["status_category"], "client_error") where attributes["status_code"] >= 400 and attributes["status_code"] < 500
set(attributes["status_category"], "server_error") where attributes["status_code"] >= 500

// Clean up original fields
delete_key(attributes, "status")
delete_key(attributes, "code")

Output

{
  "_type": "log",
  "body": "Request processed",
  "resource": {},
  "attributes": {
    "status_code": 200,
    "status_category": "success"
  },
  "timestamp": 1735833045000
}

Various status representations are normalized to standard HTTP codes with categories.

Best Practices

Use Cache for Multi-Step Transformations

Store intermediate values in cache to avoid redundant parsing:

// Good: Parse once, use many times
set(cache["body"], Decode(body, "utf-8"))
set(cache["json"], ParseJSON(cache["body"]))
set(attributes["field1"], cache["json"]["field1"])
set(attributes["field2"], cache["json"]["field2"])

// Bad: Parse multiple times
set(attributes["field1"], ParseJSON(Decode(body, "utf-8"))["field1"])
set(attributes["field2"], ParseJSON(Decode(body, "utf-8"))["field2"])

Order Matters for Conditional Statements

OTTL statements run in order, and a later statement whose condition matches overwrites the value set by an earlier one. Set general values first and more specific values last, or guard the broader statements so they cannot overwrite a more specific match:

// Good: general first, specific last (the later, more specific match wins)
set(attributes["severity"], "info") where attributes["level"] == "info"
set(attributes["severity"], "error") where attributes["level"] == "error"
set(attributes["severity"], "critical") where attributes["level"] == "error" and attributes["code"] >= 500

// Bad: the general statement runs last and overwrites the specific value
set(attributes["severity"], "critical") where attributes["level"] == "error" and attributes["code"] >= 500
set(attributes["severity"], "error") where attributes["level"] == "error"  // resets "critical" back to "error"

Clean Up After Transformations

Remove cache and temporary fields after processing:

set(cache["parsed"], ParseJSON(Decode(body, "utf-8")))
set(attributes["user"], cache["parsed"]["user"])
set(attributes["action"], cache["parsed"]["action"])
// Clean up
delete_key(cache, "parsed")

Validate Before Transforming

Check field existence before using values:

set(attributes["user_upper"], ConvertCase(attributes["user"], "upper")) where attributes["user"] != nil and IsString(attributes["user"])

See Also