OTTL Cookbook
Overview
This cookbook provides ready-to-use OTTL recipes for common telemetry transformation tasks. Each recipe includes complete input/output examples that you can test in the Edge Delta Visual Pipeline Builder.
When to Use OTTL vs. Built-in Processors
Edge Delta provides dedicated processors for many common tasks. Use OTTL when you need:
- Complex multi-step transformations that combine parsing, conditional logic, and field manipulation
- Custom regex patterns beyond what built-in processors offer
- Cache-based workflows where intermediate values are needed
- Chained transformations where output of one step feeds into another
For simpler single-purpose tasks, consider these built-in processors:
| Task | Built-in Processor |
|---|---|
| Parse JSON | Parse JSON Processor |
| Parse key=value | Parse Key-Value Processor |
| Extract with regex | Parse Regex Processor or Parse Grok Processor |
| Mask sensitive data | Mask Processor |
| Enrich from tables | Lookup Processor |
| Add static fields | Add Field Processor |
| Remove fields | Delete Field Processor |
| Filter/drop logs | Filter Processor |
All recipes follow the standard OTTL pattern:
- Parse or access input data
- Transform using editor and converter functions
- Store results in attributes or update the body
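As a minimal sketch of that pattern (assuming an incoming log whose body is JSON with level and message fields), a transformation looks like:
set(cache["parsed"], ParseJSON(Decode(body, "utf-8")))
set(attributes["level"], cache["parsed"]["level"])
set(attributes["message"], cache["parsed"]["message"]) where cache["parsed"]["message"] != nil
The same three steps — parse into cache, transform, store in attributes — repeat in every recipe below.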
Data Extraction Recipes
Extract IP Addresses from Log Body
Extract IPv4 addresses from unstructured log text using regex patterns.
Tip: For simpler regex extraction, consider the Parse Regex Processor or Parse Grok Processor, which provide a visual interface.
Input
{
"_type": "log",
"body": "Connection from 192.168.1.45 to server 10.0.0.1:8080 established",
"resource": {},
"attributes": {},
"timestamp": 1735833045000
}
Statement
set(attributes, ExtractPatterns(Decode(body, "utf-8"), "(?P<source_ip>\\b\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}\\.\\d{1,3})\\b.*(?P<dest_ip>\\b\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}\\.\\d{1,3})\\b"))
Output
{
"_type": "log",
"body": "Connection from 192.168.1.45 to server 10.0.0.1:8080 established",
"resource": {},
"attributes": {
"source_ip": "192.168.1.45",
"dest_ip": "10.0.0.1"
},
"timestamp": 1735833045000
}
The regex captures two named groups that become attribute fields.
Parse Key-Value Pairs from Logs
Extract structured data from key=value formatted logs.
Tip: The Parse Key-Value Processor handles this with configurable delimiters and no code required.
Input
{
"_type": "log",
"body": "user=admin action=login status=success duration=45ms",
"resource": {},
"attributes": {},
"timestamp": 1735833045000
}
Statement
set(cache["body_str"], Decode(body, "utf-8"))
set(cache["parsed"], ParseKeyValue(cache["body_str"], "=", " "))
merge_maps(attributes, cache["parsed"], "insert")
Output
{
"_type": "log",
"body": "user=admin action=login status=success duration=45ms",
"resource": {},
"attributes": {
"user": "admin",
"action": "login",
"status": "success",
"duration": "45ms"
},
"timestamp": 1735833045000
}
ParseKeyValue splits the log into a map using the specified delimiters.
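The "insert" strategy only adds keys that do not already exist in attributes. If parsed values should overwrite existing attribute keys, "upsert" can be used instead:
merge_maps(attributes, cache["parsed"], "upsert")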
Parse Apache/Nginx Access Logs
Extract fields from standard Apache Combined Log Format.
Tip: The Parse Grok Processor includes built-in patterns for Apache and Nginx logs. See also the Syslog Pack for syslog-wrapped access logs.
Input
{
"_type": "log",
"body": "192.168.1.100 - john [02/Jan/2025:15:30:45 +0000] \"GET /api/users HTTP/1.1\" 200 1234 \"https://example.com\" \"Mozilla/5.0\"",
"resource": {},
"attributes": {},
"timestamp": 1735833045000
}
Statement
set(cache["body"], Decode(body, "utf-8"))
set(attributes, ExtractPatterns(cache["body"], "(?P<client_ip>\\d+\\.\\d+\\.\\d+\\.\\d+)\\s+-\\s+(?P<user>\\S+)\\s+\\[(?P<timestamp>[^\\]]+)\\]\\s+\"(?P<method>\\w+)\\s+(?P<path>\\S+)\\s+(?P<protocol>[^\"]+)\"\\s+(?P<status>\\d+)\\s+(?P<bytes>\\d+)"))
set(attributes["status_code"], Int(attributes["status"]))
set(attributes["response_bytes"], Int(attributes["bytes"]))
delete_key(attributes, "status")
delete_key(attributes, "bytes")
Output
{
"_type": "log",
"body": "192.168.1.100 - john [02/Jan/2025:15:30:45 +0000] \"GET /api/users HTTP/1.1\" 200 1234 \"https://example.com\" \"Mozilla/5.0\"",
"resource": {},
"attributes": {
"client_ip": "192.168.1.100",
"user": "john",
"timestamp": "02/Jan/2025:15:30:45 +0000",
"method": "GET",
"path": "/api/users",
"protocol": "HTTP/1.1",
"status_code": 200,
"response_bytes": 1234
},
"timestamp": 1735833045000
}
The regex extracts each field from the Apache log format, and the Int converter then casts the status code and byte count to integers.
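If you also need the extracted Apache timestamp as a machine-readable value, the Time converter can parse it; this is a sketch that stores the result in an illustrative timestamp_ms attribute (see the OTTL Time Conversion guide for the format directives):
set(attributes["timestamp_ms"], UnixMilli(Time(attributes["timestamp"], "%d/%b/%Y:%H:%M:%S %z")))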
Extract JSON Fields from Body
Parse embedded JSON from the log body and promote fields to attributes.
Tip: The Parse JSON Processor parses JSON automatically with options to merge or replace attributes.
Input
{
"_type": "log",
"body": "{\"level\":\"error\",\"message\":\"Connection failed\",\"code\":500,\"trace_id\":\"abc123\"}",
"resource": {},
"attributes": {},
"timestamp": 1735833045000
}
Statement
set(cache["json"], ParseJSON(Decode(body, "utf-8")))
set(attributes["level"], cache["json"]["level"])
set(attributes["message"], cache["json"]["message"])
set(attributes["error_code"], cache["json"]["code"])
set(attributes["trace_id"], cache["json"]["trace_id"])
set(severity_text, ConvertCase(attributes["level"], "upper"))
Output
{
"_type": "log",
"body": "{\"level\":\"error\",\"message\":\"Connection failed\",\"code\":500,\"trace_id\":\"abc123\"}",
"resource": {},
"attributes": {
"level": "error",
"message": "Connection failed",
"error_code": 500,
"trace_id": "abc123"
},
"severity_text": "ERROR",
"timestamp": 1735833045000
}
ParseJSON converts the body to a map, and fields are extracted to attributes.
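If a pipeline carries a mix of JSON and plain-text logs, a guard condition avoids parse failures on non-JSON bodies. A sketch using a simple prefix check:
set(cache["body"], Decode(body, "utf-8"))
set(cache["json"], ParseJSON(cache["body"])) where IsMatch(cache["body"], "^\\s*\\{")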
Data Masking Recipes
Alternative: The Mask Processor provides built-in patterns for credit cards, emails, IP addresses, and custom regex without writing OTTL code.
Mask Credit Card Numbers
Replace credit card numbers with masked values while preserving the last 4 digits.
Input
{
"_type": "log",
"body": "Payment processed for card 4532015112830366",
"resource": {},
"attributes": {},
"timestamp": 1735833045000
}
Statement
set(cache["body"], Decode(body, "utf-8"))
set(cache["cc_match"], ExtractPatterns(cache["body"], "\\b\\d{12}(?P<last4>\\d{4})\\b"))
set(cache["masked"], Concat(["****-****-****-", cache["cc_match"]["last4"]], "")) where cache["cc_match"]["last4"] != nil
replace_pattern(cache["body"], "\\b\\d{16}\\b", cache["masked"]) where cache["masked"] != nil
set(body, EDXEncode(cache["body"], "utf-8"))
Output
{
"_type": "log",
"body": "Payment processed for card ****-****-****-0366",
"resource": {},
"attributes": {},
"timestamp": 1735833045000
}
This approach first extracts the last 4 digits using a named capture group, then constructs the masked value and replaces the original number.
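A more compact alternative relies on capture-group references in the replace_pattern replacement string (supported by upstream OTTL; verify against your agent version). The $1 reference carries the last 4 digits into the mask:
replace_pattern(cache["body"], "\\b\\d{12}(\\d{4})\\b", "****-****-****-$1")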
Redact Email Addresses
Mask email addresses while preserving the domain for debugging.
Input
{
"_type": "log",
"body": "User john.doe@example.com logged in from 192.168.1.1",
"resource": {},
"attributes": {},
"timestamp": 1735833045000
}
Statement
set(cache["body"], Decode(body, "utf-8"))
set(cache["email_match"], ExtractPatterns(cache["body"], "[a-zA-Z0-9._%+-]+@(?P<domain>[a-zA-Z0-9.-]+\\.[a-zA-Z]{2,})"))
set(cache["redacted"], Concat(["[REDACTED]@", cache["email_match"]["domain"]], "")) where cache["email_match"]["domain"] != nil
replace_pattern(cache["body"], "[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\\.[a-zA-Z]{2,}", cache["redacted"]) where cache["redacted"] != nil
set(body, EDXEncode(cache["body"], "utf-8"))
Output
{
"_type": "log",
"body": "User [REDACTED]@example.com logged in from 192.168.1.1",
"resource": {},
"attributes": {},
"timestamp": 1735833045000
}
The email local part is masked while keeping the domain visible.
Hash Sensitive User IDs
Replace user IDs with hashed values for privacy compliance.
Input
{
"_type": "log",
"body": "Order created",
"resource": {},
"attributes": {
"user_id": "user-12345",
"order_id": "ORD-98765"
},
"timestamp": 1735833045000
}
Statement
set(attributes["user_id_hash"], SHA256(attributes["user_id"]))
set(attributes["user_id"], Substring(attributes["user_id_hash"], 0, 16))
delete_key(attributes, "user_id_hash")
Output
{
"_type": "log",
"body": "Order created",
"resource": {},
"attributes": {
"user_id": "a1b2c3d4e5f67890",
"order_id": "ORD-98765"
},
"timestamp": 1735833045000
}
The user ID is replaced with a truncated SHA256 hash.
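If some records lack a user_id, nil guards keep the statements from hashing or truncating empty values:
set(attributes["user_id_hash"], SHA256(attributes["user_id"])) where attributes["user_id"] != nil
set(attributes["user_id"], Substring(attributes["user_id_hash"], 0, 16)) where attributes["user_id_hash"] != nil
delete_key(attributes, "user_id_hash")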
Mask IP Addresses
Anonymize IP addresses while preserving network segments for analysis.
Input
{
"_type": "log",
"body": "Request from 192.168.100.55 to 10.0.50.25",
"resource": {},
"attributes": {},
"timestamp": 1735833045000
}
Statement
set(cache["body"], Decode(body, "utf-8"))
set(cache["ip_match"], ExtractPatterns(cache["body"], "(?P<prefix>\\d{1,3}\\.\\d{1,3}\\.\\d{1,3})\\.\\d{1,3}"))
set(cache["masked_ip"], Concat([cache["ip_match"]["prefix"], ".xxx"], "")) where cache["ip_match"]["prefix"] != nil
replace_pattern(cache["body"], "\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}", cache["masked_ip"]) where cache["masked_ip"] != nil
set(body, EDXEncode(cache["body"], "utf-8"))
Output
{
"_type": "log",
"body": "Request from 192.168.100.xxx to 10.0.50.xxx",
"resource": {},
"attributes": {},
"timestamp": 1735833045000
}
Each IP's last octet is replaced with xxx; the $1 reference in the replacement keeps that address's own network prefix, so every IP in the line remains usable for subnet analysis.
Data Enrichment Recipes
Alternative: The Lookup Processor enriches data using lookup tables with exact, regex, prefix, and suffix matching. The Add Field Processor adds static values. For dynamic lookups, see the Redis Enrichment Processor.
Add Environment Metadata
Enrich logs with environment and service context.
Tip: For mapping namespaces to environments, a Lookup Processor with a lookup table is often simpler than conditional OTTL statements.
Input
{
"_type": "log",
"body": "Application started",
"resource": {
"k8s.namespace.name": "production",
"k8s.deployment.name": "api-server"
},
"attributes": {},
"timestamp": 1735833045000
}
Statement
set(attributes["env"], "production") where resource["k8s.namespace.name"] == "production"
set(attributes["env"], "staging") where resource["k8s.namespace.name"] == "staging"
set(attributes["env"], "development") where attributes["env"] == nil
set(attributes["service"], resource["k8s.deployment.name"])
set(attributes["team"], "platform") where IsMatch(attributes["service"], "^(api|auth|gateway).*")
set(attributes["team"], "data") where IsMatch(attributes["service"], "^(etl|pipeline|processor).*")
Output
{
"_type": "log",
"body": "Application started",
"resource": {
"k8s.namespace.name": "production",
"k8s.deployment.name": "api-server"
},
"attributes": {
"env": "production",
"service": "api-server",
"team": "platform"
},
"timestamp": 1735833045000
}
Conditional logic maps namespace and service names to business metadata.
Add Request Classification
Classify requests based on path and method patterns.
Tip: A Lookup Processor with regex matching can classify paths without complex OTTL conditions.
Input
{
"_type": "log",
"body": "Request processed",
"resource": {},
"attributes": {
"http.method": "GET",
"http.route": "/api/v2/users/12345"
},
"timestamp": 1735833045000
}
Statement
set(attributes["api_version"], "v2") where IsMatch(attributes["http.route"], "^/api/v2/")
set(attributes["api_version"], "v1") where IsMatch(attributes["http.route"], "^/api/v1/")
set(attributes["resource_type"], "users") where IsMatch(attributes["http.route"], ".*/users.*")
set(attributes["resource_type"], "orders") where IsMatch(attributes["http.route"], ".*/orders.*")
set(attributes["operation"], "read") where attributes["http.method"] == "GET"
set(attributes["operation"], "write") where attributes["http.method"] == "POST" or attributes["http.method"] == "PUT"
set(attributes["operation"], "delete") where attributes["http.method"] == "DELETE"
Output
{
"_type": "log",
"body": "Request processed",
"resource": {},
"attributes": {
"http.method": "GET",
"http.route": "/api/v2/users/12345",
"api_version": "v2",
"resource_type": "users",
"operation": "read"
},
"timestamp": 1735833045000
}
Request attributes are analyzed to add classification fields.
Calculate Response Time Buckets
Categorize response times for latency analysis.
Input
{
"_type": "log",
"body": "Request completed",
"resource": {},
"attributes": {
"duration_ms": 250
},
"timestamp": 1735833045000
}
Statement
set(attributes["latency_bucket"], "fast") where attributes["duration_ms"] < 100
set(attributes["latency_bucket"], "normal") where attributes["duration_ms"] >= 100 and attributes["duration_ms"] < 500
set(attributes["latency_bucket"], "slow") where attributes["duration_ms"] >= 500 and attributes["duration_ms"] < 2000
set(attributes["latency_bucket"], "critical") where attributes["duration_ms"] >= 2000
set(attributes["slo_breach"], true) where attributes["duration_ms"] >= 1000
set(attributes["slo_breach"], false) where attributes["duration_ms"] < 1000
Output
{
"_type": "log",
"body": "Request completed",
"resource": {},
"attributes": {
"duration_ms": 250,
"latency_bucket": "normal",
"slo_breach": false
},
"timestamp": 1735833045000
}
Duration values are categorized into buckets for dashboards and alerts.
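The numeric comparisons above assume duration_ms is already a number. If it arrives as a string (for example "250"), convert it first:
set(attributes["duration_ms"], Double(attributes["duration_ms"])) where IsString(attributes["duration_ms"])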
Data Reduction Recipes
Alternative: The Delete Field Processor removes specific fields, and the Delete Empty Values Processor removes nulls and empty strings.
Compact Apache Logs
Reduce Apache log size by removing unnecessary fields and shortening values.
Input
{
"_type": "log",
"body": "{\"host\":\"192.168.1.45\",\"user\":\"anonymous\",\"method\":\"GET\",\"request\":\"/api/v2/products\",\"protocol\":\"HTTP/2.0\",\"status\":200,\"bytes\":4523,\"referrer\":\"https://example.com/shop/category/electronics\",\"agent\":\"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36\"}",
"resource": {},
"attributes": {},
"timestamp": 1735833045000
}
Statement
set(cache["log"], ParseJSON(Decode(body, "utf-8")))
// Map user agent to short code
set(cache["ua"], "CHR") where IsMatch(String(cache["log"]["agent"]), ".*Chrome.*")
set(cache["ua"], "FFX") where IsMatch(String(cache["log"]["agent"]), ".*Firefox.*")
set(cache["ua"], "SAF") where IsMatch(String(cache["log"]["agent"]), ".*Safari.*") and not IsMatch(String(cache["log"]["agent"]), ".*Chrome.*")
set(cache["ua"], "OTH") where cache["ua"] == nil
// Build compact output
set(body, EDXEncode(Format("{\"h\":\"%s\",\"m\":\"%s\",\"p\":\"%s\",\"s\":%d,\"b\":%d,\"u\":\"%s\"}",
[String(cache["log"]["host"]), String(cache["log"]["method"]), String(cache["log"]["request"]),
Int(cache["log"]["status"]), Int(cache["log"]["bytes"]), cache["ua"]]), "utf-8"))
Output
{
"_type": "log",
"body": "{\"h\":\"192.168.1.45\",\"m\":\"GET\",\"p\":\"/api/v2/products\",\"s\":200,\"b\":4523,\"u\":\"CHR\"}",
"resource": {},
"attributes": {},
"timestamp": 1735833045000
}
The log is reduced from ~350 bytes to ~85 bytes (75% reduction) by removing non-essential fields and shortening keys.
Drop Debug Fields
Remove verbose debugging information from production logs.
Tip: For removing specific known fields, the Delete Field Processor is simpler. Use OTTL delete_matching_keys when you need regex pattern matching.
Input
{
"_type": "log",
"body": "Processing request",
"resource": {},
"attributes": {
"request_id": "req-12345",
"user_id": "user-789",
"debug_stack_trace": "at com.example.Service.process(Service.java:123)...",
"debug_memory_usage": "512MB",
"debug_thread_info": "pool-1-thread-5",
"internal_routing_key": "shard-3"
},
"timestamp": 1735833045000
}
Statement
delete_matching_keys(attributes, "^debug_.*")
delete_matching_keys(attributes, "^internal_.*")
Output
{
"_type": "log",
"body": "Processing request",
"resource": {},
"attributes": {
"request_id": "req-12345",
"user_id": "user-789"
},
"timestamp": 1735833045000
}
All fields matching the debug and internal patterns are removed.
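If it is easier to express which fields to keep rather than which to drop, and your agent's OTTL build includes keep_matching_keys, an allowlist works too (a sketch; the key list here is illustrative):
keep_matching_keys(attributes, "^(request_id|user_id)$")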
Truncate Long Values
Limit field lengths to prevent oversized logs.
Input
{
"_type": "log",
"body": "Error occurred",
"resource": {},
"attributes": {
"error_message": "Connection refused: Unable to establish connection to database server at host db.example.com:5432 after multiple retry attempts with exponential backoff strategy. The connection pool is exhausted and all available connections are in use by other requests.",
"stack_trace": "java.sql.SQLException: Connection refused\n\tat com.example.db.ConnectionPool.getConnection(ConnectionPool.java:234)\n\tat com.example.service.UserService.findUser(UserService.java:89)\n\tat com.example.controller.UserController.getUser(UserController.java:45)"
},
"timestamp": 1735833045000
}
Statement
set(attributes["error_message"], Concat([Substring(attributes["error_message"], 0, 100), "..."], "")) where Len(attributes["error_message"]) > 100
set(attributes["stack_trace"], Concat([Substring(attributes["stack_trace"], 0, 200), "...[truncated]"], "")) where Len(attributes["stack_trace"]) > 200
Output
{
"_type": "log",
"body": "Error occurred",
"resource": {},
"attributes": {
"error_message": "Connection refused: Unable to establish connection to database server at host db.example.com:5432...",
"stack_trace": "java.sql.SQLException: Connection refused\n\tat com.example.db.ConnectionPool.getConnection(ConnectionPool.java:234)\n\tat com.example.service.UserService.findUser(UserService.java:89)\n\tat...[truncated]"
},
"timestamp": 1735833045000
}
Long values are truncated with indicators to show data was shortened.
Conditional Processing Recipes
Alternative: Use a Route Node to send data to different destinations based on conditions. The Filter Processor drops data items based on include/exclude conditions.
Route by Log Level
Apply different transformations based on severity.
Input
{
"_type": "log",
"body": "{\"level\":\"error\",\"message\":\"Database connection failed\",\"code\":\"DB_ERR_001\"}",
"resource": {},
"attributes": {},
"timestamp": 1735833045000
}
Statement
set(cache["log"], ParseJSON(Decode(body, "utf-8")))
set(attributes["level"], cache["log"]["level"])
set(attributes["message"], cache["log"]["message"])
// Error logs get full context
set(attributes["error_code"], cache["log"]["code"]) where attributes["level"] == "error"
set(attributes["alert_priority"], "high") where attributes["level"] == "error"
set(attributes["retention_days"], 90) where attributes["level"] == "error"
// Info logs get minimal context
set(attributes["retention_days"], 7) where attributes["level"] == "info"
// Debug logs are marked for potential dropping
set(attributes["can_drop"], true) where attributes["level"] == "debug"
Output
{
"_type": "log",
"body": "{\"level\":\"error\",\"message\":\"Database connection failed\",\"code\":\"DB_ERR_001\"}",
"resource": {},
"attributes": {
"level": "error",
"message": "Database connection failed",
"error_code": "DB_ERR_001",
"alert_priority": "high",
"retention_days": 90
},
"timestamp": 1735833045000
}
Conditional processing adds different fields based on log severity.
Normalize Status Codes
Convert various status representations to a standard format.
Tip: A Lookup Processor with a status mapping table simplifies code-to-category conversions. See also the Parse Severity Processor for log level normalization.
Input
{
"_type": "log",
"body": "Request processed",
"resource": {},
"attributes": {
"status": "OK",
"code": 200
},
"timestamp": 1735833045000
}
Statement
// Normalize text status to numeric
set(attributes["status_code"], 200) where attributes["status"] == "OK" or attributes["status"] == "success"
set(attributes["status_code"], 400) where attributes["status"] == "bad_request" or attributes["status"] == "invalid"
set(attributes["status_code"], 500) where attributes["status"] == "error" or attributes["status"] == "failure"
// Use existing numeric code if present
set(attributes["status_code"], attributes["code"]) where attributes["code"] != nil
// Add status category
set(attributes["status_category"], "success") where attributes["status_code"] >= 200 and attributes["status_code"] < 300
set(attributes["status_category"], "redirect") where attributes["status_code"] >= 300 and attributes["status_code"] < 400
set(attributes["status_category"], "client_error") where attributes["status_code"] >= 400 and attributes["status_code"] < 500
set(attributes["status_category"], "server_error") where attributes["status_code"] >= 500
// Clean up original fields
delete_key(attributes, "status")
delete_key(attributes, "code")
Output
{
"_type": "log",
"body": "Request processed",
"resource": {},
"attributes": {
"status_code": 200,
"status_category": "success"
},
"timestamp": 1735833045000
}
Various status representations are normalized to standard HTTP codes with categories.
Best Practices
Use Cache for Multi-Step Transformations
Store intermediate values in cache to avoid redundant parsing:
// Good: Parse once, use many times
set(cache["body"], Decode(body, "utf-8"))
set(cache["json"], ParseJSON(cache["body"]))
set(attributes["field1"], cache["json"]["field1"])
set(attributes["field2"], cache["json"]["field2"])
// Bad: Parse multiple times
set(attributes["field1"], ParseJSON(Decode(body, "utf-8"))["field1"])
set(attributes["field2"], ParseJSON(Decode(body, "utf-8"))["field2"])
Order Matters for Conditional Statements
Place more specific conditions before general ones:
// Good: Specific before general
set(attributes["severity"], "critical") where attributes["level"] == "error" and attributes["code"] >= 500
set(attributes["severity"], "error") where attributes["level"] == "error"
set(attributes["severity"], "info") where attributes["level"] == "info"
// Bad: General catches all
set(attributes["severity"], "error") where attributes["level"] == "error"
set(attributes["severity"], "critical") where attributes["level"] == "error" and attributes["code"] >= 500 // Never reached
Clean Up After Transformations
Remove cache and temporary fields after processing:
set(cache["parsed"], ParseJSON(Decode(body, "utf-8")))
set(attributes["user"], cache["parsed"]["user"])
set(attributes["action"], cache["parsed"]["action"])
// Clean up
delete_key(cache, "parsed")
Validate Before Transforming
Check field existence before using values:
set(attributes["user_upper"], ConvertCase(attributes["user"], "upper")) where attributes["user"] != nil and IsString(attributes["user"])
See Also
- OTTL Language Guide - Complete syntax reference
- OTTL Time Conversion - Time and timestamp recipes
- OTTL Editor Functions - All editor functions
- OTTL Converter Functions - All converter functions
- Edge Delta Custom Functions - Extended functionality