ExtractGrokPatterns

Learn about the ExtractGrokPatterns OTTL converter function.

The ExtractGrokPatterns converter matches a string against a Grok pattern and returns the named captures as a map.

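Syntax: ExtractGrokPatterns(string, grokPattern)

  • string: the bracket notation location of the string field
  • grokPattern: the grok pattern to use for extraction

The statement below also passes an optional third boolean argument (true). In upstream OTTL this parameter is named namedCapturesOnly and limits the result to named captures.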

Input

{
	"_type": "log",
	"attributes": {
		"decoded_body": "time=1724177404|hostname=CPLPOL32|product=Firewall|layer_name=ENGCORE_MASTER"
	},
	"body": "time=1724177404|hostname=CPLPOL32|product=Firewall|layer_name=ENGCORE_MASTER",
	"resource": {
		"ed.conf.id": "123456789",
		"ed.domain": "pipeline",
		"ed.org.id": "987654321",
		"ed.source.name": "__ed_dummy_test_input",
		"ed.source.type": "memory_input",
		"ed.tag": "loggen",
		"host.ip": "10.0.0.1",
		"host.name": "ED_TEST",
		"service.name": "ed-tester",
		"src_type": "memory_input"
	},
	"timestamp": 1733727200176
}

Statement

set(attributes["grokked"], ExtractGrokPatterns(attributes["decoded_body"], "time=(?P<log_timestamp>\\d+)\\|hostname=(?P<log_hostname>[^|]+)\\|product=(?P<log_product>[^|]+)\\|layer_name=(?P<log_layer_name>[^|]+)", true))

Output

{
	"_type": "log",
	"attributes": {
		"decoded_body": "time=1724177404|hostname=CPLPOL32|product=Firewall|layer_name=ENGCORE_MASTER",
		"grokked": {
			"log_hostname": "CPLPOL32",
			"log_layer_name": "ENGCORE_MASTER",
			"log_product": "Firewall",
			"log_timestamp": "1724177404"
		}
	},
	"body": "time=1724177404|hostname=CPLPOL32|product=Firewall|layer_name=ENGCORE_MASTER",
	"resource": {
		"ed.conf.id": "123456789",
		"ed.domain": "pipeline",
		"ed.org.id": "987654321",
		"ed.source.name": "__ed_dummy_test_input",
		"ed.source.type": "memory_input",
		"ed.tag": "loggen",
		"host.ip": "10.0.0.1",
		"host.name": "ED_TEST",
		"service.name": "ed-tester",
		"src_type": "memory_input"
	},
	"timestamp": 1733727245095
}

The statement applies ExtractGrokPatterns to the decoded_body attribute, which holds the log as a single pipe-delimited string. Each named capture group in the pattern (log_timestamp, log_hostname, log_product, log_layer_name) becomes a key in the returned map, and set stores that map in the grokked attribute. Every extracted value is a string, including log_timestamp.
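To try the pattern outside a pipeline, the same named-group expression can be run with an ordinary regular expression engine. The following is a minimal Python sketch, not the converter's implementation: it assumes the pattern behaves like a plain regular expression with named capture groups, the helper name extract_named_groups is hypothetical, and returning an empty dict on a failed match is a choice made for this sketch rather than documented converter behavior. The doubled backslashes in the statement appear to be string-literal escapes, so each one becomes a single backslash in a Python raw string.

```python
import re

# The pattern from the statement above, written as a Python raw string.
PATTERN = (
    r"time=(?P<log_timestamp>\d+)"
    r"\|hostname=(?P<log_hostname>[^|]+)"
    r"\|product=(?P<log_product>[^|]+)"
    r"\|layer_name=(?P<log_layer_name>[^|]+)"
)


def extract_named_groups(text, pattern):
    """Hypothetical helper: return named captures as a dict, or {} if nothing matches."""
    match = re.search(pattern, text)
    return match.groupdict() if match else {}


decoded_body = (
    "time=1724177404|hostname=CPLPOL32|product=Firewall|layer_name=ENGCORE_MASTER"
)
grokked = extract_named_groups(decoded_body, PATTERN)
print(grokked)
```

The resulting dict has the same four keys and string values as the grokked attribute in the output above.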

Example: Extracting and Parsing Nested Data

This example shows how to extract a message from JSON, parse it with grok patterns, and then use ParseKeyValue to further parse component data.

Input

{
  "_type": "log",
  "timestamp": 1762912482420,
  "body": {
    "message": "2025-11-12T01:54:41Z  INFO service{component=api,version=1.2.3,region=us-west}:request{id=req-94}: User authentication successful",
    "seq": 94
  },
  "resource": {
    "ed.source.name": "kubernetes_input_e389",
    "ed.source.type": "kubernetes_input",
    "k8s.namespace.name": "busy",
    "k8s.pod.name": "test-app"
  },
  "attributes": {}
}

Statements

set(cache["message"], body["message"])
set(attributes["message_data"], ExtractGrokPatterns(cache["message"], "^(?P<log_timestamp>.*Z)  (?P<log_level>[A-Z]+)\\s[a-z]+\\{(?P<component>[^}]+)}:request\\{(?P<request>[^}]+)}: (?P<message_new>[^}]+)"))
// put component data in cache
set(cache["component"], attributes["message_data"]["component"])
// update component info - parse key=value pairs separated by commas
set(attributes["message_data"]["component"], ParseKeyValue(cache["component"], "=", ","))

Output

{
  "_type": "log",
  "timestamp": 1762912482420,
  "body": {
    "message": "2025-11-12T01:54:41Z  INFO service{component=api,version=1.2.3,region=us-west}:request{id=req-94}: User authentication successful",
    "seq": 94
  },
  "resource": {
    "ed.source.name": "kubernetes_input_e389",
    "ed.source.type": "kubernetes_input",
    "k8s.namespace.name": "busy",
    "k8s.pod.name": "test-app"
  },
  "attributes": {
    "message_data": {
      "component": {
        "component": "api",
        "region": "us-west",
        "version": "1.2.3"
      },
      "log_level": "INFO",
      "log_timestamp": "2025-11-12T01:54:41Z",
      "message_new": "User authentication successful",
      "request": "id=req-94"
    }
  }
}

This example demonstrates:

  1. Extracting the “message” string from a parsed JSON body field
  2. Using ExtractGrokPatterns to parse the structured log message into named fields
  3. Further parsing the component field using ParseKeyValue to convert key=value,key=value format into a nested map
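
The two parsing steps can also be sketched in Python to check the pattern against a sample message before deploying it. This is an illustration under stated assumptions, not the pipeline's implementation: parse_key_value is a hypothetical stand-in for ParseKeyValue that only splits on the two delimiters, and every named group uses the (?P<name>...) form, which Python's re module requires.

```python
import re

# Grok-style pattern from the statement above, as a Python raw string.
MESSAGE_PATTERN = (
    r"^(?P<log_timestamp>.*Z)  (?P<log_level>[A-Z]+)\s[a-z]+"
    r"\{(?P<component>[^}]+)}:request\{(?P<request>[^}]+)}: "
    r"(?P<message_new>[^}]+)"
)


def parse_key_value(text, delimiter="=", pair_delimiter=","):
    """Hypothetical stand-in for ParseKeyValue: split pairs, then split key from value."""
    result = {}
    for pair in text.split(pair_delimiter):
        key, _, value = pair.partition(delimiter)
        result[key.strip()] = value.strip()
    return result


message = (
    "2025-11-12T01:54:41Z  INFO service{component=api,version=1.2.3,"
    "region=us-west}:request{id=req-94}: User authentication successful"
)

# Step 1: extract the named fields from the message string.
match = re.search(MESSAGE_PATTERN, message)
message_data = match.groupdict() if match else {}

# Step 2: replace the raw component string with a nested map.
if "component" in message_data:
    message_data["component"] = parse_key_value(message_data["component"])

print(message_data)
```

The request field stays a plain string ("id=req-94") because only the component capture is passed through the key-value step, matching the output above.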