Edge Delta Parse Grok Processor
Overview
The Grok parsing processor extracts structured fields from unstructured log data using Grok patterns. It processes the log body by matching it against a provided Grok pattern, either from the built-in Knowledge Library or a custom pattern. If a match is successful, the extracted fields are stored in the attributes field or in a custom field. If the destination field, such as attributes, already exists, the new fields are merged into it using an upsert strategy. If no match is found (and strict matching is enabled), no new attributes are added.
For detailed instructions on how to use multiprocessors, see Use Multiprocessors.
Grok patterns themselves are human-readable regex macros. The parsing processor uses these patterns to identify meaningful fields like IP addresses, HTTP methods, status codes, etc. You can define your own, select one from the library, or use the AI assistant to generate one from a sample log.
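For example, consider an illustrative HTTP access line (the log line and field names here are arbitrary):

192.168.1.10 GET /index.html 200

Parsing it with the built-in patterns in

%{IP:client} %{WORD:method} %{URIPATHPARAM:request} %{NUMBER:status}

would yield the fields client, method, request, and status with the corresponding values.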
Configuration
Consider this log body:
<14> 1 2025-06-23T10:24:44.849103Z logserver01 monitor 28968 ID167 - New entry
This pattern can be used to parse the log:
<%{NUMBER:syslog_priority}> %{NUMBER:syslog_version} %{TIMESTAMP_ISO8601:timestamp} %{WORD:hostname} %{WORD:appname} %{NUMBER:pid} %{WORD:log_id} - %{GREEDYDATA:message}
The pattern's named captures produce the following fields:
- `%{NUMBER:syslog_priority}`: Extracts the syslog priority number.
- `%{NUMBER:syslog_version}`: Captures the syslog version number.
- `%{TIMESTAMP_ISO8601:timestamp}`: Captures the ISO 8601 timestamp.
- `%{WORD:hostname}`: Captures the hostname.
- `%{WORD:appname}`: Captures the application or service name.
- `%{NUMBER:pid}`: Captures the process ID.
- `%{WORD:log_id}`: Captures the log ID.
- `%{GREEDYDATA:message}`: Captures the rest of the log entry as the message.
The processor is configured in the interface. This configuration can be represented with the following YAML:
- name: kubernetes_input_multiprocessor
  type: sequence
  processors:
  - type: ottl_transform
    metadata: '{"id":"jAV8KAUBPP8WQc1dXREQZ","type":"parse-grok","name":"Parse Grok"}'
    data_types:
    - log
    statements: |-
      merge_maps(attributes, ExtractGrokPatterns(body, "<%{NUMBER:syslog_priority}> %{NUMBER:syslog_version} %{TIMESTAMP_ISO8601:timestamp} %{WORD:hostname} %{WORD:appname} %{NUMBER:pid} %{WORD:log_id} - %{GREEDYDATA:message}", true), "upsert") where IsMap(attributes)
      set(attributes, ExtractGrokPatterns(body, "<%{NUMBER:syslog_priority}> %{NUMBER:syslog_version} %{TIMESTAMP_ISO8601:timestamp} %{WORD:hostname} %{WORD:appname} %{NUMBER:pid} %{WORD:log_id} - %{GREEDYDATA:message}", true)) where not IsMap(attributes)
From the YAML, you can see the logic applied by the processor:
These two statements ensure that attributes extracted from the log messages are properly incorporated into existing data:
- If attributes is a map, it is enhanced with the new data using the merge_maps function to ensure data integrity through “upsert” operations.
- If attributes is not a map, it is completely replaced by the new data map obtained from the log content, ensuring that the attributes are consistently structured following the Grok extraction.
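As a sketch of the upsert behavior, suppose the data item already carries an attribute before parsing (the env key below is illustrative). Starting from:

{"env": "prod"}

the merge adds the Grok fields alongside the existing key, and a Grok field with the same name as an existing key would overwrite its value:

{"env": "prod", "appname": "monitor", "hostname": "logserver01", "syslog_priority": "14", ...}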
In the output data item, the body has been parsed into attributes using the named captures:
{
  "_type": "log",
  "timestamp": 1750674290981,
  "body": "<14> 1 2025-06-23T10:24:44.849103Z logserver01 monitor 28968 ID167 - New entry",
  "resource": {
    ...
  },
  "attributes": {
    "appname": "monitor",
    "hostname": "logserver01",
    "log_id": "ID167",
    "message": "New entry",
    "pid": "28968",
    "syslog_priority": "14",
    "syslog_version": "1",
    "timestamp": "2025-06-23T10:24:44.849103Z"
  }
}
Options
Select a telemetry type
You can specify `log`, `metric`, `trace`, or `all`. It is specified using the interface, which generates a YAML list item for you under the `data_types` parameter. This defines the data item types against which the processor must operate. If `data_types` is not specified, the default value is `all`. It is optional.
It is defined in YAML as follows:
- name: multiprocessor
  type: sequence
  processors:
  - type: <processor type>
    data_types:
    - log
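If you select more than one telemetry type, each type is generated as its own list item. A sketch, assuming both logs and metrics are selected:

- name: multiprocessor
  type: sequence
  processors:
  - type: <processor type>
    data_types:
    - log
    - metric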
condition
The `condition` parameter contains a conditional phrase of an OTTL statement. It restricts operation of the processor to only data items where the condition is met. Data items that do not match the condition are passed on without processing. You configure it in the interface and an OTTL condition is generated. It is optional. You can select one of the following operators:
Operator | Name | Description | Example
---|---|---|---
`==` | Equal to | Returns true if both values are exactly the same | `attributes["status"] == "OK"`
`!=` | Not equal to | Returns true if the values are not the same | `attributes["level"] != "debug"`
`>` | Greater than | Returns true if the left value is greater than the right | `attributes["duration_ms"] > 1000`
`>=` | Greater than or equal | Returns true if the left value is greater than or equal to the right | `attributes["score"] >= 90`
`<` | Less than | Returns true if the left value is less than the right | `attributes["load"] < 0.75`
`<=` | Less than or equal | Returns true if the left value is less than or equal to the right | `attributes["retries"] <= 3`
`matches` | Regex match | Returns true if the string matches a regular expression | `isMatch(attributes["name"], ".*\\.name$")`
It is defined in YAML as follows:
- name: _multiprocessor
  type: sequence
  processors:
  - type: <processor type>
    condition: attributes["request"]["path"] == "/json/view"
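Conditions can also be combined with `and` and `or`, as in any OTTL conditional phrase. A sketch with illustrative attribute names:

- name: _multiprocessor
  type: sequence
  processors:
  - type: <processor type>
    condition: attributes["level"] != "debug" and attributes["status"] == "OK"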
OTTL Statement
Parse from
This option specifies the field containing the text that needs to be parsed. It is specified using bracket notation and is optional. If left empty, it defaults to `body`.
Assign to
Specify the field where you want the parsed object to be saved.
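A minimal sketch combining Parse from and Assign to, assuming a hypothetical attributes["raw"] source field, an attributes["parsed"] destination, and an abbreviated pattern (all three are illustrative):

- name: multiprocessor
  type: sequence
  processors:
  - type: ottl_transform
    data_types:
    - log
    statements: |-
      set(attributes["parsed"], ExtractGrokPatterns(attributes["raw"], "%{WORD:hostname} %{GREEDYDATA:message}", true))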
Grok Pattern
This option defines the log pattern that should be used to parse attributes. A Pattern or a Custom Pattern is required. Use the Knowledge Library to select a pattern, specify your own custom pattern, or use the AI assistant to generate a Grok pattern from a sample log.
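Custom patterns can mix built-in macros with inline named-capture regex, which is standard Grok syntax. A sketch for a hypothetical log line such as TX-4F3A2B accepted:

(?<transaction_id>TX-[0-9A-F]{6}) %{WORD:status}

This captures TX-4F3A2B as transaction_id and accepted as status.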
Final
The `final` parameter specifies whether successfully processed data items should continue to subsequent processors within the same multiprocessor node. Data items that fail to be processed by the processor will be passed to the next processor in the node regardless of this setting. You select the slider in the tool, which specifies it for you in the YAML as a Boolean. The default is `false` and it is optional.
It is defined in YAML as follows:
- name: multiprocessor
  type: sequence
  processors:
  - type: <processor type>
    final: true
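As a sketch of the effect, consider a simplified two-processor node (the abbreviated pattern and the attributes["unparsed"] flag are illustrative). Items that the first processor parses successfully stop there; items it cannot parse fall through to the second processor:

- name: multiprocessor
  type: sequence
  processors:
  - type: ottl_transform
    final: true
    data_types:
    - log
    statements: |-
      merge_maps(attributes, ExtractGrokPatterns(body, "%{WORD:hostname} %{GREEDYDATA:message}", true), "upsert") where IsMap(attributes)
  - type: ottl_transform
    data_types:
    - log
    statements: |-
      set(attributes["unparsed"], true)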
Keep original telemetry item
This option defines whether to keep the original, unmodified data item in addition to the processed output. For example, you can keep the original log as well as any metrics generated by an extract metric processor. If you select this option, your data volume will increase.