ParseXML

Learn about the ParseXML OTTL converter function.

The ParseXML converter parses an XML string into structured data. Unlike ParseSimplifiedXML, it preserves the full structure of the XML document, including attributes, content within tags, and the hierarchy of elements.

Syntax: ParseXML(value)

  • value: the location of the XML string to parse, given in bracket notation (for example, attributes["decoded_body"])

Input

{
	"_type": "log",
	"timestamp": 1734571176415,
	"body": "<log type=\"access\"><!-- Log entry for a web request --><details><host>172.17.15.39</host><userIdentifier>68b148de-7ce3-423c-b72d-64a4f21ecfc0</userIdentifier><timeLocal>2024-12-15T22:40:53.723160Z</timeLocal></details><requestInfo><method>POST</method><request>/styles/main.css</request><protocol>HTTP/2</protocol></requestInfo><response><status>403</status><bytesSent>1043</bytesSent></response><message>This is a sample log entry</message></log>",
	"resource": {...},
	"attributes": {
		"decoded_body": "<log type=\"access\"><!-- Log entry for a web request --><details><host>172.17.15.39</host><userIdentifier>68b148de-7ce3-423c-b72d-64a4f21ecfc0</userIdentifier><timeLocal>2024-12-15T22:40:53.723160Z</timeLocal></details><requestInfo><method>POST</method><request>/styles/main.css</request><protocol>HTTP/2</protocol></requestInfo><response><status>403</status><bytesSent>1043</bytesSent></response><message>This is a sample log entry</message></log>"
	}
}

Statement

set(attributes["map"], ParseXML(attributes["decoded_body"]))

Output

{
	"_type": "log",
	"timestamp": 1734571204472,
	"body": "<log type=\"access\"><!-- Log entry for a web request --><details><host>172.17.15.39</host><userIdentifier>68b148de-7ce3-423c-b72d-64a4f21ecfc0</userIdentifier><timeLocal>2024-12-15T22:40:53.723160Z</timeLocal></details><requestInfo><method>POST</method><request>/styles/main.css</request><protocol>HTTP/2</protocol></requestInfo><response><status>403</status><bytesSent>1043</bytesSent></response><message>This is a sample log entry</message></log>",
	"resource": {...},
	"attributes": {
		"decoded_body": "<log type=\"access\"><!-- Log entry for a web request --><details><host>172.17.15.39</host><userIdentifier>68b148de-7ce3-423c-b72d-64a4f21ecfc0</userIdentifier><timeLocal>2024-12-15T22:40:53.723160Z</timeLocal></details><requestInfo><method>POST</method><request>/styles/main.css</request><protocol>HTTP/2</protocol></requestInfo><response><status>403</status><bytesSent>1043</bytesSent></response><message>This is a sample log entry</message></log>",
		"map": {
			"attributes": {
				"type": "access"
			},
			"children": [
				{
					"children": [
						{
							"content": "172.17.15.39",
							"tag": "host"
						},
						{
							"content": "68b148de-7ce3-423c-b72d-64a4f21ecfc0",
							"tag": "userIdentifier"
						},
						{
							"content": "2024-12-15T22:40:53.723160Z",
							"tag": "timeLocal"
						}
					],
					"tag": "details"
				},
				{
					"children": [
						{
							"content": "POST",
							"tag": "method"
						},
						{
							"content": "/styles/main.css",
							"tag": "request"
						},
						{
							"content": "HTTP/2",
							"tag": "protocol"
						}
					],
					"tag": "requestInfo"
				},
				{
					"children": [
						{
							"content": "403",
							"tag": "status"
						},
						{
							"content": "1043",
							"tag": "bytesSent"
						}
					],
					"tag": "response"
				},
				{
					"content": "This is a sample log entry",
					"tag": "message"
				}
			],
			"tag": "log"
		}
	}
}

The ParseXML function parses decoded_body and maps its structure into a JSON-compatible hierarchical object. Each element becomes an object identified by its tag key. Attributes, such as type="access", are stored in an attributes map on the corresponding element object. Text within an element, like "172.17.15.39" in <host>, is placed under that object's content key. Immediate child elements are listed under children, so the nesting of <details>, <requestInfo>, and <response> is preserved as nested children arrays. Comments, such as "Log entry for a web request", are ignored.
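
To make the shape of the result concrete, the following Python sketch builds a similar tag/attributes/content/children structure with the standard library and then reads one value back out of it. It is an illustration only, not the ParseXML implementation: the helper names element_to_map and find_child are hypothetical, the XML string is a shortened version of the sample above, and details such as whitespace handling or mixed content may differ from what ParseXML does.

```python
import xml.etree.ElementTree as ET


def element_to_map(elem):
    """Convert an Element into a dict shaped like the output shown above.

    Hypothetical helper for illustration; not the OTTL implementation.
    """
    node = {"tag": elem.tag}
    if elem.attrib:
        node["attributes"] = dict(elem.attrib)
    # Assumption: only non-empty text is kept, under "content".
    text = (elem.text or "").strip()
    if text:
        node["content"] = text
    # The default ElementTree parser drops comments, so only
    # child elements appear here.
    children = [element_to_map(child) for child in elem]
    if children:
        node["children"] = children
    return node


def find_child(node, tag):
    """Return the first immediate child with the given tag, or None."""
    for child in node.get("children", []):
        if child["tag"] == tag:
            return child
    return None


# Shortened version of the sample document used on this page.
xml_body = (
    '<log type="access"><!-- Log entry for a web request -->'
    "<details><host>172.17.15.39</host></details>"
    "<response><status>403</status></response>"
    "<message>This is a sample log entry</message></log>"
)

parsed = element_to_map(ET.fromstring(xml_body))

# Walk the children arrays by tag to reach <response><status>.
status = find_child(find_child(parsed, "response"), "status")["content"]

print(parsed["attributes"]["type"])
print(status)
```

Because children is an array rather than a map keyed by tag, looking up a specific element means scanning for a matching tag value, as find_child does here. That is the cost of preserving element order and repeated tags.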