Syslog Pack
Edge Delta Pipeline Pack for Syslog
Overview
The Edge Delta Syslog pack processes syslog messages by normalizing whitespace, parsing timestamps, and enriching logs with metadata to facilitate monitoring, searching, and alerting. It also attempts to fill in missing metadata based on host information. You can configure the pack to process either RFC5424 logs (the default) or RFC3164 and Linux format logs.
Pack Description
1. Data Ingestion
The data flow starts with the Source as the entry point into the pack.
2. Replace #011 with Space
The first transformation step is handled by the Replace #011 with space node, a Mask node.
- name: 'Replace #011 with space'
  type: mask
  pattern: '#011'
  mask: ' '
This replaces each occurrence of the literal sequence #011, the octal escape that some syslog daemons emit in place of a tab character, with a space. This change helps standardize the log format.
3. Replace Multiple Whitespace with Space
The next node is Replace multiple whitespace with space, another Mask node.
- name: Replace multiple whitespace with space
  type: mask
  pattern: \s\s+
  mask: ' '
It condenses multiple whitespace characters into a single space. This results in cleaner logs.
4. Remove Leading Space
The third node, Remove leading space, is also a Mask node.
- name: Remove leading space
  type: mask
  pattern: ^\s+
  mask: ""
It removes any leading spaces from the log messages. This step ensures the messages are uniformly aligned.
From here, logs flow by default to the Parse RFC5424 format node. However, you can edit the pack so that logs flow to the Parse RFC3164 and Linux format node instead. That node is orphaned (not connected) in the default configuration.
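The three Mask nodes in steps 2 to 4 can be approximated outside the pipeline to see how they interact. The following is a minimal Python sketch, not the Edge Delta implementation: the function name `normalize_whitespace` and the sample line are hypothetical, and Python's `\s` may differ slightly from the regex engine the agent uses.

```python
import re


def normalize_whitespace(line: str) -> str:
    """Approximate the three Mask nodes, applied in pipeline order."""
    # Step 2: replace the literal "#011" sequence with a space.
    line = line.replace("#011", " ")
    # Step 3: collapse runs of two or more whitespace characters.
    line = re.sub(r"\s\s+", " ", line)
    # Step 4: strip whitespace at the start of the message.
    line = re.sub(r"^\s+", "", line)
    return line


# Hypothetical raw line with leading spaces, a "#011" escape, and extra spaces.
raw = "  <13>Oct  1 22:14:15 myhostname app:#011started   ok"
print(normalize_whitespace(raw))
```

The order matters: replacing #011 first means that any double spaces it creates are collapsed by the next step.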
5.1. Parse RFC5424 Format
If you keep the default pack configuration, logs flow from Remove Leading Space to this node, Parse RFC5424 format, a Grok node.
- name: Parse RFC5424 format
  type: grok
  pattern: <%{POSINT:pri}>%{POSINT:version}%{SPACE}%{TIMESTAMP_ISO8601:timestamp}%{SPACE}%{SYSLOGHOST:host}%{SPACE}%{SYSLOG5424PRINTASCII:appname}%{SPACE}%{SYSLOG5424PRINTASCII:procid}%{SPACE}%{SYSLOG5424PRINTASCII:msgid}%{SPACE}(?:-|(?<structuredData>(\[.*?[^\\]\])+))%{SPACE}%{GREEDYDATA:message}
It applies a Grok pattern to extract fields from logs formatted according to the RFC5424 standard. Parsing based on RFC5424 enables you to extract structured data from logs.
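To illustrate which fields the pattern yields, here is a rough Python equivalent. It is a sketch under assumptions: the Grok macros are replaced with simplified expressions (for example, `\S+` stands in for `TIMESTAMP_ISO8601` and `SYSLOGHOST`), so it accepts some inputs the real pattern would reject. The function name `parse_rfc5424` is hypothetical.

```python
import re
from typing import Optional

# Simplified stand-in for the Grok pattern; the named groups mirror the Grok field names.
RFC5424 = re.compile(
    r"<(?P<pri>[1-9][0-9]*)>(?P<version>[1-9][0-9]*)\s*"
    r"(?P<timestamp>\S+)\s*"
    r"(?P<host>\S+)\s*"
    r"(?P<appname>[!-~]+)\s*(?P<procid>[!-~]+)\s*(?P<msgid>[!-~]+)\s*"
    r"(?:-|(?P<structuredData>(?:\[.*?[^\\]\])+))\s*"
    r"(?P<message>.*)"
)


def parse_rfc5424(line: str) -> Optional[dict]:
    """Return the extracted fields, or None if the line does not match."""
    match = RFC5424.match(line)
    return match.groupdict() if match else None


# The sample input from this page.
sample = (
    '<165>1 2003-10-11T22:14:15.003Z myhostname myapp 1234 ID47 - '
    '[exampleSDID@32473 iut="3" eventSource="Application" eventID="1011"] '
    'An application event log entry...'
)
fields = parse_rfc5424(sample)
print(fields["host"], fields["appname"], fields["procid"], fields["msgid"])
```

In this sketch, a `-` in the structured data position leaves `structuredData` unset, and everything after it is captured in `message`.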
5.2. Extract Timestamp ISO8601
Logs on the RFC5424 path continue to the Extract Timestamp ISO8601 node, an OTTL Transform node.
- name: Extract Timestamp ISO8601
  type: ottl_transform
  statements: set(timestamp, UnixMilli(Time(attributes["timestamp"], "%Y-%m-%dT%H:%M:%SZ")))
This node converts the extracted timestamp into a standardized Unix Millisecond format as follows:
- set: This function assigns a value to the timestamp field.
- attributes["timestamp"]: This accesses the value stored in the attributes map under the key timestamp.
- Time: This function converts a string representation of a time to a time.Time object. The format used here is %Y-%m-%dT%H:%M:%SZ, which corresponds to the ISO 8601 format for representing date and time (e.g., 2023-01-01T00:00:00Z).
- UnixMilli: This function converts a time.Time object into Unix time in milliseconds.
This step is crucial for ensuring that log entries are accurately correlated across distributed systems. See Manage Log Timestamps with Edge Delta.
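The conversion can be reproduced with Python's standard library to check what value to expect. This is only an analogue of the OTTL statement, assuming the trailing Z means UTC; the function name `iso8601_to_unix_milli` is hypothetical.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def iso8601_to_unix_milli(text: str) -> int:
    """Parse a %Y-%m-%dT%H:%M:%SZ timestamp and return Unix milliseconds."""
    # The literal "Z" in the format carries no zone information for strptime,
    # so UTC is attached explicitly (an assumption of this sketch).
    parsed = datetime.strptime(text, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)
    return (parsed - EPOCH) // timedelta(milliseconds=1)


print(iso8601_to_unix_milli("2023-01-01T00:00:00Z"))  # 1672531200000
```

Python's strptime rejects a value with fractional seconds, such as the 2003-10-11T22:14:15.003Z timestamp in the sample input, under this format string. Whether the OTTL Time function behaves the same way is not confirmed by this page.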
6.1. Parse RFC3164 and Linux Format
If you edit the configuration for RFC3164 and Linux Format, logs flow from Remove Leading Space to this node, Parse RFC3164 and Linux format, a Grok node.
- name: Parse RFC3164 and Linux format
  type: grok
  pattern: (<%{POSINT:pri}>)?%{SPACE}%{SYSLOGTIMESTAMP:timestamp}%{SPACE}%{SYSLOGHOST:host}%{SPACE}%{DATA:appname}(\[%{POSINT:procid}\])?:%{GREEDYDATA:message}
It extracts information from older syslog formats and parses it into structured data. Supporting RFC3164 ensures backward compatibility with legacy systems.
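As a rough illustration of the fields this pattern extracts, the sketch below uses a simplified Python regular expression in place of the Grok macros. The substitutions (such as `\S+` for `SYSLOGHOST`), the function name `parse_rfc3164`, and the sample line are assumptions made for the example.

```python
import re
from typing import Optional

# Simplified stand-in for the Grok pattern; the priority and process ID are optional.
RFC3164 = re.compile(
    r"(?:<(?P<pri>[1-9][0-9]*)>)?\s*"
    r"(?P<timestamp>[A-Z][a-z]{2} +\d{1,2} \d{2}:\d{2}:\d{2})\s*"
    r"(?P<host>\S+)\s*"
    r"(?P<appname>.*?)(?:\[(?P<procid>[1-9][0-9]*)\])?:"
    r"(?P<message>.*)"
)


def parse_rfc3164(line: str) -> Optional[dict]:
    """Return the extracted fields, or None if the line does not match."""
    match = RFC3164.match(line)
    return match.groupdict() if match else None


# Hypothetical log line in the legacy format.
fields = parse_rfc3164("<13>Oct 11 22:14:15 myhostname sshd[1234]: Accepted publickey for alice")
print(fields["timestamp"], fields["host"], fields["appname"], fields["procid"])
```

Because the pattern places no separator between the colon and the message, the captured `message` in this sketch keeps the leading space that follows the colon.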
6.2. Extract Timestamp Syslog
Logs on the RFC3164 and Linux Format path continue to the Extract Timestamp Syslog node, another OTTL Transform node.
- name: Extract Timestamp Syslog
  type: ottl_transform
  statements: set(timestamp, UnixMilli(Time(attributes["timestamp"], "%b %d %H:%M:%S")))
This node converts syslog-format timestamps into Unix milliseconds as follows:
- set: This function assigns a value to a specified telemetry field. In this case, it sets the timestamp field.
- attributes["timestamp"]: This accesses the value in the attributes map under the key timestamp.
- Time: This function converts a string representation of time into a time.Time object. The format used here is %b %d %H:%M:%S, where:
  - %b is the abbreviated month name (e.g., Jan, Feb).
  - %d is the day of the month as a zero-padded number.
  - %H:%M:%S represents the hour (24-hour format), minute, and second, respectively.
- UnixMilli: This function converts the time.Time object into Unix time in milliseconds.
Utilizing a consistent timestamp format is vital for reliable event sorting and analysis. See Manage Log Timestamps with Edge Delta.
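A Python analogue of this conversion is sketched below. The syslog timestamp carries neither a year nor a time zone, so the sketch takes the year as a parameter and assumes UTC. Both choices are assumptions of the example, and this page does not state how the OTTL Time function fills in the missing parts. The function name `syslog_to_unix_milli` is hypothetical.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def syslog_to_unix_milli(text: str, assumed_year: int) -> int:
    """Parse a '%b %d %H:%M:%S' timestamp and return Unix milliseconds."""
    # Prepend the caller-supplied year so the parsed date is unambiguous.
    parsed = datetime.strptime(f"{assumed_year} {text}", "%Y %b %d %H:%M:%S")
    # Assume UTC, since the syslog timestamp has no zone of its own.
    parsed = parsed.replace(tzinfo=timezone.utc)
    return (parsed - EPOCH) // timedelta(milliseconds=1)


print(syslog_to_unix_milli("Oct 11 22:14:15", assumed_year=2003))
```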
7. Lookup by Host
All logs, whether they flow on the RFC5424 path or the RFC3164 and Linux Format path, are routed to Lookup by Host, a Lookup node. When you add this pack, a lookup table is automatically added to your lookup library in Edge Delta.
| host | index | sourcetype | source |
|---|---|---|---|
| myhostname | index-k | linux-syslog | linux |
You can edit the table to add your environment data.
- name: Lookup by host
  type: lookup
  location_path: ed://syslog_lookup.csv
  reload_period: 5m0s
  match_mode: exact
  regex_option: first
  key_fields:
  - event_field: item["attributes"]["host"]
    lookup_field: host
  out_fields:
  - event_field: item["attributes"]["index"]
    lookup_field: index
  - event_field: item["attributes"]["sourcetype"]
    lookup_field: sourcetype
  - event_field: item["attributes"]["source"]
    lookup_field: source
It enriches logs by matching the host to external data sources, filling fields like index, sourcetype, and source. The enrichment is valuable for categorizing logs and speeding up query time by allowing filters on frequently indexed fields.
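The exact-match enrichment can be pictured as a dictionary lookup keyed on the host column. The following Python sketch assumes the lookup file is a CSV with the four columns shown in the table above; the function names and the in-memory item layout are hypothetical stand-ins for the node's behavior.

```python
import csv
import io

# Assumed CSV form of the lookup table shown above.
LOOKUP_CSV = """host,index,sourcetype,source
myhostname,index-k,linux-syslog,linux
"""


def load_lookup(text: str) -> dict:
    """Index the lookup rows by their host column."""
    return {row["host"]: row for row in csv.DictReader(io.StringIO(text))}


def lookup_by_host(item: dict, table: dict) -> dict:
    """Copy index, sourcetype, and source onto the item when the host matches exactly."""
    row = table.get(item["attributes"].get("host"))
    if row is not None:
        for field in ("index", "sourcetype", "source"):
            item["attributes"][field] = row[field]
    return item


table = load_lookup(LOOKUP_CSV)
item = {"attributes": {"host": "myhostname"}, "resource": {}}
print(lookup_by_host(item, table)["attributes"])
```

An item whose host is not in the table passes through unchanged, which is the case the later steps are designed to handle.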
8. Lookup by Source Host
The Lookup by Source Host node uses a similar Lookup operation.
- name: Lookup by source host
  type: lookup
  location_path: ed://syslog_lookup.csv
  reload_period: 5m0s
  match_mode: exact
  regex_option: first
  key_fields:
  - event_field: item["resource"]["host.ip"]
    lookup_field: host
  out_fields:
  - event_field: item["attributes"]["source_host"]["source"]
    lookup_field: source
  - event_field: item["attributes"]["source_host"]["sourcetype"]
    lookup_field: sourcetype
  - event_field: item["attributes"]["source_host"]["index"]
    lookup_field: index
It attempts to find metadata based on the host.ip resource, rather than the host attribute. This redundancy ensures that if metadata is missing, the system has an alternate path to potentially fill it.
9. Fill Missing Metadata
Finally, logs are processed by the Fill Missing Metadata node, an OTTL Transform node.
- name: Fill missing metadata
  type: ottl_transform
  statements: |-
    set(attributes["source"], attributes["source_host"]["source"]) where attributes["source"] == nil or attributes["source"] == ""
    set(attributes["source"], resource["host.ip"]) where attributes["source"] == nil or attributes["source"] == ""
    set(attributes["index"], attributes["source_host"]["index"]) where attributes["index"] == nil or attributes["index"] == ""
    set(attributes["index"], "syslog") where attributes["index"] == nil or attributes["index"] == ""
    set(attributes["sourcetype"], attributes["source_host"]["sourcetype"]) where attributes["sourcetype"] == nil or attributes["sourcetype"] == ""
    set(attributes["sourcetype"], "syslog") where attributes["sourcetype"] == nil or attributes["sourcetype"] == ""
    delete_key(attributes, "source_host")
This node fills in any remaining gaps with default values, ensuring all logs have the necessary metadata for efficient querying and storage. The statements work as follows:
- Sets the attributes["source"] field to the value of attributes["source_host"]["source"] if attributes["source"] is either nil (not set) or an empty string.
- Sets the attributes["source"] field to resource["host.ip"] if attributes["source"] is nil or an empty string.
- Sets attributes["index"] to attributes["source_host"]["index"] if attributes["index"] is nil or an empty string.
- Sets attributes["index"] to the literal string syslog if attributes["index"] is nil or an empty string.
- Sets attributes["sourcetype"] to attributes["source_host"]["sourcetype"] if attributes["sourcetype"] is nil or an empty string.
- Sets attributes["sourcetype"] to syslog if attributes["sourcetype"] is nil or an empty string.
- Removes the source_host key from the attributes map.
This ensures that logs have a baseline of metadata for later analysis or alerting.
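The fallback order can be traced with a small Python sketch. It models the item as nested dictionaries and treats a missing candidate value as a no-op, which is an assumption about how the OTTL set function handles nil. The function names are hypothetical.

```python
def is_blank(value) -> bool:
    """Mirror the 'nil or empty string' condition used in the where clauses."""
    return value is None or value == ""


def fill_missing_metadata(item: dict) -> dict:
    attrs = item["attributes"]
    source_host = attrs.get("source_host", {})
    # Candidates are tried in the same order as the statements in the node.
    fallbacks = {
        "source": [source_host.get("source"), item["resource"].get("host.ip")],
        "index": [source_host.get("index"), "syslog"],
        "sourcetype": [source_host.get("sourcetype"), "syslog"],
    }
    for key, candidates in fallbacks.items():
        for candidate in candidates:
            # Only fill a field that is still blank; skip absent candidates.
            if is_blank(attrs.get(key)) and candidate is not None:
                attrs[key] = candidate
    # Equivalent of delete_key(attributes, "source_host").
    attrs.pop("source_host", None)
    return item


# Hypothetical item for which neither lookup found a match.
item = {"attributes": {"host": "unknown"}, "resource": {"host.ip": "10.0.0.5"}}
print(fill_missing_metadata(item)["attributes"])
```

Values already set by the Lookup by Host node are never overwritten, because each assignment runs only while the target field is still blank.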
10. Processed Output
The processed logs are finally routed to the Processed compound output for downstream handling in the pipeline.
Sample Input
<165>1 2003-10-11T22:14:15.003Z myhostname myapp 1234 ID47 - [exampleSDID@32473 iut="3" eventSource="Application" eventID="1011"] An application event log entry...