Manage Log Timestamps with Edge Delta

Understand and manage timestamps using Edge Delta.

Overview

Log timestamps are generated by workloads at the moment the log is emitted. This timestamp is reflected in the item[“body”] of the log. However, the item[“timestamp”] attribute is created at the moment the log is ingested by the Edge Delta agent. This can lead to a slight difference between the two timestamps. There are many factors that might affect the difference between the emitted and the ingested timestamps. Here are just two examples:

  • Network Latency and Log Batching Strategies: When logs are transported across a network, especially in distributed systems, network latency can play a significant role. Moreover, if log batching is employed to reduce the number of network requests, this may introduce additional wait times before logs are sent to the Edge Delta agent.

  • Serverless Architectures like AWS Lambda: When using serverless functions such as AWS Lambda, logs may not be emitted immediately. Instead, they are typically batched and processed at the end of the function execution, which can cause a delay in ingestion by the Edge Delta agent.

Consider the following log:

{"timestamp": "2024-04-24T07:06:48.675031Z", "node_id": "node6", "log_level": "INFO", "message": "Scheduled maintenance completed without errors", "source_ip": "REDACTED", "event": "request_received", "service": "api-service"}

This log was created by the workload at 07:06:48 UTC. However, when viewing the same log in Log Search, the following discrepancy can be seen because item["timestamp"] is used by Log Search:

Here is the log selected:

Bear in mind, local time in this example is UTC+2. Nevertheless, the timestamps shows a time difference of 0.347969 seconds. While this may seem like a trivial difference, it could result in logs being viewed out of order in Log Search when sorted by timestamp (UTC+2). This could make it more difficult to determine the root cause of issues highlighted by the logs.

If this is an issue for your environment, consider transforming the item[“timestamp”] to match the item[“body”] timestamp using a Log Transform node, which is configured using CEL macros.

Process Overview

To convert the item[“timestamp”] to the timestamp reflected in item[“body”]:

  1. Evaluate item[“body”] and its timestamp
  2. Write a regex capture CEL macro to capture the item[“body”] timestamp
  3. If necessary, write a CEL macro to convert the item[“body”] timestamp to Unix Milli.
  4. Configure a Log Transform node.

1. Evaluate the Log

Consider the following log:

clickhouse {2024-04-23T18:04:56.630Z} [ {3955} ] {{123456789-987654321}} <{INFO}> ({server}( {demo-application})?|{clickhouse-server}): {Query execution completed}

Note the location of the timestamp, which is important when creating the regex capture, and also its format is ISO 8601, indicating it will need to be converted.

2. Capture the Timestamp

The Log Transform node is configured using CEL macros to identify the value of an upserted field.

A regex_capture CEL macro can be used to capture the timestamp. As a reminder, this is the required syntax:

regex_capture(input string, regexWithCaptureGroups string)

The following macro will capture the timestamp:

regex_capture(item["body"],"clickhouse {(?P<match>.*?)}.*").match
  • item["body"] contains the log string.
  • clickhouse { matches a literal string.
  • (?P<match>.*?) defines the named capture group.
  • } matches the closing curly brace } that follows the timestamp.
  • .* matches any characters following the closing brace till the end of the line.
  • .match is how you access the value of the named capture group “match” in the map returned by the regex_capture function. This is similar to how you access entries in a dictionary or a map by their keys.

The captured timestamp is 2024-04-23T18:04:56.630Z and it is assigned to a capture group “match”.

LLMs can assist in designing CEL macros or regex patterns.

3. Convert the Timestamp

The item[“timestamp”] attribute is specified in Unix Milli so the captured timestamp from the previous step will also need to be converted using the convert_timestamp CEL macro.

As a reminder, this is the required syntax:

convert_timestamp(input location string, input format string, output format string)
  • input location string: This is the location of the input timestamp, in this instance it will be the regex_capture designed in the previous step.
  • input format string: This is an example of the format of the current timestamp. You copy one of the listed examples that matches the input log’s timestamp format.
  • output format string: This defines the required format of the output time.

The following macro will convert the timestamp:

convert_timestamp(regex_capture(item["body"],"clickhouse {(?P<match>.*?)}.*").match, "2006-01-02T15:04:05.000Z", "Unix Milli")

Note how the input location is the regex_capture, the input format is a copy of the sample log’s timestamp, and the format is specified as “Unix Milli”.

4. Configure a Log Transform node

Configure a log transform node as follows:

  1. Click Pipelines.
  2. Select the pipeline in which you want to convert timestamps.
  3. Click Edit Mode.
  4. Click Add Processor.
  5. Expand Transformations and select Log Transform.
  6. Specify a name such as timestamp and click Add New.
  7. Enter item["timestamp"] in the Field Path.
  8. Select Upsert for the Operation.
  9. Enter the convert_timestamp macro for the Value field:
convert_timestamp(regex_capture(item["body"],"clickhouse {(?P<match>.*?)}.*").match, "2006-01-02T15:04:05.000ZZ", "Unix Milli")
  1. Click OK.
  2. Connect the Log Transform node to the pipeline.
  3. Click Review Changes.
  4. Click Deploy Changes.

The item["timestamp"] time will now match the item["body"] time, enabling you to search for and sort by the timestamp at which the log event was emitted from its data source.