Edge Delta File Source
Overview
The File source node captures log input from specific files. It is useful for system logs that are written to flat files on disk. The agent sends each message into the pipeline as a log as it arrives in the file. The node is also a valuable tool for testing and troubleshooting.
For a detailed walkthrough, see the Ingest Logs from a File page.
- outgoing_data_types: log
Example Configuration
In this example the file_input node monitors one particular file. You can use wildcards to tail multiple files. If you use wildcards you can exclude specific files.
nodes:
- name: my_file_input
  type: file_input
  path: "/mnt/inputfile/logs/inputfile.log"
Bear in mind that you need to enable access for the agent to this location by configuring the path in the agent manifest as a volume and a volumemount. See here for details on adding a volume and a volumemount.
Required Parameters
name
A descriptive name for the node. This is the name that appears in Visual Pipelines, and you can use it to reference the node in the YAML. It must be unique across all nodes. It is a YAML list element, so it begins with a - and a space followed by the string. It is a required parameter for all nodes.
nodes:
- name: <node name>
  type: <node type>
type: file_input
The type parameter specifies the type of node being configured. It is specified as a string from a closed list of node types. It is a required parameter.
nodes:
- name: <node name>
  type: <node type>
path
The path parameter specifies the file paths to tail for log messages. It is a string. A path is required. Wildcards are supported:
- /etc/systemd/system/billingservice/*.log includes all .log files in the billingservice folder.
- /etc/systemd/system/billingservice/**/*.log includes all .log files in the billingservice folder and any sub-folders.
nodes:
- name: <node name>
  type: file_input
  path: "/var/logs/app/*.log"
Optional Parameters
add_ingestion_time
The add_ingestion_time parameter specifies whether to ingest the timestamp. The input must be JSON. It is specified with a Boolean, the default is false, and it is optional.
nodes:
- name: my_file_input
  type: file_input
  path: "/var/logs/app/*.log"
  add_ingestion_time: true
auto_detect_line_pattern
The auto_detect_line_pattern parameter is a Boolean value that, when set to true, enables automatic detection of line patterns in logs. The agent then determines the structure of log lines automatically, rather than relying on pre-defined patterns.
nodes:
- name: my_file_input
  type: file_input
  path: "/var/logs/app/*.log"
  auto_detect_line_pattern: true
The auto_detect_line_pattern parameter is optional and defaults to false if not specified.
boost_stacktrace_detection
The boost_stacktrace_detection parameter is used with auto_detect_line_pattern: true. It enables stack trace detection based on the Ragel FSM Based Lexical Recognition process. Detected stack traces are grouped together in the same log message. It is specified with a Boolean, the default is false, and it is optional.
nodes:
- name: my_file_input
  type: file_input
  path: "/var/logs/app/*.log"
  auto_detect_line_pattern: true
  boost_stacktrace_detection: true
docker_mode
The docker_mode parameter specifies whether to extract the log field as a single input item from container logs in JSON format. It is specified with a Boolean, the default is false, and it is optional.
nodes:
- name: my_file_input
  type: file_input
  path: "/var/logs/app/*.log"
  docker_mode: true
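To make "extracting the log field" concrete, the following is a minimal Go sketch of the idea, not the agent's actual implementation. It assumes container log lines in the format written by Docker's json-file logging driver (a JSON object with log, stream, and time fields). The helper name extractLogField is hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// dockerLine mirrors the fields written by Docker's json-file logging driver.
// Assumption: this is the container log format the node reads in docker_mode.
type dockerLine struct {
	Log    string `json:"log"`
	Stream string `json:"stream"`
	Time   string `json:"time"`
}

// extractLogField (hypothetical helper) returns only the "log" field of one
// container log line, or an error if the line is not valid JSON.
func extractLogField(raw string) (string, error) {
	var l dockerLine
	if err := json.Unmarshal([]byte(raw), &l); err != nil {
		return "", err
	}
	return l.Log, nil
}

func main() {
	raw := `{"log":"GET /health 200\n","stream":"stdout","time":"2024-01-01T00:00:00Z"}`
	msg, err := extractLogField(raw)
	if err != nil {
		fmt.Println("not a JSON container log line:", err)
		return
	}
	// Only the application message remains; the JSON wrapper is dropped.
	fmt.Printf("%q\n", msg)
}
```

With docker_mode left at false, the whole JSON line would presumably be treated as the log message instead.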
enable_persisting_cursor
The enable_persisting_cursor parameter enables a persisting cursor for use in the event of an agent restart. It is specified as a Boolean, the default is false, and it is optional.
nodes:
- name: <node name>
  type: file_input
  path: "/var/logs/app/*.log"
  enable_persisting_cursor: true
exclude
The exclude parameter specifies a Golang regex pattern to match file paths to exclude from tailing for log messages. These are files that might otherwise be tailed because they are matched by the path parameter. The exclude parameter is optional.
nodes:
- name: my_file_input
  type: file_input
  path: "/var/logs/app/*.log"
  exclude: /var/logs/app/test.*\.log
In this example:
- /var/logs/app/ specifically matches this directory path.
- test matches files starting with the word “test”.
- .* matches any characters (0 or more) after “test”.
- \.log matches the file extension .log (the backslash escapes the dot because a dot has a special meaning in regex, which is to match any single character).
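Because exclude takes a Golang regex, the pattern can be checked against candidate paths with Go's standard regexp package before it is deployed. The sketch below is illustrative only: the helper isExcluded is hypothetical, and it assumes an unanchored match, since the source does not state whether the agent anchors the pattern.

```go
package main

import (
	"fmt"
	"regexp"
)

// excludePattern is the same expression used in the exclude example.
var excludePattern = regexp.MustCompile(`/var/logs/app/test.*\.log`)

// isExcluded (hypothetical helper) reports whether a path matches the pattern.
// Assumption: an unanchored match, as performed by MatchString.
func isExcluded(path string) bool {
	return excludePattern.MatchString(path)
}

func main() {
	paths := []string{
		"/var/logs/app/test.log",      // "test" followed directly by ".log"
		"/var/logs/app/test-2024.log", // ".*" absorbs "-2024"
		"/var/logs/app/billing.log",   // does not contain "test"
		"/var/logs/app/testlog",       // no literal dot before "log"
	}
	for _, p := range paths {
		fmt.Printf("%-30s excluded=%v\n", p, isExcluded(p))
	}
}
```

The last path shows why the dot is escaped: without a literal "." before "log", the pattern does not match.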
line_pattern
The line_pattern parameter specifies a Golang regex pattern that marks the start of each log message, so the pattern is used as the line break instead of a new line. The pattern should include a ^ for the line start. It is optional.
nodes:
- name: my_file_input
  type: file_input
  path: "/var/logs/app/*.log"
  line_pattern: ^\d{4}-\d{2}-\d{2}
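The effect of a line_pattern such as the one above can be sketched in Go. This is an illustration under stated assumptions, not the agent's implementation: the helper groupLines is hypothetical, and it assumes that a line matching the pattern starts a new message while non-matching lines are appended to the previous one.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// linePattern is the same expression used in the line_pattern example.
var linePattern = regexp.MustCompile(`^\d{4}-\d{2}-\d{2}`)

// groupLines (hypothetical helper) starts a new message at each line that
// matches linePattern and appends non-matching lines to the current message.
func groupLines(input string) []string {
	var messages []string
	for _, line := range strings.Split(input, "\n") {
		if linePattern.MatchString(line) || len(messages) == 0 {
			messages = append(messages, line)
			continue
		}
		messages[len(messages)-1] += "\n" + line
	}
	return messages
}

func main() {
	input := "2024-01-01 ERROR failed\n  at main.go:10\n  at run.go:5\n2024-01-01 INFO recovered"
	for i, m := range groupLines(input) {
		fmt.Printf("message %d: %q\n", i, m)
	}
}
```

Here the two indented stack-trace lines stay attached to the ERROR line, so four physical lines become two log messages.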
sampling
The sampling parameter specifies the sampling rate for the input payloads. It is specified with a Boolean, with false being 0.0 and true being 1.0. The default is false, and it is optional.
nodes:
- name: my_file_input
  type: file_input
  path: "/var/logs/app/*.log"
  sampling: true
separate_source
The separate_source parameter allows the separation of log sources for files located in the same directory path. It is specified with a Boolean, the default is false, and it is optional.
nodes:
- name: my_file_input
  type: file_input
  path: "/var/logs/app/*.log"
  separate_source: true
skip_ingestion_time_on_failure
The skip_ingestion_time_on_failure parameter skips ingestion of the timestamp when the input is broken or in an invalid format. It is used with add_ingestion_time. It is specified with a Boolean, the default is false, and it is optional.
nodes:
- name: my_file_input
  type: file_input
  path: "/var/logs/app/*.log"
  add_ingestion_time: true
  skip_ingestion_time_on_failure: true