I am using rsyslog for distributed log collection. To my understanding, the flow of logs on Debian starts at the journal, goes to the syslog socket, and is then picked up by the rsyslog clients. In my case, the rsyslog clients post-process the collected logs by adding extra information such as time-generated, priority, hostname, etc. There are two /etc/rsyslog.d/*.conf files that are used on every node:
- For capturing local logs and showing on terminal (forwarding to spec)
- For forwarding logs to other nodes
Lately, I am running into the following issue:
- All nodes have the same configuration, so the forwarding node adds the post-processing information to the logs. When the logs are received, the local configuration adds the post-processing information again before sending them to the terminal (so it appears twice). Certain information, like time-generated, doesn't get duplicated, but information like hostname is printed twice. I have to post-process at both points because the log server may be external or internal, and forwarded and local logs should look the same.
- Is this the best way to handle distributed log collection?
- How can I avoid duplicate post processing strings?
- Why is time-generated not getting duplicated?
Any pointer would really help, thanks!
Specs:
- Nodes: Debian Jessie, systemd 215, rsyslog 8.3.3
- Server: same as above, or external rsyslog servers
After debugging and digging into the documentation, here are my observations:
- The rsyslog parser does its best to identify the log format, especially the header component.
- %time-generated%, if identified correctly, will not be duplicated at the relay/collector.
- With the template <%PRI%>1 %TIMESTAMP:::date-rfc3339% %HOSTNAME% %APP-NAME% %PROCID% %MSGID% %STRUCTURED-DATA% %msg%\n (RSYSLOG_SyslogProtocol23Format), everything seems to be fine now.

Hope this answers everything. Please feel free to discuss this question further :-) A very good article that helped me understand the nature of the rsyslog parser is this
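
As a rough sketch of the fix described above, each node could forward with the built-in RSYSLOG_SyslogProtocol23Format template so that the receiving parser can recognize the header. The collector address 192.0.2.10, port 514, and the TCP transport are placeholders for illustration, not values taken from the setup in the question:

```
# Sender: forward everything using the built-in RFC 5424-style template.
# 192.0.2.10 and port 514 are placeholder values (assumptions).
*.* action(type="omfwd" target="192.0.2.10" port="514" protocol="tcp"
           template="RSYSLOG_SyslogProtocol23Format")

# Receiver: accept syslog over TCP on the same (assumed) port.
module(load="imtcp")
input(type="imtcp" port="514")
```

When the header is parsed into properties such as %HOSTNAME% on the receiver, a local output template that prints %HOSTNAME% should emit it once instead of once per hop.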