google-fluentd : change severity in Cloud Logging log_level


We are running Spark jobs (many of them Spark Streaming) on Google Cloud Dataproc clusters.

We use Cloud Logging to collect all the logs generated by the Spark jobs. Currently the jobs generate a lot of "INFO" messages, which pushes the total log volume to a few TB.

I want to edit the google-fluentd config to restrict the log level to "ERROR" level instead of "INFO".

I tried setting "log_level error" in the config, but it did not work. A comment in /etc/google-fluentd/google-fluentd.conf also says: # Currently severity is a separate field from the Cloud Logging log_level.

# Fluentd config to tail the hadoop, hive, and spark message log.
# Currently severity is a separate field from the Cloud Logging log_level.
<source>
    type tail
    format multi_format
    <pattern>
        format /^((?<time>[^ ]* [^ ]*) *(?<severity>[^ ]*) *(?<class>[^ ]*): (?<message>.*))/
        time_format %Y-%m-%d %H:%M:%S,%L
    </pattern>
    <pattern>
        format none
    </pattern>
    path /var/log/hadoop*/*.log,/var/log/hadoop-yarn/userlogs/**/stderr,/var/log/hive/*.log,/var/log/spark/*.log,
    pos_file /var/tmp/fluentd.dataproc.hadoop.pos
    refresh_interval 2s
    read_from_head true
    tag raw.tail.*
</source>

There is 1 answer

John Mikula

Correct. As the comment states, @log_level and severity are not the same, which is confusing at best. @log_level configures the verbosity for the logger of the component, whereas severity is the field that Stackdriver Logging ingests.
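
To illustrate the distinction, here is a minimal sketch of what a log_level setting actually controls. It assumes the standard Fluentd <system> directive is available in your google-fluentd version. It only quiets Fluentd's own diagnostic output and has no effect on which Spark records are forwarded, which would explain why setting "log_level error" appeared to do nothing.

```
# Sketch only: this controls how much Fluentd itself logs
# (to its own log file), not the severity of the job logs it ships.
<system>
    log_level error
</system>
```

The per-plugin form, @log_level, behaves the same way but is scoped to a single <source>, <filter>, or <match> block.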

In order to make fluentd exclude any severity below ERROR you can add a grep filter to /etc/google-fluentd/google-fluentd.conf that explicitly excludes these by name.

At some point before the <match **> block add the following:

<filter raw.tail.**>
    @type grep
    exclude1 severity (DEBUG|INFO|NOTICE|WARNING)
</filter>

This checks each record's severity field and rejects the record if the value matches the regex.
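
If your google-fluentd build is based on Fluentd v1, where the numbered exclude1 parameter is deprecated, the same filter can probably be written with an <exclude> section. This is a sketch under two assumptions that are not in the original answer: that your agent accepts the section syntax, and that WARN should be listed as well, since log4j-style output typically prints WARN rather than WARNING. Check the severity values that actually appear in your logs before relying on it.

```
<filter raw.tail.**>
    @type grep
    <exclude>
        # Field produced by the (?<severity>...) capture in the tail source.
        key severity
        # Anchored so only whole severity names match.
        # WARN is an assumption added for log4j-style output.
        pattern /^(DEBUG|INFO|NOTICE|WARN|WARNING)$/
    </exclude>
</filter>
```

After editing the file, restart the agent (for example with sudo service google-fluentd restart) so the new filter is loaded.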