Logstash stopped processing because of an error: (SystemExit) exit

We are trying to index Nginx access and error logs separately in Elasticsearch. For that we have created the Filebeat and Logstash configs below.

Below is our /etc/filebeat/filebeat.yml configuration

filebeat.inputs:
- type: log
  paths:
    - /var/log/nginx/*access*.log
  exclude_files: ['\.gz$']
  exclude_lines: ['ELB-HealthChecker']
  fields:
    log_type: type1 
- type: log
  paths:
    - /var/log/nginx/*error*.log
  exclude_files: ['\.gz$']
  exclude_lines: ['ELB-HealthChecker']
  fields:
    log_type: type2

output.logstash:
  hosts: ["10.227.XXX.XXX:5400"]

Our Logstash config file /etc/logstash/conf.d/logstash-nginx-es.conf is as below:

input {
    beats {
        port => 5400
    }
}

filter {
  if ([fields][log_type] == "type1") {
    grok {
      match => [ "message" , "%{NGINXACCESS}+%{GREEDYDATA:extra_fields}"]
      overwrite => [ "message" ]
    }
    mutate {
      convert => ["response", "integer"]
      convert => ["bytes", "integer"]
      convert => ["responsetime", "float"]
    }
    geoip {
      source => "clientip"
      target => "geoip"
      add_tag => [ "nginx-geoip" ]
    }
    date {
      match => [ "timestamp" , "dd/MMM/YYYY:HH:mm:ss Z" ]
      remove_field => [ "timestamp" ]
    }
    useragent {
      source => "user_agent"
    }
  } else {
      grok {
        match => [ "message" , "(?<timestamp>%{YEAR}[./]%{MONTHNUM}[./]%{MONTHDAY} %{TIME}) \[%{LOGLEVEL:severity}\] %{POSINT:pid}#%{NUMBER:threadid}\: \*%{NUMBER:connectionid} %{GREEDYDATA:message}, client: %{IP:client}, server: %{GREEDYDATA:server}, request: "(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion}))"(, upstream: "%{GREEDYDATA:upstream}")?, host: "%{DATA:host}"(, referrer: "%{GREEDYDATA:referrer}")?"]
        overwrite => [ "message" ]
      }
      mutate {
        convert => ["response", "integer"]
        convert => ["bytes", "integer"]
        convert => ["responsetime", "float"]
      }
      geoip {
        source => "clientip"
        target => "geoip"
        add_tag => [ "nginx-geoip" ]
      }
      date {
        match => [ "timestamp" , "dd/MMM/YYYY:HH:mm:ss Z" ]
        remove_field => [ "timestamp" ]
      }
      useragent {
        source => "user_agent"
      }
    }
}

output {
  if ([fields][log_type] == "type1") {
    amazon_es {
      hosts => ["vpc-XXXX-XXXX.ap-southeast-1.es.amazonaws.com"]
      region => "ap-southeast-1"
      aws_access_key_id => 'XXXX'
      aws_secret_access_key => 'XXXX'
      index => "nginx-access-logs-%{+YYYY.MM.dd}"
    }
  } else {
    amazon_es {
      hosts => ["vpc-XXXX-XXXX.ap-southeast-1.es.amazonaws.com"]
      region => "ap-southeast-1"
      aws_access_key_id => 'XXXX'
      aws_secret_access_key => 'XXXX'
      index => "nginx-error-logs-%{+YYYY.MM.dd}"
    }
  }
  stdout {
    codec => rubydebug
  }
}
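
A quick way to reproduce such a parse error without starting the service is Logstash's config test mode (the binary path below assumes a default package install):

sudo /usr/share/logstash/bin/logstash --config.test_and_exit -f /etc/logstash/conf.d/logstash-nginx-es.conf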

And we are receiving the below error while starting Logstash:

[2020-10-12T06:05:39,183][INFO ][logstash.runner          ] Starting Logstash {"logstash.version"=>"7.9.2", "jruby.version"=>"jruby 9.2.13.0 (2.5.7) 2020-08-03 9a89c94bcc OpenJDK 64-Bit Server VM 25.265-b01 on 1.8.0_265-b01 +indy +jit [linux-x86_64]"}
[2020-10-12T06:05:39,861][WARN ][logstash.config.source.multilocal] Ignoring the 'pipelines.yml' file because modules or command line options are specified
[2020-10-12T06:05:41,454][ERROR][logstash.agent           ] Failed to execute action {:action=>LogStash::PipelineAction::Create/pipeline_id:main, :exception=>"LogStash::ConfigurationError", :message=>"Expected one of [ \\t\\r\\n], \"#\", \"{\", \",\", \"]\" at line 32, column 263 (byte 918) after filter {\n  if ([fields][log_type] == \"type1\") {\n    grok {\n      match => [ \"message\" , \"%{NGINXACCESS}+%{GREEDYDATA:extra_fields}\"]\n      overwrite => [ \"message\" ]\n    }\n    mutate {\n      convert => [\"response\", \"integer\"]\n      convert => [\"bytes\", \"integer\"]\n      convert => [\"responsetime\", \"float\"]\n    }\n    geoip {\n      source => \"clientip\"\n      target => \"geoip\"\n      add_tag => [ \"nginx-geoip\" ]\n    }\n    date {\n      match => [ \"timestamp\" , \"dd/MMM/YYYY:HH:mm:ss Z\" ]\n      remove_field => [ \"timestamp\" ]\n    }\n    useragent {\n      source => \"user_agent\"\n    }\n  } else {\n      grok {\n        match => [ \"message\" , \"(?<timestamp>%{YEAR}[./]%{MONTHNUM}[./]%{MONTHDAY} %{TIME}) \\[%{LOGLEVEL:severity}\\] %{POSINT:pid}#%{NUMBER:threadid}\\: \\*%{NUMBER:connectionid} %{GREEDYDATA:message}, client: %{IP:client}, server: %{GREEDYDATA:server}, request: \"", :backtrace=>["/usr/share/logstash/logstash-core/lib/logstash/compiler.rb:32:in `compile_imperative'", "org/logstash/execution/AbstractPipelineExt.java:183:in `initialize'", "org/logstash/execution/JavaBasePipelineExt.java:69:in `initialize'", "/usr/share/logstash/logstash-core/lib/logstash/java_pipeline.rb:44:in `initialize'", "/usr/share/logstash/logstash-core/lib/logstash/pipeline_action/create.rb:52:in `execute'", "/usr/share/logstash/logstash-core/lib/logstash/agent.rb:357:in `block in converge_state'"]}
[2020-10-12T06:05:41,795][INFO ][logstash.agent           ] Successfully started Logstash API endpoint {:port=>9600}
[2020-10-12T06:05:46,685][INFO ][logstash.runner          ] Logstash shut down.
[2020-10-12T06:05:46,706][ERROR][org.logstash.Logstash    ] java.lang.IllegalStateException: Logstash stopped processing because of an error: (SystemExit) exit

There seems to be some formatting issue. Please help us find what the problem is.

=================================UPDATE===================================

For all those who are looking for robust grok filters for Nginx access and error logs, please try the filter patterns below.

Access_Logs - %{IPORHOST:remote_ip} - %{DATA:user_name} \[%{HTTPDATE:access_time}\] \"%{WORD:http_method} %{URIPATHPARAM:url} HTTP/%{NUMBER:http_version}\" %{NUMBER:response_code} %{NUMBER:body_sent_bytes} \"%{SPACE:referrer}\" \"%{DATA:agent}\" %{NUMBER:duration} req_header:\"%{DATA:req_header}\" req_body:\"%{DATA:req_body}\" resp_header:\"%{DATA:resp_header}\" resp_body:\"%{GREEDYDATA:resp_body}\"
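
For illustration, a made-up access-log line this pattern would parse (the req_header/req_body/resp_header/resp_body fields come from a customized Nginx log_format, not the stock combined format):

203.0.113.10 - - [12/Oct/2020:06:05:39 +0000] "GET /index.html HTTP/1.1" 200 612 "" "Mozilla/5.0" 0.004 req_header:"-" req_body:"-" resp_header:"-" resp_body:"-"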

Error_Logs - (?<timestamp>%{YEAR}[./]%{MONTHNUM}[./]%{MONTHDAY} %{TIME}) \[%{LOGLEVEL:severity}\] %{POSINT:pid}#%{NUMBER:threadid}\: \*%{NUMBER:connectionid} %{DATA:errormessage}, client: %{IP:client}, server: %{IP:server}, request: \"(?<httprequest>%{WORD:httpcommand} %{NOTSPACE:httpfile} HTTP/(?<httpversion>[0-9.]*))\", host: \"%{NOTSPACE:host}\"(, referrer: \"%{NOTSPACE:referrer}\")?
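
And a made-up error-log line the second pattern would parse (note that it captures server with %{IP}, so it only matches when Nginx logs the server as an IP address):

2020/10/12 06:05:39 [error] 1234#0: *5678 open() "/usr/share/nginx/html/favicon.ico" failed (2: No such file or directory), client: 203.0.113.10, server: 10.0.0.1, request: "GET /favicon.ico HTTP/1.1", host: "example.com"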

1 Answer

karan shah (Best Answer):

The grok pattern on line 32 is the issue: all literal " characters inside the pattern need to be escaped. Below is an escaped version of the grok.

grok {
  match => [ "message" , "(?<timestamp>%{YEAR}[./]%{MONTHNUM}[./]%{MONTHDAY} %{TIME}) \[%{LOGLEVEL:severity}\] %{POSINT:pid}#%{NUMBER:threadid}\: \*%{NUMBER:connectionid} %{GREEDYDATA:message}, client: %{IP:client}, server: %{GREEDYDATA:server}, request: \"(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion}))\"(, upstream: \"%{GREEDYDATA:upstream}\")?, host: \"%{DATA:host}\"(, referrer: \"%{GREEDYDATA:referrer}\")?"]
  overwrite => [ "message" ]
}
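
An alternative that avoids the backslashes entirely: Logstash config strings can also be single-quoted, and double quotes inside single quotes need no escaping, e.g.:

grok {
  match => [ "message" , '(?<timestamp>%{YEAR}[./]%{MONTHNUM}[./]%{MONTHDAY} %{TIME}) \[%{LOGLEVEL:severity}\] %{POSINT:pid}#%{NUMBER:threadid}\: \*%{NUMBER:connectionid} %{GREEDYDATA:message}, client: %{IP:client}, server: %{GREEDYDATA:server}, request: "(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion}))"(, upstream: "%{GREEDYDATA:upstream}")?, host: "%{DATA:host}"(, referrer: "%{GREEDYDATA:referrer}")?' ]
  overwrite => [ "message" ]
}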