I have a file that contains line-separated JSON objects as well as non-JSON data (stderr stack traces).
{"timestamp": "20170104T17:10:39", "retry": 0, "level": "info", "event": "failed to download"}
{"timestamp": "20170104T17:10:40", "retry": 1, "level": "info", "event": "failed to download"}
{"timestamp": "20170104T17:10:41", "retry": 2, "level": "info", "event": "failed to download"}
Traceback (most recent call last):
  File "a.py", line 12, in <module>
    foo()
  File "a.py", line 10, in foo
    bar()
  File "a.py", line 4, in bar
    raise Exception("This was unexpected")
Exception: This was unexpected
{"timestamp": "20170104T17:10:42", "retry": 3, "level": "info", "event": "failed to download"}
{"timestamp": "20170104T17:10:43", "retry": 4, "level": "info", "event": "failed to download"}
Using the following config, I'm able to get the valid JSON lines parsed properly, but the non-JSON lines are sent individually (line by line).
filebeat.yml
filebeat.prospectors:
- input_type: log
  document_type: mytype
  json:
    message_key: event
    add_error_key: true
  paths:
    - /tmp/*.log
output:
  console:
    pretty: true
  file:
    path: "/tmp/filebeat"
    filename: filebeat
output:
{
"@timestamp": "2017-01-04T12:03:36.659Z",
"beat": {
"hostname": "...", "name": "...", "version": "5.1.1"
},
"input_type": "log",
"json": {
"event": "failed to download",
"level": "info",
"retry": 2,
"timestamp": "20170104T17:10:41"
},
"offset": 285,
"source": "/tmp/test.log",
"type": "mytype"
}
{
"@timestamp": "2017-01-04T12:03:36.659Z",
"beat": {
"hostname": "...", "name": "...", "version": "5.1.1"
},
"input_type": "log",
"json": {
"event": "Traceback (most recent call last):",
"json_error": "Error decoding JSON: invalid character 'T' looking for beginning of value"
},
"offset": 320,
"source": "/tmp/test.log",
"type": "mytype"
}
I want to combine all consecutive non-JSON lines, up to the next JSON line, into one message.
Using multiline, I tried the following
filebeat.prospectors:
- input_type: log
  document_type: mytype
  json:
    message_key: event
    add_error_key: true
  paths:
    - /tmp/*.log
  multiline:
    pattern: '^{'
    negate: true
    match: after
output:
  console:
    pretty: true
  file:
    path: "/tmp/filebeat"
    filename: filebeat
But it doesn't seem to work. It applies the multiline rules to the value of the event key, which was specified in json.message_key.
From the docs here, I understand why that is happening:
json.message_key - JSON key on which to apply the line filtering and multiline settings. This key must be top level and its value must be string, otherwise it is ignored. If no text key is defined, the line filtering and multiline features cannot be used.
Is there any other way to combine consecutive non-JSON lines into a single message?
I'd like the entire stack trace to be captured before it is sent to Logstash.
Filebeat applies the multiline grouping after the JSON parsing, so the multiline pattern cannot be based on the characters that make up the JSON object (e.g. {).
In Filebeat there is another way to do JSON parsing, in which the parsing occurs after the multiline grouping, so your pattern can include the JSON object characters. You need Filebeat 5.2 (soon to be released) because the target field was added to the decode_json_fields processor, so that you can specify where the decoded JSON fields will be added to the event.
I tested the multiline pattern here using the Golang playground.
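A minimal sketch of what such a configuration might look like, assuming Filebeat 5.2 or later. The json.* prospector options are left out so that multiline grouping runs on the raw lines, and the JSON is decoded afterwards by a processor. The pattern '^({|Traceback)' is an assumption that every stack trace begins with the word Traceback, as in the sample log: it lets both a JSON line and a traceback header start a new event, so the trace lines are not appended to the preceding JSON line. The target name json is an illustrative choice.

```yaml
filebeat.prospectors:
- input_type: log
  document_type: mytype
  paths:
    - /tmp/*.log
  multiline:
    # Assumption: a new event starts with "{" (JSON) or "Traceback" (stack trace).
    pattern: '^({|Traceback)'
    # Lines that do NOT match the pattern are appended to the previous event.
    negate: true
    match: after

processors:
# Decode only events whose raw message looks like a JSON object.
- decode_json_fields:
    when:
      regexp:
        message: '^{'
    fields: ["message"]
    # Illustrative choice; an empty string ("") would merge the
    # decoded keys into the root of the event instead.
    target: "json"
# Drop the raw text once it has been decoded, for JSON events only.
- drop_fields:
    when:
      regexp:
        message: '^{'
    fields: ["message"]

output.console.pretty: true
```

With a layout like this, events starting with { should carry their decoded fields under json, while a stack trace should arrive as a single multiline message.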
Filebeat produces the following output (using the log lines you gave above as the input). (I used a build from the master branch.)