Logstash in k8s - parsing nested JSON from MongoDB and saving every nested field as a separate field

37 views Asked by Denis At 06 September 2023 at 07:33

I'm using Logstash to read documents from a specific MongoDB collection and save them to Elasticsearch. Nested fields are saved to "log_entry" as a single JSON string, starting with "BSON" or "ID" depending on the manipulations I do in the filter.

Here is an example of "log_entry":

"log_entry": {\"_id\": \"122ghgh1111, \"msg_body\": {\"text_one\": 2, \"text_data\": [{\"position\": 1}, {....}, {...}]}}

There is a lot of text in log_entry, so I'm not posting everything, just the structure.

Below is my config (I've tried different approaches, so I'll share them all; none of them does what I'd like to achieve):

logstashPipeline:
   logstash.conf:
      input {
         mongodb {
            uri => 'mongodb://user:password@host:port/<db_name>?directConnection=true'
            placeholder_db_dir => '/opt/logstash-mongodb'
            placeholder_db_name => 'logstash_sqlite.db'
            collection => 'my_collection'
         }
      }
      # First try - still saves the nested JSON as one field

      filter {
         mutate {
            gsub => [ "log_entry", "=>", ": "]
            rename => { "_id" => "mongo_id" }
            remove_field => ["_id"]
         }
         mutate {
            gsub => [ "log_entry", "BSON::ObjectID\('([0-9a-z]+)'\)", '"\1']
            rename => { "_id" => "mongo_id" }
         }
      }
      # Second try - still saves the nested JSON as one field


      filter {
         mutate {
            rename => { "_id" => "mongo_id" }
         }
         grok {
            match => { "log_entry" => "%{WORD}\\\"\:\s\\\"%{WORD}\,\s\\\"%{WORD}\\\"\:\s\{\\\"%{WORD}\\\"\:\s%{WORD:field}" }
         }
      }
      output {
         elasticsearch {
            action => "index"
            index => "mongo_log_data"
            hosts => ["https://<host>:9200"]
            ssl => false
            ssl_certificate_verification => false
            user => "elastic"
            password => "some_password"
         }
      }

I would like it to be saved in ES as:

text_one: 2
text_data_position: 1

etc
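
To make the target shape concrete, the flattening described above can be sketched in plain Ruby, outside Logstash. This is only an illustration under assumptions: the helper name `flatten_keys` is hypothetical, the sample string is a cleaned-up, valid-JSON stand-in for the real log_entry (with an invented second `position` element), and array indexes are deliberately left out of the key names, so repeated keys are collected into a list.

```ruby
require 'json'

# Hypothetical helper: recursively flattens nested hashes and arrays into a
# single-level hash whose keys are the nested key names joined with "_".
# Array elements do not add an index to the key; when the same flattened key
# occurs more than once, its values are collected into an array.
def flatten_keys(value, prefix = nil, out = {})
  case value
  when Hash
    value.each do |key, child|
      flatten_keys(child, prefix ? "#{prefix}_#{key}" : key.to_s, out)
    end
  when Array
    value.each { |child| flatten_keys(child, prefix, out) }
  else
    if out.key?(prefix)
      out[prefix] = Array(out[prefix]) << value
    else
      out[prefix] = value
    end
  end
  out
end

# Assumed sample: valid JSON with the same structure as the log_entry above.
raw = '{"_id": "122ghgh1111", "msg_body": {"text_one": 2, "text_data": [{"position": 1}, {"position": 2}]}}'
doc = JSON.parse(raw)

# Flatten only the msg_body subtree so that keys start at text_one / text_data.
flat = flatten_keys(doc["msg_body"])
puts flat.inspect
```

For this sample the result contains `text_one` with the value 2 and `text_data_position` with the values 1 and 2. The sketch assumes log_entry has already been turned into valid JSON; inside a Logstash ruby filter the resulting pairs could presumably be written back with `event.set(key, value)`, but that wiring is not shown here.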

The problem with using grok is that I don't know how many nested fields are stored in a document, so I don't know how to build the grok regex correctly. What I have built so far catches only one field. I could of course add more regex patterns, but I would like the grok pattern to be as dynamic as possible.

Can you please help me build a working grok pattern that achieves what I need?

Thanks in advance.

UPD:

I've found a regex that does what I need:

It gives me exactly what I need, but when I try it in grok it captures only the first field:
