Distinct list of values from embedded document in mongodb


I'm going crazy... I come from the SQL world and this is my first real experience with MongoDB.

I have a given JSON structure (I know it is not ideal, but it has to stay this way because of existing data and other applications), stored in MongoDB (v3.4) with Restheart as the HTTP frontend.

The documents look like this:

{
    "_id" : ObjectId("5855cbc9fc3baea81e937261"),
    "_etag" : ObjectId("5855cbc99b971700050d8adc"),
    "log" : [
        "858489b087f7472dbb8d1012dee4cd5d.d",
        NumberLong("12345678901234"),
        {
            "ext" : {
                "text" : "someone did something at somewhere (some street 123) +38 USD",
                "markup" : [
                    [
                        "user",
                        {
                            "plain" : "someone",
                            "team" : "master"
                        }
                    ], [
                        "TEXT",
                        {
                            "plain" : " did something at "
                        }
                    ], [
                        "location",
                        {
                            "name" : "somewhere",
                            "plain" : "some street 123",
                            "team" : "master"
                        }
                    ], [
                        "TEXT",
                        {
                            "plain" : "38"
                        }
                    ], [
                        "TEXT",
                        {
                            "plain" : "USD"
                        }
                    ]
                ],
                "category" : 1,
                "team" : "master"
            }
        }
    ]
}

I want to get a distinct list of the user names, i.e. the "plain" value of the "user" markup entry. In theory, db.logs.distinct("log.2.ext.markup.0.1.plain") would do exactly what I need. But as far as I understand, there is no way to call db.distinct through Restheart. I tried to work around this with views, but it seems I cannot use db.distinct in views either.

These are my experiments:
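To make the intended result concrete, here is a minimal sketch in plain JavaScript of what the dotted path is expected to pick out. It is a simplified model only: it ignores MongoDB's implicit array traversal, and the names resolvePath, distinctValues and sampleDocs are hypothetical, with sample data trimmed to the fields the path touches.

```javascript
// Simplified model of resolving a dotted path with numeric array indexes.
// It ignores MongoDB's implicit traversal of arrays and is only meant to
// show which value the path from the question points at.
function resolvePath(doc, path) {
  return path.split(".").reduce(
    (current, key) => (current == null ? undefined : current[key]),
    doc
  );
}

// Collect the distinct resolved values, skipping documents without the path.
function distinctValues(docs, path) {
  const seen = new Set();
  for (const doc of docs) {
    const value = resolvePath(doc, path);
    if (value !== undefined) seen.add(value);
  }
  return [...seen];
}

// Hypothetical sample data, trimmed to the fields the path touches.
const sampleDocs = [
  { log: ["id-1", 1, { ext: { markup: [["user", { plain: "someone", team: "master" }]] } }] },
  { log: ["id-2", 2, { ext: { markup: [["user", { plain: "someone else", team: "master" }]] } }] },
  { log: ["id-3", 3, { ext: { markup: [["user", { plain: "someone", team: "master" }]] } }] }
];

console.log(distinctValues(sampleDocs, "log.2.ext.markup.0.1.plain"));
```

The goal is one entry per user name, no matter how many log documents mention that user.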

{ "aggrs" : [
  { "stages" : [
        { "_$project" : { "user" : "$log.2.ext.markup.0.1.plain"}},
        { "_$unwind" : "$user"},
        { "_$unwind" : "$user"},
        { "_$unwind" : "$user"},
        { "_$unwind" : "$user"},
        { "_$unwind" : "$user"},
        { "_$unwind" : "$user"},
        { "_$group" : { "_id" : "$user"}}
      ],
    "type" : "pipeline",
    "uri" : "unique_users1"
  },
  { "stages" : [
        { "_$match": { "log": { "_$exists": true, "_$ne": null }}},
        { "_$unwind" : "$log"},
        { "_$unwind" : "$log.2"},
        { "_$unwind" : "$log.2.ext"},
        { "_$unwind" : "$log.2.ext.markup"},
        { "_$unwind" : "$log.2.ext.markup.0"},
        { "_$unwind" : "$log.2.ext.markup.0.1"},
        { "_$group" : { "_id" : "$log.2.ext.markup.0.1.plain"}}
      ],
    "type" : "pipeline",
    "uri" : "unique_users2"
  },
  { "stages" : [
        { "_$match" : {"log" : { "_$exists" : true }}},
        { "_$replaceRoot" : { "newRoot" : { "user": "$log.2.ext.markup.0.1.plain"}}}
      ],
      "type" : "pipeline",
      "uri" : "unique_users3"
  },
  { "stages" : [
        { "_$group" : { "_id" : 1 , "users" : { "_$addToSet" : "$log.2.ext.markup.0.1.plain"}}}
      ],
    "type" : "pipeline",
    "uri" : "unique_users4"
  }
]}

But the results are empty or nearly empty, for example:

{
    "_embedded": {
        "rh:result": [
            {
                "_id": 1,
                "users": [
                    []
                ]
            }
        ]
    },
    "_returned": 1,
    "_size": 1,
    "_total_pages": 1
}
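A possible explanation for the empty inner array, offered as an assumption rather than a confirmed diagnosis: inside aggregation expressions a field path such as "$log.2.ext" treats "2" as a field name, not as an array position, so nothing matches. The sketch below shows one alternative that picks array elements with $arrayElemAt (available since MongoDB 3.2). The helper escapeOperators and the uri "unique_users5" are hypothetical; the helper only mimics the "_$" key prefix used in the definitions above and is not part of Restheart. The pipeline has not been run against the real data.

```javascript
// Sketch only: step into the arrays by position with $arrayElemAt
// instead of numeric segments in the field path.
const pipeline = [
  // log[2] is the embedded object that holds "ext"
  { $project: { entry: { $arrayElemAt: ["$log", 2] } } },
  // markup[0] is the ["user", {...}] pair
  { $project: { pair: { $arrayElemAt: ["$entry.ext.markup", 0] } } },
  // pair[1] is the object carrying "plain" and "team"
  { $project: { user: { $arrayElemAt: ["$pair", 1] } } },
  // one output document per distinct user name
  { $group: { _id: "$user.plain" } }
];

// Hypothetical helper: prefix every key that starts with "$" with "_",
// matching the "_$" convention in the stored definitions above.
// String values such as "$log" are left untouched.
function escapeOperators(value) {
  if (Array.isArray(value)) return value.map(escapeOperators);
  if (value !== null && typeof value === "object") {
    const out = {};
    for (const [key, inner] of Object.entries(value)) {
      out[key.startsWith("$") ? "_" + key : key] = escapeOperators(inner);
    }
    return out;
  }
  return value;
}

// Print a definition in the same shape as the entries in "aggrs".
const definition = {
  stages: escapeOperators(pipeline),
  type: "pipeline",
  uri: "unique_users5" // hypothetical name
};
console.log(JSON.stringify(definition, null, 2));
```

In the mongo shell the unescaped pipeline could be tried directly with db.logs.aggregate(pipeline) before storing the escaped form.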