MongoDB geospatial search based on many dimensions


Totally stumped on this one. Say my database has documents which contain a field called userTags, which is an object that looks like this:

{
  name: 'obj1',
  userTags: {
    "foo" : 5,
    "bar" : 30,
    "aaa" : 15,
    "bbb" : 21,
    "ccc" : 23
  }
}

When I query the userTags field, I want to do it based on the tags that a particular user supplies. For instance, a user might have the following tags on his account:

var tagsToMatch = {
    "foo" : 44,
    "bar" : 18,
    "aaa" : 45,
    "bbb" : 10,
    "ggg" : 5,
    "mmm" : 90
  }

Note that these example tags are all arbitrary. The search could have 2 tags, it could have 5,000 tags, it's all user-defined and not something I can really control unfortunately. I could maybe write a script that cuts off all but the top 5-10 tags but I wouldn't want to go lower than that.
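The cut-off script mentioned above could look something like the following. This is only a sketch: `topTags` is a hypothetical helper name, and it assumes the tag weights are plain numbers as in the example object.

```javascript
// Hypothetical helper: keep only the n highest-weighted tags of a tag object.
function topTags(tags, n) {
  return Object.fromEntries(
    Object.entries(tags)
      .sort(([, a], [, b]) => b - a) // highest weight first
      .slice(0, n)                   // drop everything past the top n
  );
}

const tagsToMatch = { foo: 44, bar: 18, aaa: 45, bbb: 10, ggg: 5, mmm: 90 };
const trimmed = topTags(tagsToMatch, 3);
console.log(trimmed); // { mmm: 90, aaa: 45, foo: 44 }
```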

At the moment I'm just doing a sort() based on the most-counted tags, e.g.:

{'userTags.aaa': -1, 'userTags.foo': -1, 'userTags.bar': -1, 'userTags.bbb': -1, 'userTags.ccc': -1}

This kinda works, for the most part, but I want something a little bit more tailored to the user in question. For instance, this spits out results in order of aaa, without giving any weight to foo, even though from the user's perspective foo is almost as important as aaa.
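To make "tailored to the user" concrete, one possible reading is a weighted score where every shared tag contributes, rather than a sort dominated by the first key. The sketch below is an assumption, not something from the question: `tagScore` is a hypothetical function, it uses a plain dot product, and it runs client-side, so it does not solve the indexing problem.

```javascript
// Hypothetical scoring: sum of (user's weight * document's count) over shared tags.
function tagScore(tagsToMatch, userTags) {
  let score = 0;
  for (const [tag, weight] of Object.entries(tagsToMatch)) {
    score += weight * (userTags[tag] || 0); // tags missing from the document add 0
  }
  return score;
}

const doc = {
  name: "obj1",
  userTags: { foo: 5, bar: 30, aaa: 15, bbb: 21, ccc: 23 }
};
const wanted = { foo: 44, bar: 18, aaa: 45, bbb: 10, ggg: 5, mmm: 90 };

console.log(tagScore(wanted, doc.userTags)); // 1645
```

With a score like this, foo and aaa both pull a document up the ranking in proportion to how much the user cares about them.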

Geospatial indexing seems like the best option by far. However, I have two major issues here:

  1. I can't ensureIndex on userTags.[tagname] because these are user-defined, there are thousands of them and they're ever-changing.
  2. From what I can see geospatial indexing only works on two dimensions.

What are my options here? I've never used Mongo's geospatial feature so I may be missing the point entirely, can I just index userTags as a whole and run geospatial searches on the tags it contains?

There is 1 answer

Answered by Vladimir Muzhilov

If I understand correctly, you are building a custom tag cloud.

I suggest changing the storage schema to one better suited to this problem.

collections:

  db.users {
    name:"username",
    tag:["tag_value1", "tag_value2", ...]
  }
  db.tags {
    tag: "tag_value1",
    cnt: 1,
    user:"username"
  }

Updating or inserting a tag for a particular user then looks like this:

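// Add the tag to the user's array; $addToSet skips duplicates, upsert creates the user if missing.
db.users.update({name: 'user1'}, {$addToSet: {tag: 'tag1'}}, {upsert: true})
// Increment this user's count for the tag, creating the counter document if missing.
db.tags.update({user: 'user1', tag: 'tag1'}, {$inc: {cnt: 1}}, {upsert: true})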

If the user does not exist, it is created with the given tag. If the tag does not exist for that user, it is created; otherwise its count is incremented. Only unique tags are inserted into the tag array.

The tag array for each user can also be indexed well ;)

After that, the query is simple:

db.tags.find({user: "user1"}, {tag: 1, cnt: 1, _id: 0}).sort({cnt: -1})

Or you can use the aggregation framework for more complex queries, for example a query that groups by user.
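As a rough sketch of such a grouped query, the pipeline below ranks users by their summed counts on a given set of tags. It assumes the `tags` collection layout shown above (`tag`, `cnt`, `user`); `buildTagPipeline` and the output field `total` are hypothetical names chosen for illustration.

```javascript
// Hypothetical helper: build an aggregation pipeline over the `tags` collection
// that ranks users by their total count across the given tag names.
function buildTagPipeline(tagNames) {
  return [
    { $match: { tag: { $in: tagNames } } },                // only the tags of interest
    { $group: { _id: "$user", total: { $sum: "$cnt" } } }, // one row per user
    { $sort: { total: -1 } }                               // highest total first
  ];
}

const pipeline = buildTagPipeline(["foo", "aaa"]);
// In the mongo shell this would be run as: db.tags.aggregate(pipeline)
console.log(JSON.stringify(pipeline));
```

This sums raw counts only; weighting each tag by the searching user's own values would need an extra stage and is not shown here.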