Totally stumped on this one. Say my database has documents which contain a field called userTags, which is an object that looks like this:
{
name: 'obj1',
userTags: {
"foo" : 5,
"bar" : 30,
"aaa" : 15,
"bbb" : 21,
"ccc" : 23
}
}
When I query the userTags field, I want to do it based on the tags that a particular user supplies. For instance, a user might have the following tags on his account:
var tagsToMatch = {
"foo" : 44,
"bar" : 18,
"aaa" : 45,
"bbb" : 10,
"ggg" : 5,
"mmm" : 90
}
Note that these example tags are all arbitrary. The search could have 2 tags or it could have 5,000; it's all user-defined and unfortunately not something I can really control. I could maybe write a script that cuts off all but the top 5-10 tags, but I wouldn't want to go lower than that.
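The trimming script mentioned above could look something like this. It is only a sketch: topTags and its limit parameter are names made up for illustration, and it assumes the tags arrive as a plain object of tag name to count, as in tagsToMatch.

```javascript
// Hypothetical helper: keep only the `limit` highest-counted tags.
// Returns a new object and leaves the input untouched.
function topTags(tags, limit) {
  return Object.entries(tags)
    .sort(function (a, b) { return b[1] - a[1]; }) // highest count first
    .slice(0, limit)
    .reduce(function (kept, entry) {
      kept[entry[0]] = entry[1];
      return kept;
    }, {});
}

var tagsToMatch = { foo: 44, bar: 18, aaa: 45, bbb: 10, ggg: 5, mmm: 90 };
var trimmed = topTags(tagsToMatch, 3);
// trimmed is { mmm: 90, aaa: 45, foo: 44 }
```

If limit is larger than the number of tags, all of them are kept.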
At the moment I'm just doing a sort() function based on the most-counted tags, e.g:
{'userTags.aaa': -1, 'userTags.foo': -1, 'userTags.bar': -1, 'userTags.bbb': -1, 'userTags.ccc': -1}
This kinda works, for the most part, but I want something a little bit more tailored to the user in question. For instance, this spits out results in order of aaa, without giving any weight to foo, even though from the user's perspective foo is almost as important as aaa.
Geospatial indexing seems like the best option by far. However, I have two major issues here:
- I can't ensureIndex on userTags.[tagname] because these are user-defined, there are thousands of them and they're ever-changing.
- From what I can see geospatial indexing only works on two dimensions.
What are my options here? I've never used Mongo's geospatial feature so I may be missing the point entirely. Can I just index userTags as a whole and run geospatial searches on the tags it contains?
If I understand correctly, you are building a custom tag cloud.
I suggest changing the storage scheme to one better suited to the problem.
Collections:
An update/insert of a tag for a particular user then behaves as follows:
If the user does not exist, the user is created with that tag. If the user exists but does not have the tag, the tag is created; otherwise the count of that tag for the user is updated. Only unique tags are inserted into the tags array.
The tags array for each user can be indexed well ;)
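The original update statement is not shown here, so the following is only a sketch of one way to get the behaviour described above. It assumes one document per user with a tags array and a counts sub-document; tagUpsert, the users collection and both field names are hypothetical. It also assumes tag names contain no dots and do not start with $, since they are used as field names.

```javascript
// Hypothetical helper: builds the arguments for an upsert that records
// one use of `tag` by `userId`.
function tagUpsert(userId, tag) {
  var inc = {};
  inc["counts." + tag] = 1;        // creates the counter at 1, or adds 1 to it
  return {
    query: { _id: userId },
    update: {
      $inc: inc,
      $addToSet: { tags: tag }     // the array keeps each tag only once
    },
    options: { upsert: true }      // creates the user document if it is missing
  };
}

var op = tagUpsert("user1", "foo");
// In the mongo shell, against the assumed `users` collection:
//   db.users.update(op.query, op.update, op.options);
//   db.users.ensureIndex({ tags: 1 });  // multikey index (createIndex in newer shells)
```

Run once against an empty collection, this should leave a document shaped like { _id: "user1", counts: { foo: 1 }, tags: ["foo"] }.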
After that, the query is simple:
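The original query is missing, so this is a guess at its shape rather than the answer's exact code. It assumes each user document has a tags array of tag names; tagsQuery and the users collection are hypothetical names.

```javascript
// Hypothetical helper: filter for users that have at least one of the
// supplied tags. The counts in `tagsToMatch` are ignored here.
function tagsQuery(tagsToMatch) {
  return { tags: { $in: Object.keys(tagsToMatch) } };
}

var query = tagsQuery({ foo: 44, bar: 18 });
// query is { tags: { $in: ["foo", "bar"] } }
// In the mongo shell:
//   db.users.find(query);
```

Note that this only filters on the tags; it does not rank the results by how well they match.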
Or you can use the aggregation framework for more complex queries, for example a query that groups by user.
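As an illustration of the grouping idea, here is a sketch of a pipeline that counts how many of the requested tags each user has and sorts by that count. It is an assumption about what such a query could look like, not the answer's own code; matchCountPipeline, the matched field and the users collection are made-up names, and it again assumes a tags array per user.

```javascript
// Hypothetical helper: builds an aggregation pipeline that ranks users by
// the number of requested tags they have.
function matchCountPipeline(tagNames) {
  return [
    { $match: { tags: { $in: tagNames } } },  // drop users with no matching tag
    { $unwind: "$tags" },                     // one document per (user, tag)
    { $match: { tags: { $in: tagNames } } },  // keep only the matching tags
    { $group: { _id: "$_id", matched: { $sum: 1 } } },
    { $sort: { matched: -1 } }                // best match first
  ];
}

var pipeline = matchCountPipeline(["foo", "bar", "aaa"]);
// In the mongo shell:
//   db.users.aggregate(pipeline);
```

This ranks by the number of shared tags only. Weighting by the counts would need the counts stored in a form that $unwind can reach, which this sketch does not cover.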