How can I search in arrays of integers with a compound MongoDB Atlas search query?


I am working on a function that helps me find similar documents, sorted by score, using the full-text search feature of MongoDB Atlas.

I set my collection index as "dynamic".

I am looking for similarities in text fields, such as "name" or "description", but I also want to look in another field, "thematic", that stores integer values (ids) of thematics.


Example:

Let's say that I have a reference document as follows:

{
 name: "test",
 description: "It's a glorious day!",
 thematic: [9, 3, 2, 33]
}

I want my search to match these integers in the thematic field and include their weight in the score calculation.

For instance, if I compare my reference document with:

{
 name: "test2",
 description: "It's a glorious night!",
 thematic: [9, 3, 6, 22]
}

I want to increase the score since the thematic field shares the 9 and 3 values with the reference document.


Question:

What search operator should I use to achieve this? I can pass an array of strings as the query of a text operator, but I don't know how to proceed with integers.

Should I go for another approach, such as splitting the array into several compound.should.term queries?


Edit:

After a fair amount of searching, I found this here and here:

Atlas Search cannot index numeric or date values if they are part of an array.

Before I consider changing the whole data structure of my objects, I want to make sure that there is no workaround.

For instance, could it be done with custom analyzers?

There are 1 answers

Billybobbonnet On BEST ANSWER

I solved it by adding a trigger to my collection. Each time a document is inserted or updated, the trigger updates a string counterpart of thematic (and of other similar fields), e.g. _thematic, which stores the integers as strings. I then use this _thematic field for search.

Here is some sample code demonstrating it:

exports = async function (changeEvent) {
    const fullDocument = changeEvent.fullDocument;

    // Convert every value of the set to its string form so the search index can handle it.
    const format = (itemSet) => Object.values(itemSet).map((item) => item.toString());

    const setter = {
        _thematic: fullDocument.thematic ? format(fullDocument.thematic) : [],
    };
    const docId = changeEvent.documentKey._id;

    const collection = context.services.get("my-cluster").db("dev").collection("projects");

    // Await the write so the function does not return before the update completes.
    await collection.updateOne({ _id: docId }, { $set: setter });
};

I'm pretty sure it can be done in a cleaner way, so if someone posts one, I'll switch the selected answer to theirs.
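
To show how the _thematic field can then feed into the score, here is a minimal sketch of a query builder. It is an illustration rather than the exact query I run: the helper name buildSimilarityPipeline and the limit parameter are made up for this example, the default search index is assumed, and it uses one compound.should clause per thematic id on the assumption that each matching clause adds to the document's score.

```javascript
// Hypothetical helper (not part of any MongoDB API): builds an aggregation
// pipeline that scores documents by similarity to a reference document.
function buildSimilarityPipeline(reference, limit = 10) {
  // Text similarity on the two string fields.
  const should = [
    { text: { query: reference.name, path: "name" } },
    { text: { query: reference.description, path: "description" } },
  ];

  // One clause per thematic id, matched against the string copy kept in
  // _thematic by the trigger. Assumption: every clause that matches
  // contributes to the final score.
  for (const id of reference.thematic || []) {
    should.push({ text: { query: String(id), path: "_thematic" } });
  }

  return [
    { $search: { compound: { should } } },
    { $limit: limit },
    // Expose the relevance score next to the fields of interest.
    {
      $project: {
        name: 1,
        description: 1,
        thematic: 1,
        score: { $meta: "searchScore" },
      },
    },
  ];
}

const pipeline = buildSimilarityPipeline({
  name: "test",
  description: "It's a glorious day!",
  thematic: [9, 3, 2, 33],
});
```

The resulting array can be passed to collection.aggregate(pipeline). With the two documents from the question, the clauses for 9 and 3 would be the ones matching on _thematic.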

Another way to solve this might be a custom analyzer with a character mapping that replaces each digit with its string counterpart. I haven't tried this one, though. See https://docs.atlas.mongodb.com/reference/atlas-search/analyzers/custom/#mapping

Alternatives welcome!