Bit match analogue for array of words (fingerprints)


I'm trying to perform a substructure search on a chemical database, using Avalon fingerprints precomputed for every compound. RDKit has a method to compare these fingerprints:

DataStructs.AllProbeBitsMatch(fp1, fp2)

The docs describe this method as follows: "Returns True if all bits in the first argument match all bits in the vector defined by the pickle in the second argument".

The docs talk about bit vectors, but this fingerprint can also be computed "as words" (an array of integers, via the GetAvalonFPAsWords method in RDKit). I can store that array in MongoDB and hopefully perform the search without RDKit, using only the database, which I expect to be much faster.

So this is my question: I need an operation on arrays of words that is equivalent to AllProbeBitsMatch on bit vectors. Ideally this operation should run inside MongoDB, probably using the aggregation features for better performance.
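To make the requirement concrete, here is a minimal sketch of what I mean, assuming both fingerprints have the same number of words and that "match" means every bit set in the probe is also set in the target. The names all_probe_bits_match, mongo_filter and the field name "fp" are hypothetical, chosen only for illustration. The second function builds a filter from MongoDB's $bitsAllSet operator with one clause per array index; I have not verified that this behaves correctly or efficiently on real data, and it assumes the words are stored as integers rather than doubles.

```python
from typing import Dict, Sequence


def all_probe_bits_match(probe_words: Sequence[int],
                         target_words: Sequence[int]) -> bool:
    """Return True if every bit set in probe_words is also set in target_words."""
    if len(probe_words) != len(target_words):
        raise ValueError("fingerprints must have the same number of words")
    # For each word pair, the probe bits survive the AND only if the target has them.
    return all((p & t) == p for p, t in zip(probe_words, target_words))


def mongo_filter(probe_words: Sequence[int], field: str = "fp") -> Dict[str, dict]:
    """Build a query filter with one $bitsAllSet clause per non-zero probe word.

    Assumes the fingerprint is stored as an array of integers under `field`
    (a hypothetical field name), so that "fp.0", "fp.1", ... address single words.
    """
    return {
        f"{field}.{i}": {"$bitsAllSet": word}
        for i, word in enumerate(probe_words)
        if word  # a zero word constrains nothing, so it is skipped
    }
```

The resulting dictionary would be passed as the filter to a find() call on the collection; whether that is actually faster than comparing in RDKit is exactly what I'd like to find out.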

This is the article on RDKit and Avalon fingerprints that I use for reference: http://rdkit.blogspot.com/2013/11/fingerprint-based-substructure.html
