I'm currently using TinkerPop Gremlin (with a Titan backend) to implement a "similar document" algorithm.
The following lines work perfectly well in the Gremlin shell:
v = g.v(880068)
m=[:]
v.as('x').out('auto_tag').in('auto_tag').has('status', 1).except('x').groupCount(m).filter{false}
results=[]
m.sort{-it.value}[0..9].each(){key, value -> results.add(key.document_id) }
results
The following results are displayed:
==>3188749
==>3190640
==>3191407
==>3187753
==>3186634
==>3185534
==>3189883
==>3190108
==>3187088
==>3188890
But when I try to wrap the same code in a function, it no longer works:
v = g.v(880068)
def get_similar_documents(v) {
    m=[:]
    v.as('x').out('auto_tag').in('auto_tag').has('status', 1).except('x').groupCount(m).filter{false}
    results=[]
    m.sort{-it.value}[0..9].each(){key, value -> results.add(key.document_id) }
    return results
}
get_similar_documents(v)
...nothing is returned
Coming from a Python background, I assume this is related to variable scope, but so far I don't understand how to fix it.
Thanks in advance for any help
Edit: I'm using Bulbs, which is why I'd like to wrap my code in a function (that I could later call from Python).
I think you need to iterate your pipeline within the get_similar_documents function. It's important to remember that the Gremlin shell automatically iterates pipelines for you. The shell isn't iterating the pipeline inside the function, so groupCount never runs and no side effects are written to your m.
You can read more about this here.
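As a rough sketch of what that change might look like, here is the function from the question with the pipeline explicitly iterated. It assumes a TinkerPop 2.x Gremlin Groovy shell, where pipelines expose an iterate() method, and a graph g like the one in the question. Declaring m and results with def, and dropping the filter{false} step, are my own adjustments rather than part of the original code.

```groovy
// Sketch only: assumes the TinkerPop 2.x Gremlin Groovy shell and the graph `g` from the question.
def get_similar_documents(v) {
    def m = [:]
    // iterate() drains the pipeline, so groupCount(m) actually fills m.
    // Inside a function the shell does not do this automatically.
    v.as('x').out('auto_tag').in('auto_tag').has('status', 1).except('x').groupCount(m).iterate()
    def results = []
    // Same ranking step as in the question: ten highest counts first.
    m.sort{-it.value}[0..9].each(){key, value -> results.add(key.document_id) }
    return results
}

v = g.v(880068)
get_similar_documents(v)
```

Because iterate() consumes every element and returns nothing, the filter{false} step that suppressed output in the shell should no longer be needed.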