I have a graph in TinkerPop3 in which the nodes are connected through edges with a given weight.
Given a node A in the graph I'm trying to traverse it to obtain all nodes which can be reached from A with less than N steps, and to sort them by a score. This score is calculated by mapping each edge in the path to that edge's weight divided by the step number and the total step, and finally summing the list.
Here is an example:
gremlin> graph = TinkerGraph.open()
==>tinkergraph[vertices:0 edges:0]
gremlin> g = graph.traversal()
==>graphtraversalsource[tinkergraph[vertices:0 edges:0], standard]
gremlin> a = g.addV('name','a').next()
==>v[0]
gremlin> b = g.addV('name','b').next()
==>v[2]
gremlin> c = g.addV('name','c').next()
==>v[4]
gremlin> d = g.addV('name','d').next()
==>v[6]
gremlin> e = g.addV('name','e').next()
==>v[8]
gremlin> a.addEdge('conn', b, 'weight', 1)
==>e[10][0-conn->2]
gremlin> a.addEdge('conn', c, 'weight', 3)
==>e[11][0-conn->4]
gremlin> b.addEdge('conn', d, 'weight', 3)
==>e[12][2-conn->6]
gremlin> d.addEdge('conn', e, 'weight', 9)
==>e[14][6-conn->8]
The expected scores for nodes related to "a" would be:
score_n = sum_over_n(weight_n-1n / (depth_n * total_N_from_a_to_n))
score_b = w_ab / (1 * 1) = 1 / 1 = 1
score_c = w_ac / (1 * 1) = 3 / 1 = 3
score_d = (w_ab / (1 * 2)) + (w_bd / (2 * 2)) = 1/2 + 3/4 = 5/4
score_e = (w_ab / (1 * 3)) + (w_bd / (2 * 3)) + (w_de / (3 * 3)) = 1/3 + 3/6 + 9/9 = 11/6
And thus, the result should be the corresponding nodes ordered decreasingly:
c e d b
I'm new to TinkerPop and Gremlin and I've been trying a combination of the repeat and sack steps without luck. The plan until now has been to accumulate the step in a sack for each traversal, to emit the score and node, and finally sort by score, but I'm worried about potential performance issues associated to the process.
Not the easiest traversal to start with TinkerPop. First thing I've changed is the data type for weights.
This was necessary to get rid of rounding errors. It could also be done as part of the traversal, but I didn't want to add even more steps as it already became complex enough:
UPDATE
What follows is a working traversal for TP 3.0.1:
The only good thing about this traversal is, that the
weight
properties don't have to have another data type -int
works fine.