Get weighted shortest path in an Openlink Virtuoso Knowledge Graph

68 views Asked by At

Context

Given a certain KG with the following structure

<distanceTo:1> <weight>              3               .
<distanceTo:1> <weight>              4               .

<a>            <distance_with_weight> <distanceTo:1> .
<distanceTo:1> <goes_to>              <b>            .
<b>            <distance_with_weight> <distanceTo:2> .
<distanceTo:2> <goes_to>              <c>            .

I'd say that there is a path between a and c with a weight of 3 + 4 = 7.

Question

Is there a way to calculate weighted shortest path using SPARQL or SPARQL-BI?

I know there's the TRANSITIVE and T_SHORTEST clauses, but as far as I understand they only work with simple edges (with no weights).

I understand that I can enumerate all the paths and take the minimum sum of weights, but that'd be undesirable since, in this context, the SSSP Dijkstra algorithm could be way faster.

What I have so far

SELECT ?t, total_weight
WHERE {
  <a> <distance_with_weight>/<goes_to>* ?t.
  # here I need the sum of the weights involved so that I can get the minimum later.
}
1

There are 1 answers

0
Antoine Zimmermann On

I do not know SPARQL-BI, so maybe there is a solution to the general problem, but a pure SPARQL solution can work if your graph is a tree, that is, if there is a unique path that goes from one point to another. This solution is partly inspired by Joshua Taylor's answer to another question:

PREFIX rdf:     <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs:    <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?destination (SUM(?weight) AS ?total_weight) WHERE { 
  <a> (<distance_with_weight>/<goes_to>)*/<distance_with_weight> ?d .
  ?d <weight> ?weight .
  ?d <goes_to>/(<distance_with_weight>/<goes_to>)* ?destination .
}
GROUP BY ?destination

When the graph is not a tree (or forest), there can be several paths between 2 points, so that the query will add all the weights of all the paths. Cycles are problematic too but not more than multiple paths. Finally, the data needs to be complete, with a weight assigned to all <distanceTo:i> nodes, and for each pair (<a>,<b>), there must be a distinct <distanceTo:i> node.