Count resources having a property of a certain type in SPARQL


I have data in a triple store and would like to compute the following:

How many resources 'x' have the object property 'op' with at least 2 different resources 'r' of the same type 'R' as values?

Here is an example of such data in Turtle syntax:

PREFIX ex: <>

ex:doc1
  a ex:Document ;
  ex:mentions p1, p2, p3 .

ex:doc2
  a ex:Document ;
  ex:mentions p4, p5 .

ex:p1
  a ex:Person ;
  ex:hasRole ex:r1 .

ex:p2
  a ex:Person ;
  ex:hasRole ex:r1 .

ex:p3
  a ex:Person ;
  ex:hasRole ex:r2 .

ex:p4
  a ex:Person ;
  ex:hasRole ex:r1 .

ex:p5
  a ex:Person ;
  ex:hasRole ex:r2 .

ex:r1
  a ex:Role1 .

ex:r2
  a ex:Role2 .

The objective is to count resources such as ex:doc1, which has 2 ex:mentions values whose roles are of the same type (r1, of type ex:Role1). Here the result should be 1, leaving aside ex:doc2.

The strategy would be:

  1. Identify resources having the desired property, i.e. documents (doc) having an object property (mentions) pointing to resources (person), where these resources themselves have a property (hasRole) whose values (the role) are of the same type.

  2. count them.

I have difficulties with step 1. For example, this query returns every doc that mentions a p1 with Role1, even when p1 is the only mentioned resource having this property.

SELECT distinct ?doc
WHERE {
    ?doc a ex:Document .
    ?doc ex:mentions ?p1 .
    ?doc ex:mentions ?p2 .
    ?p1 ex:hasRole ?r1 .
    ?p2 ex:hasRole ?r1 .
    ?r1 a ex:Role1 .
}

Many thanks for your help.


There is 1 answer

Joshua Taylor (best answer)

Your data wasn't quite usable as posted (the p1, p2, etc. resources had no prefixes), but after fixing that, I was able to use the following query. You were pretty close; the trick is that you need filter(?p1 != ?p2) to ensure that you're getting two different values of the ex:mentions property. Then you can check that they have roles with a common type using ?p1 ex:hasRole/a ?roleType . ?p2 ex:hasRole/a ?roleType, or even more concisely, ?roleType ^(ex:hasRole/a) ?p1, ?p2. Finally, in the counting, you only want to count distinct values of ?document, so you need (count(distinct ?document) as ?nDocuments):

prefix ex: <>

select
  #-- count ?document, but only count *distinct* values
  #-- of ?document.
  (count(distinct ?document) as ?nDocuments)

where {
  #-- get documents that have two distinct 
  #-- values for the ex:mentions property
  ?document a ex:Document ; ex:mentions ?p1, ?p2
  filter(?p1 != ?p2)

  #-- then check that they have a common role type;
  #-- ^(ex:hasRole/a) is the inverse of the path ex:hasRole/a
  ?roleType ^(ex:hasRole/a) ?p1, ?p2
}

| nDocuments |
| 2          |
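
If the inverse property path is hard to read, the same check can probably be spelled out with explicit intermediate variables. This is only a sketch of the longer form: ?role1 and ?role2 are variable names introduced here for illustration, and the ex: prefix is left exactly as it appears in the query above.

```sparql
prefix ex: <>

select (count(distinct ?document) as ?nDocuments)
where {
  #-- two distinct values of ex:mentions on the same document
  ?document a ex:Document ;
            ex:mentions ?p1, ?p2 .
  filter(?p1 != ?p2)

  #-- each mentioned resource has some role
  #-- (?role1 and ?role2 are illustrative names)
  ?p1 ex:hasRole ?role1 .
  ?p2 ex:hasRole ?role2 .

  #-- the two roles share a type; ?role1 and ?role2 may be
  #-- the same role or different roles of that type
  ?role1 a ?roleType .
  ?role2 a ?roleType .
}
```

This form matches the pattern ?p1 ex:hasRole/a ?roleType . ?p2 ex:hasRole/a ?roleType, with the path steps unfolded. Because only distinct documents are counted, the symmetric (?p1, ?p2) and (?p2, ?p1) matches do not inflate the result.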