dc:creator string literal vs. regex FILTER in SPARQL

I am using Europeana's Virtuoso SPARQL Endpoint.

I have been trying to search in SPARQL for content about a specific contributor. To my understanding, this could be carried out this way:

PREFIX dc: <http://purl.org/dc/elements/1.1/>
SELECT ?title 
WHERE {
     ?objectInfo dc:title ?title .
     ?objectInfo dc:creator 'Picasso' .

}

Nevertheless, I get nothing in return.

Alternatively, I used FILTER regex to search for the literal.

PREFIX dc: <http://purl.org/dc/elements/1.1/>
SELECT ?title ?creator
WHERE {
     ?objectInfo dc:title ?title .
     ?objectInfo dc:creator ?creator .
     FILTER regex(?creator, 'Picasso')
}

This worked well and returned the correct results.

My question is: Is it possible to write a SPARQL query that finds the works of a particular artist without using FILTER?

Many thanks.

There are 2 answers

Mark Miller On BEST ANSWER

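I don't think any objects have exactly the string 'Picasso' as their creator. A literal in a triple pattern must match the whole value, whereas regex matches any value that contains the pattern. So a regex filter is a good choice, but slow.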

Here's a way to find the strings your regex is matching:

PREFIX dc: <http://purl.org/dc/elements/1.1/>
SELECT ?creator (count(?creator) as ?ccount)
WHERE {
     ?objectInfo dc:title ?title .
     ?objectInfo dc:creator ?creator .
     FILTER regex(?creator, 'Picasso')
}
group by ?creator
order by ?ccount

It might have been easier for you to see that if you had displayed all variables in the select statement:

PREFIX dc: <http://purl.org/dc/elements/1.1/>
SELECT *
WHERE {
     ?objectInfo dc:title ?title .
     ?objectInfo dc:creator ?creator .
     FILTER regex(?creator, 'Picasso')
}
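
If the pattern is only ever a fixed substring, the SPARQL 1.1 string functions may serve in place of regex. This is a minimal sketch, assuming the endpoint supports contains(), lcase() and str(); lowercasing the value is an illustrative choice to get a case-insensitive match. It is still a FILTER, so it may not be any faster on this endpoint:

```sparql
PREFIX dc: <http://purl.org/dc/elements/1.1/>
SELECT ?objectInfo ?title ?creator
WHERE {
     ?objectInfo dc:title ?title .
     ?objectInfo dc:creator ?creator .
     # Plain substring test instead of a regular expression;
     # str() drops any language tag and also copes with IRI values
     FILTER contains(lcase(str(?creator)), "picasso")
}
```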

If you don't want to use a regex filter, you could enumerate all of the Picasso variants you are looking for:

PREFIX dc: <http://purl.org/dc/elements/1.1/>
SELECT *
WHERE {
     values ?creator { "Picasso, Pablo" "Pablo Picasso" } .
     ?objectInfo dc:title ?title .
     ?objectInfo dc:creator ?creator
}

bif:contains, a Virtuoso-specific full-text search extension, works on this endpoint and is pretty fast:

PREFIX dc: <http://purl.org/dc/elements/1.1/>
SELECT *
WHERE {
     ?objectInfo dc:title ?title .
     ?objectInfo dc:creator ?creator .
     ?creator bif:contains 'Picasso'
     #FILTER regex(?creator, 'Picasso')
}
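
The argument of bif:contains is a Virtuoso full-text expression rather than a plain string, so a multi-word phrase has to be wrapped in its own quotes inside the string. A hedged sketch of a phrase search follows; the phrase 'Pablo Picasso' and the LIMIT are illustrative assumptions, not values checked against Europeana's data:

```sparql
PREFIX dc: <http://purl.org/dc/elements/1.1/>
SELECT ?objectInfo ?title ?creator
WHERE {
     ?objectInfo dc:title ?title .
     ?objectInfo dc:creator ?creator .
     # Inner single quotes mark a phrase;
     # "Pablo AND Picasso" would instead match both words in any order
     ?creator bif:contains "'Pablo Picasso'"
}
LIMIT 100
```

Because bif:contains is not part of the SPARQL standard, a query like this is not expected to run on non-Virtuoso endpoints.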

UninformedUser On

1) Your first query has unconnected triple patterns.

2) Judging by the vocabulary description, I'd guess that dc:creator expects a resource, i.e. a URI. Does using the URI of the entity Picasso not work?

Term Name:   creator
URI:         http://purl.org/dc/elements/1.1/creator
Label:       Creator
Definition:  An entity primarily responsible for making the resource.
Comment:     Examples of a Creator include a person, an organization, or a service. Typically, the name of a Creator should be used to indicate the entity.

It would be good to see your data in order to decide whether a FILTER on literals is necessary or not.
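
One way to inspect the data is sketched here, on the assumption that the endpoint supports the SPARQL 1.1 functions used: list a sample of dc:creator values together with whether each is an IRI and, for literals, its language tag. The variable names ?isIri and ?langTag and the LIMIT of 50 are illustrative choices:

```sparql
PREFIX dc: <http://purl.org/dc/elements/1.1/>
SELECT DISTINCT ?creator
       (isIRI(?creator) AS ?isIri)
       # lang() is only defined for literals, so guard it
       (IF(isLiteral(?creator), lang(?creator), "") AS ?langTag)
WHERE {
     ?objectInfo dc:creator ?creator .
}
LIMIT 50
```

If the values turn out to be literals carrying a language tag, an exact match needs the tag as well (for example "Pablo Picasso"@en, a hypothetical value), which is one more reason a bare 'Picasso' pattern can return nothing.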