SHACL SPARQL targets not giving correct inference while using pySHACL


When I try to do SPARQL-based SHACL validation with pySHACL, I get the wrong result. I am trying to flag processes where the crane capacity is less than or equal to the module weight, using a SHACL-SPARQL constraint (sh:sparql).
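To make the intended check concrete, here is a minimal plain-Python restatement of the rule, using the values from the data graph in this question. It does not use pySHACL or SPARQL; the dictionaries and the function name `invalid_processes` are illustrative assumptions, not part of any library.

```python
from decimal import Decimal

# Values copied from the Turtle data in the question.
crane_capacity = {"Crane_1": Decimal("500"), "Crane_2": Decimal("5000")}
module_weight = {"Module_1": Decimal("800")}
processes = {
    "Process_1": {"hasResource": "Crane_1", "hasAssociation": "Module_1"},
    "Process_2": {"hasResource": "Crane_2", "hasAssociation": "Module_1"},
}


def invalid_processes(processes, crane_capacity, module_weight):
    """Return processes whose crane capacity is <= the module weight.

    Hypothetical helper mirroring FILTER (?cc <= ?mw) in the SPARQL constraint.
    """
    flagged = []
    for name, links in processes.items():
        cc = crane_capacity[links["hasResource"]]
        mw = module_weight[links["hasAssociation"]]
        if cc <= mw:
            flagged.append(name)
    return flagged


print(invalid_processes(processes, crane_capacity, module_weight))
# ['Process_1']
```

So a working validation run should report exactly one violation, for Process_1.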

from pyshacl import validate
shapes_file = '''
@prefix Testsparql: <http://semanticprocess.x10host.com/Ontology/Testsparql#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

Testsparql:PrefixDeclaration
  rdf:type sh:PrefixDeclaration ;
  sh:namespace "http://semanticprocess.x10host.com/Ontology/Testsparql#"^^xsd:anyURI ;
  sh:prefix "Testsparql" ;
.

Testsparql:Processshape
  rdf:type rdfs:Class ;
  rdf:type sh:NodeShape ;
  rdfs:subClassOf owl:Class ;
  sh:sparql [
      sh:message "Invalid process" ;
      sh:prefixes <http://semanticprocess.x10host.com/Ontology/Testsparql> ;
      sh:select """SELECT $this 
        WHERE {
             $this  rdf:type Testsparql:Process.
            $this Testsparql:hasResource ?crane.
            $this Testsparql:hasAssociation ?module.
            ?crane Testsparql:Cranecapacity ?cc.
            ?module Testsparql:Moduleweight ?mw.
                    FILTER (?cc <= ?mw).

     }""" ;
    ] ;
.

'''
shapes_file_format = 'turtle'

data_file = '''
@prefix Testsparql: <http://semanticprocess.x10host.com/Ontology/Testsparql#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

<http://semanticprocess.x10host.com/Ontology/Testsparql>
  rdf:type owl:Ontology ;
  owl:imports <http://datashapes.org/dash> ;
  owl:versionInfo "Created with TopBraid Composer" ;
  sh:declare Testsparql:PrefixDeclaration ;
.
Testsparql:Crane
  rdf:type rdfs:Class ;
  rdfs:subClassOf owl:Class ;
.
Testsparql:Crane_1
  rdf:type Testsparql:Crane ;
  Testsparql:Cranecapacity "500"^^xsd:decimal ;
.
Testsparql:Crane_2
  rdf:type Testsparql:Crane ;
  Testsparql:Cranecapacity "5000"^^xsd:decimal ;
.
Testsparql:Cranecapacity
  rdf:type owl:DatatypeProperty ;
  rdfs:domain Testsparql:Crane ;
  rdfs:range xsd:decimal ;
  rdfs:subPropertyOf owl:topDataProperty ;
.
Testsparql:Module
  rdf:type rdfs:Class ;
  rdfs:subClassOf owl:Class ;
.
Testsparql:Module_1
  rdf:type Testsparql:Module ;
  Testsparql:Moduleweight "800"^^xsd:decimal ;
.
Testsparql:Moduleweight
  rdf:type owl:DatatypeProperty ;
  rdfs:domain Testsparql:Module ;
  rdfs:range xsd:decimal ;
  rdfs:subPropertyOf owl:topDataProperty ;

.
Testsparql:Process
  rdf:type rdfs:Class ;

  rdfs:subClassOf owl:Class ;
  .
Testsparql:ProcessID
  rdf:type owl:DatatypeProperty ;
  rdfs:domain Testsparql:Process ;
  rdfs:range xsd:string ;
  rdfs:subPropertyOf owl:topDataProperty ;
.
Testsparql:Process_1
  rdf:type Testsparql:Process ;
  Testsparql:ProcessID "P1" ;
  Testsparql:hasAssociation Testsparql:Module_1 ;
  Testsparql:hasResource Testsparql:Crane_1 ;
.
Testsparql:Process_2
  rdf:type Testsparql:Process ;
  Testsparql:ProcessID "P2" ;
  Testsparql:hasAssociation Testsparql:Module_1 ;
  Testsparql:hasResource Testsparql:Crane_2 ;
.
Testsparql:hasAssociation
  rdf:type owl:ObjectProperty ;
  rdfs:domain Testsparql:Process ;
  rdfs:range Testsparql:Module ;
  rdfs:subPropertyOf owl:topObjectProperty ;
.
Testsparql:hasResource
  rdf:type owl:ObjectProperty ;
  rdfs:domain Testsparql:Process ;
  rdfs:range Testsparql:Crane ;
  rdfs:subPropertyOf owl:topObjectProperty ;
.

'''
data_file_format = 'turtle'

conforms, v_graph, v_text = validate(data_file, shacl_graph=shapes_file,
                                     target_graph_format=data_file_format,
                                     shacl_graph_format=shapes_file_format,
                                     inference='rdfs', debug=True,
                                     serialize_report_graph=True)
print(conforms)
print(v_graph)
print(v_text)

This prints the following result, which reports that the data conforms:

True
b'@prefix Testsparql: <http://semanticprocess.x10host.com/Ontology/Testsparql#> .\n@prefix owl: <http://www.w3.org/2002/07/owl#> .\n@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .\n@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .\n@prefix sh: <http://www.w3.org/ns/shacl#> .\n@prefix xml: <http://www.w3.org/XML/1998/namespace> .\n@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .\n\n[] a sh:ValidationReport ;\n    sh:conforms true .\n\n'
Validation Report
Conforms: True 

However, if the same shape and data are written as a single RDF file, validation returns False, which is the right answer:

from pyshacl import validate
data_file = '''
# baseURI: http://semanticprocess.x10host.com/Ontology/Testsparql
# imports: http://datashapes.org/dash
# prefix: Testsparql

@prefix Testsparql: <http://semanticprocess.x10host.com/Ontology/Testsparql#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

<http://semanticprocess.x10host.com/Ontology/Testsparql>
  rdf:type owl:Ontology ;
  owl:imports <http://datashapes.org/dash> ;
  owl:versionInfo "Created with TopBraid Composer" ;
  sh:declare Testsparql:PrefixDeclaration ;
.
Testsparql:Crane
  rdf:type rdfs:Class ;
  rdfs:subClassOf owl:Class ;
.
Testsparql:Crane_1
  rdf:type Testsparql:Crane ;
  Testsparql:Cranecapacity "500"^^xsd:decimal ;
.
Testsparql:Crane_2
  rdf:type Testsparql:Crane ;
  Testsparql:Cranecapacity "5000"^^xsd:decimal ;
.
Testsparql:Cranecapacity
  rdf:type owl:DatatypeProperty ;
  rdfs:domain Testsparql:Crane ;
  rdfs:range xsd:decimal ;
  rdfs:subPropertyOf owl:topDataProperty ;
.
Testsparql:Module
  rdf:type rdfs:Class ;
  rdfs:subClassOf owl:Class ;
.
Testsparql:Module_1
  rdf:type Testsparql:Module ;
  Testsparql:Moduleweight "800"^^xsd:decimal ;
.
Testsparql:Moduleweight
  rdf:type owl:DatatypeProperty ;
  rdfs:domain Testsparql:Module ;
  rdfs:range xsd:decimal ;
  rdfs:subPropertyOf owl:topDataProperty ;
.
Testsparql:PrefixDeclaration
  rdf:type sh:PrefixDeclaration ;
  sh:namespace "http://semanticprocess.x10host.com/Ontology/Testsparql#"^^xsd:anyURI ;
  sh:prefix "Testsparql" ;
.
Testsparql:Process
  rdf:type rdfs:Class ;
  rdf:type sh:NodeShape ;
  rdfs:subClassOf owl:Class ;
  sh:sparql [
      sh:message "Invalid process" ;
      sh:prefixes <http://semanticprocess.x10host.com/Ontology/Testsparql> ;
      sh:select """SELECT $this 
        WHERE {
             $this  rdf:type Testsparql:Process.
            $this Testsparql:hasResource ?crane.
            $this Testsparql:hasAssociation ?module.
            ?crane Testsparql:Cranecapacity ?cc.
            ?module Testsparql:Moduleweight ?mw.
                    FILTER (?cc <= ?mw).

     }""" ;
    ] ;
.
Testsparql:ProcessID
  rdf:type owl:DatatypeProperty ;
  rdfs:domain Testsparql:Process ;
  rdfs:range xsd:string ;
  rdfs:subPropertyOf owl:topDataProperty ;
.
Testsparql:Process_1
  rdf:type Testsparql:Process ;
  Testsparql:ProcessID "P1" ;
  Testsparql:hasAssociation Testsparql:Module_1 ;
  Testsparql:hasResource Testsparql:Crane_1 ;
.
Testsparql:Process_2
  rdf:type Testsparql:Process ;
  Testsparql:ProcessID "P2" ;
  Testsparql:hasAssociation Testsparql:Module_1 ;
  Testsparql:hasResource Testsparql:Crane_2 ;
.
Testsparql:hasAssociation
  rdf:type owl:ObjectProperty ;
  rdfs:domain Testsparql:Process ;
  rdfs:range Testsparql:Module ;
  rdfs:subPropertyOf owl:topObjectProperty ;
.
Testsparql:hasResource
  rdf:type owl:ObjectProperty ;
  rdfs:domain Testsparql:Process ;
  rdfs:range Testsparql:Crane ;
  rdfs:subPropertyOf owl:topObjectProperty ;
.
'''
data_file_format = 'turtle'

# No separate shapes graph here: the shapes are read from the data graph itself.
conforms, v_graph, v_text = validate(data_file, shacl_graph=None,
                                     target_graph_format=data_file_format,
                                     inference='rdfs', debug=True,
                                     serialize_report_graph=True)
print(conforms)
print(v_graph)
print(v_text)

This gives the expected result:

    False
    b'@prefix Testsparql: <http://semanticprocess.x10host.com/Ontology/Testsparql#> .\n@prefix owl: <http://www.w3.org/2002/07/owl#> .\n@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .\n@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .\n@prefix sh: <http://www.w3.org/ns/shacl#> .\n@prefix xml: <http://www.w3.org/XML/1998/namespace> .\n@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .\n\n[] a sh:ValidationReport ;\n    sh:conforms false ;\n    sh:result [ a sh:ValidationResult ;\n            sh:focusNode Testsparql:Process_1 ;\n            sh:resultMessage "Invalid process" ;\n            sh:resultSeverity sh:Violation ;\n            sh:sourceConstraint [ sh:message "Invalid process" ;\n                    sh:prefixes <http://semanticprocess.x10host.com/Ontology/Testsparql> ;\n                    sh:select """SELECT $this \n        WHERE {\n\t\t\t $this  rdf:type Testsparql:Process.\n\t\t\t$this Testsparql:hasResource ?crane.\n\t\t\t$this Testsparql:hasAssociation ?module.\n\t\t\t?crane Testsparql:Cranecapacity ?cc.\n\t\t\t?module Testsparql:Moduleweight ?mw.\n\t\t\t\t\tFILTER (?cc <= ?mw).\n\n     }""" ] ;\n            sh:sourceConstraintComponent sh:SPARQLConstraintComponent ;\n            sh:sourceShape Testsparql:Process ;\n            sh:value Testsparql:Process_1 ] .\n\n'
    Validation Report
    Conforms: False
    Results (1):
    Constraint Violation in SPARQLConstraintComponent (http://www.w3.org/ns/shacl#SPARQLConstraintComponent):
        Severity: sh:Violation
        Source Shape: Testsparql:Process
        Focus Node: Testsparql:Process_1
        Value Node: Testsparql:Process_1
        Source Constraint: [ sh:message Literal("Invalid process") ; sh:prefixes <http://semanticprocess.x10host.com/Ontology/Testsparql> ; sh:select Literal("SELECT $this 
            WHERE {
                 $this  rdf:type Testsparql:Process.
                $this Testsparql:hasResource ?crane.
                $this Testsparql:hasAssociation ?module.
                ?crane Testsparql:Cranecapacity ?cc.
                ?module Testsparql:Moduleweight ?mw.
                        FILTER (?cc <= ?mw).

         }") ]
        Message: Invalid process

What is wrong in my implementation?

Answer (Nicholas Car):

In pySHACL, if there is an error in your SHACL file, validation will often report Conforms: True, because the validation tests cannot be run. Try the -m command-line argument to "meta validate", that is, to validate the SHACL shapes you are using first, before using those shapes to validate data.
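From Python, the counterpart of the -m flag is the meta_shacl keyword of validate(). The sketch below is a minimal illustration, not a definitive recipe: the function name `meta_validate_then_validate` is hypothetical, and both graphs are parsed with rdflib first so that no format keywords are needed.

```python
def meta_validate_then_validate(data_ttl, shapes_ttl):
    """Validate Turtle data against Turtle shapes, checking the shapes first.

    Hypothetical helper: both arguments are Turtle strings.
    """
    # Imported lazily so this module loads even where pyshacl is not installed.
    from rdflib import Graph
    from pyshacl import validate

    data_graph = Graph().parse(data=data_ttl, format="turtle")
    shapes_graph = Graph().parse(data=shapes_ttl, format="turtle")

    # meta_shacl=True asks pySHACL to validate the shapes graph against the
    # SHACL-SHACL shapes before it is used to validate the data graph.
    return validate(
        data_graph,
        shacl_graph=shapes_graph,
        meta_shacl=True,
        inference="rdfs",
    )
```

Calling meta_validate_then_validate(data_file, shapes_file) with the two strings from the question should, on a successful run, return the usual (conforms, report graph, report text) tuple. Meta-validation only catches shapes that are malformed with respect to SHACL itself; a shape that is well-formed but selects no focus nodes would still pass.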

I always prefer keeping my shapes graph and data graph separate; this helps with finding problems.
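When the shapes live in their own graph, two things in the question's listings may be worth checking. In the single-file version the shape is Testsparql:Process itself, which is both an rdfs:Class and a sh:NodeShape, so its instances are implicit class targets. In the split version the shape is Testsparql:Processshape, which has no instances and no explicit target, and the sh:declare triple sits in the data graph rather than the shapes graph. The sketch below is one possible standalone shapes graph under those assumptions: it adds sh:targetClass, moves the prefix declaration next to the shape, and drops the rdf:type triple from the query because the target already restricts the focus nodes. The names `SHAPES_TTL` and `validate_separately` are hypothetical.

```python
# A standalone shapes graph with an explicit target (an assumption, not the
# asker's original shape).
SHAPES_TTL = '''
@prefix Testsparql: <http://semanticprocess.x10host.com/Ontology/Testsparql#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

<http://semanticprocess.x10host.com/Ontology/Testsparql>
  rdf:type owl:Ontology ;
  sh:declare Testsparql:PrefixDeclaration ;
.
Testsparql:PrefixDeclaration
  rdf:type sh:PrefixDeclaration ;
  sh:namespace "http://semanticprocess.x10host.com/Ontology/Testsparql#"^^xsd:anyURI ;
  sh:prefix "Testsparql" ;
.
Testsparql:ProcessShape
  rdf:type sh:NodeShape ;
  sh:targetClass Testsparql:Process ;
  sh:sparql [
      sh:message "Invalid process" ;
      sh:prefixes <http://semanticprocess.x10host.com/Ontology/Testsparql> ;
      sh:select """SELECT $this
        WHERE {
            $this Testsparql:hasResource ?crane .
            $this Testsparql:hasAssociation ?module .
            ?crane Testsparql:Cranecapacity ?cc .
            ?module Testsparql:Moduleweight ?mw .
            FILTER (?cc <= ?mw)
        }""" ;
    ] ;
.
'''


def validate_separately(data_ttl, shapes_ttl=SHAPES_TTL):
    """Validate a Turtle data string against a separate Turtle shapes string."""
    # Imported lazily so this module loads even where pyshacl is not installed.
    from rdflib import Graph
    from pyshacl import validate

    data_graph = Graph().parse(data=data_ttl, format="turtle")
    shapes_graph = Graph().parse(data=shapes_ttl, format="turtle")
    return validate(data_graph, shacl_graph=shapes_graph, inference="rdfs")
```

Passing the data_file string from the first listing to validate_separately() is one way to try this out against the same data while keeping the two graphs apart.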