Protege 4 - Saving RDF reformats nested blank nodes

I just switched from TopBraid to try out Protege.

I have an ontology with some RDF that resembles this:

instances:some_thing1 a semapi:SomeClass ;
                               semapi:hasChainTo (
                                      [ 
                                            a semapi:SomeOtherClass ;
                                            semapi:hasChainTo (
                                                 [ ... ]
                                                 [ ... ]
                                            )
                                      ] 
                              ) .

The idea is that this nested blank node syntax works well: the chains get very deep, and the syntax stays readable and maintainable as chains change from time to time and new chains are added.

Not only that, but I have already written queries for the resulting graph.

The problem is that if I import this into Protege and then save it right back out, the result is reformatted to something like:

   instance:some_thing1 rdf:type semapi:SomeClass ,
                               owl:NamedIndividual ;
                               semapi:hasChainTo [ ] .


   [ rdf:type semapi:SomeClass ;
       semapi:hasChainTo [ ]
   ] .

The resulting RDF completely breaks the querying system as well as the other benefits of using this approach to represent "chaining".

Is there any way I can get around this? If not I may be forced to switch back to TopBraid.

UPDATE: Here is a reproduction of the issue:

I wrote bugTest.ttl, then opened it in Protege and immediately chose Save As > Turtle > bugTestOutput.ttl:

https://dl.dropboxusercontent.com/u/13814624/bugTest.ttl https://dl.dropboxusercontent.com/u/13814624/bugTestOutput.ttl

Answer by Joshua Taylor (best answer)

In short, your ontology is not a valid OWL ontology, and Protégé is following the “garbage in, garbage out” principle. Since some bad data is coming in (though Protégé does try to salvage it), you get bad data out (actually, just the salvaged data). You can validate an ontology with the Manchester OWL Validator, but you'll need to select the OWL 2 DL profile to get the appropriate diagnostics. On your document, the output is:

The ontology and/or one of its imports is NOT in the OWL 2 DL profile

Imports Closure

Ontology IRI                                         Physical URI
OntologyID(OntologyIRI(<http://ideation.io/semapi>))

Detailed report

Use of reserved vocabulary for class IRI

SubClassOf(semapi:BaseClass rdfs:Class)

Use of undeclared class

SubClassOf(semapi:BaseClass rdfs:Class)

Aside from the fact that you have a triple:

<http://ideation.io/semapi>
      a       owl:Ontology .

in the first file, this doesn't appear to be an OWL ontology at all. E.g.,

semapi:BaseClass a rdfs:Class; 
                 rdfs:subClassOf rdfs:Class .

is defining some classes that could be used in an RDFS vocabulary, but it doesn't declare any owl:Classes. When you do something like

semapi:hasChainTo a owl:ObjectProperty; 
                  rdfs:domain semapi:BaseClass;
                  rdfs:range  semapi:BaseClass .

you've got an owl:ObjectProperty whose domain and range are both semapi:BaseClass. Each semapi:BaseClass is also an rdfs:Class, so the object property ends up relating rdfs:Classes, but in OWL DL, object properties can only relate individuals. Where you start using RDF lists, i.e., in:

instances:Instance1 a semapi:DerivedClass;
                        semapi:hasChainTo (
                            [
                                a semapi:DerivedClass;
                                semapi:hasChainTo (
...

you're using an RDF list as the object in an object property assertion. RDF lists can't be used in OWL DL, however, because the RDF list vocabulary is also used in the RDF serialization of OWL itself. It would seem, then, that Protégé is discarding a bunch of information that isn't meaningful to it as the RDF serialization of an OWL ontology. One might argue that when Protégé doesn't know what to do with some incoming RDF, it should preserve it, but that's really an untenable position when RDF is just one possible serialization of the thing Protégé is actually concerned with (an OWL ontology).
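
To make the clash concrete, here is a minimal sketch of what the Turtle collection syntax stands for once the parentheses are expanded. The prefix IRIs are taken from the diagnostics in this answer; the blank node labels _:list1 and _:item1 are made up for illustration, and the list is cut down to a single item.

```turtle
@prefix rdf:       <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix semapi:    <http://ideation.io/semapi#> .
@prefix instances: <http://ideation.io/instances#> .

# Collection shorthand, in the style of the question:
#   instances:Instance1 semapi:hasChainTo ( [ a semapi:DerivedClass ] ) .
#
# The same triples written out. _:list1 and _:item1 are illustrative labels.
instances:Instance1 semapi:hasChainTo _:list1 .

_:list1 rdf:first _:item1 ;    # first (and only) element of the list
        rdf:rest  rdf:nil .    # rdf:nil marks the end of the list

_:item1 a semapi:DerivedClass .
```

The rdf:first / rdf:rest / rdf:nil terms are the same ones the OWL-to-RDF mapping uses to encode its own constructs (for example the operands of owl:unionOf or owl:intersectionOf), which is why a list node cannot also serve as an ordinary individual in an OWL DL object property assertion.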

Pellet's lint tool produces a number of warnings:

[Untyped classes]
- http://ideation.io/semapi#DerivedClass
- http://ideation.io/semapi#BaseClass
- http://www.w3.org/2000/01/rdf-schema#Class

[Untyped individuals]
- 6 BNode(s)

[Using rdfs:Class instead of owl:Class]
- http://ideation.io/semapi#DerivedClass
- http://ideation.io/semapi#BaseClass



=========================================================
OWL 2 DL violations found for ontology <http://ideation.io/semapi>:
Use of undeclared class: <http://ideation.io/semapi#BaseClass> [ObjectPropertyRange(<http://ideation.io/semapi#hasChainTo> <http://ideation.io/semapi#BaseClass>) in <http://ideation.io/semapi>]
Use of undeclared class: <http://ideation.io/semapi#DerivedClass> [ClassAssertion(<http://ideation.io/semapi#DerivedClass> _:genid5) in <http://ideation.io/semapi>]
Use of undeclared class: rdfs:Class [SubClassOf(<http://ideation.io/semapi#BaseClass> rdfs:Class) in <http://ideation.io/semapi>]
Use of undeclared class: <http://ideation.io/semapi#DerivedClass> [ClassAssertion(<http://ideation.io/semapi#DerivedClass> _:genid11) in <http://ideation.io/semapi>]
Use of undeclared class: <http://ideation.io/semapi#BaseClass> [SubClassOf(<http://ideation.io/semapi#DerivedClass> <http://ideation.io/semapi#BaseClass>) in <http://ideation.io/semapi>]
Use of undeclared class: <http://ideation.io/semapi#DerivedClass> [ClassAssertion(<http://ideation.io/semapi#DerivedClass> _:genid9) in <http://ideation.io/semapi>]
Use of undeclared class: <http://ideation.io/semapi#BaseClass> [SubClassOf(<http://ideation.io/semapi#BaseClass> rdfs:Class) in <http://ideation.io/semapi>]
Use of undeclared class: <http://ideation.io/semapi#DerivedClass> [ClassAssertion(<http://ideation.io/semapi#DerivedClass> _:genid1) in <http://ideation.io/semapi>]
Use of undeclared class: <http://ideation.io/semapi#BaseClass> [ObjectPropertyDomain(<http://ideation.io/semapi#hasChainTo> <http://ideation.io/semapi#BaseClass>) in <http://ideation.io/semapi>]
Use of undeclared class: <http://ideation.io/semapi#DerivedClass> [ClassAssertion(<http://ideation.io/semapi#DerivedClass> _:genid7) in <http://ideation.io/semapi>]
Use of reserved vocabulary for class IRI: rdfs:Class [SubClassOf(<http://ideation.io/semapi#BaseClass> rdfs:Class) in <http://ideation.io/semapi>]
Use of undeclared class: <http://ideation.io/semapi#DerivedClass> [ClassAssertion(<http://ideation.io/semapi#DerivedClass> _:genid3) in <http://ideation.io/semapi>]
Use of undeclared class: <http://ideation.io/semapi#DerivedClass> [SubClassOf(<http://ideation.io/semapi#DerivedClass> <http://ideation.io/semapi#BaseClass>) in <http://ideation.io/semapi>]
Use of undeclared class: <http://ideation.io/semapi#DerivedClass> [ClassAssertion(<http://ideation.io/semapi#DerivedClass> <http://ideation.io/instances#Instance1>) in <http://ideation.io/semapi>]


No OWL lints found for ontology <http://ideation.io/semapi>.

<http://ideation.io/semapi> does not import other ontologies.
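
As a rough sketch only, the "undeclared class" and "rdfs:Class instead of owl:Class" diagnostics above point toward declarations along these lines. The class and property names come from the reports; the reshaped instance at the bottom is a hypothetical alternative to the RDF list, it drops any ordering between chained items, and I have not run this fragment through the validator.

```turtle
@prefix owl:       <http://www.w3.org/2002/07/owl#> .
@prefix rdfs:      <http://www.w3.org/2000/01/rdf-schema#> .
@prefix semapi:    <http://ideation.io/semapi#> .
@prefix instances: <http://ideation.io/instances#> .

<http://ideation.io/semapi> a owl:Ontology .

# Declare the classes as owl:Class, and do not make them subclasses of rdfs:Class.
semapi:BaseClass    a owl:Class .
semapi:DerivedClass a owl:Class ;
                    rdfs:subClassOf semapi:BaseClass .

semapi:hasChainTo a owl:ObjectProperty ;
                  rdfs:domain semapi:BaseClass ;
                  rdfs:range  semapi:BaseClass .

# Hypothetical reshaping of a chain without an RDF list: each step is a
# nested anonymous individual. Only suitable if item order does not matter.
instances:Instance1 a owl:NamedIndividual , semapi:DerivedClass ;
    semapi:hasChainTo [ a semapi:DerivedClass ;
                        semapi:hasChainTo [ a semapi:DerivedClass ] ] .
```

If the order of the chained items matters to your queries, this reshaping is not enough on its own, and keeping the data as plain RDF outside Protégé may be the simpler route.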