When Jena deserializes RDF that contains blank nodes, it assigns those nodes fresh IDs on every parse, even when the input is the same RDF. If identical RDF is deserialized multiple times and the resulting models are merged, the blank nodes end up duplicated. Is there a way to avoid or remove the duplication?
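    import java.io.StringReader;
    import org.apache.jena.rdf.model.Model;   // package names assume Jena 3 or later
    import org.apache.jena.rdf.model.ModelFactory;
    import org.apache.jena.riot.Lang;
    import org.apache.jena.riot.RDFDataMgr;

    public class BlankNodeMerge {
        static final String RDF =
            "<http://www.foo.com/subject> " +
            "<http://www.foo.com/predicate> " +
            "[ a <http://www.foo.com/bar> , <http://www.foo.com/baz> ] .";

        public static void main(String... args) {
            // Each read() gives the blank node a fresh ID, even though the input is identical.
            Model m1 = ModelFactory.createDefaultModel().read(new StringReader(RDF), null, "ttl");
            Model m2 = ModelFactory.createDefaultModel().read(new StringReader(RDF), null, "ttl");
            Model m3 = m1.union(m2);
            RDFDataMgr.write(System.out, m3, Lang.TURTLE);
        }
    }

    // Output (the blank node appears twice):
    // <http://www.foo.com/subject>
    //     <http://www.foo.com/predicate> [ a <http://www.foo.com/bar> , <http://www.foo.com/baz> ] ;
    //     <http://www.foo.com/predicate> [ a <http://www.foo.com/bar> , <http://www.foo.com/baz> ] .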
This contrived example is a bit silly, but consider that I'm trying to merge RDF files that may or may not be identical.
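To make the problem concrete, here is a minimal sketch, assuming Jena 3 or later package names. The class `MergeIfDifferent` and the helpers `parse` and `mergeUnlessIsomorphic` are hypothetical names used only for illustration. It shows that the two parsed models are isomorphic even though their union doubles in size, and that checking `Model.isIsomorphicWith` before merging sidesteps the duplication only when the two files are equivalent as a whole.

```java
import java.io.StringReader;

import org.apache.jena.rdf.model.Model;
import org.apache.jena.rdf.model.ModelFactory;

public class MergeIfDifferent {
    // Same Turtle as in the example above: three triples, one blank node.
    public static final String RDF =
        "<http://www.foo.com/subject> " +
        "<http://www.foo.com/predicate> " +
        "[ a <http://www.foo.com/bar> , <http://www.foo.com/baz> ] .";

    // Parse a Turtle string into a fresh in-memory model.
    public static Model parse(String turtle) {
        Model m = ModelFactory.createDefaultModel();
        m.read(new StringReader(turtle), null, "ttl");
        return m;
    }

    // Hypothetical helper: skip the merge when the two graphs are isomorphic,
    // i.e. identical up to blank node renaming. Otherwise fall back to a plain union.
    public static Model mergeUnlessIsomorphic(Model a, Model b) {
        return a.isIsomorphicWith(b) ? a : a.union(b);
    }

    public static void main(String... args) {
        Model m1 = parse(RDF);
        Model m2 = parse(RDF);

        // Plain union keeps both blank nodes: 6 statements.
        System.out.println(m1.union(m2).size());

        // Guarded merge returns m1 unchanged: 3 statements.
        System.out.println(mergeUnlessIsomorphic(m1, m2).size());
    }
}
```

This guard does not help when the files only partially overlap, since then the models are not isomorphic and the shared blank-node structures are still duplicated by the union. That partial-overlap case is the one I'm really asking about.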