How to filter down a large Jena Model in TDB


I have a large RDF model that doesn't fit in memory. I currently load the entire thing into TDB, but I would like to filter it down to a subgraph instead: all properties of every resource that is a type of (rdf:type) or a subclass of (rdfs:subClassOf) some "root" concept.

I have tried running a DESCRIBE query against the full TDB model to select the subset of the graph I am interested in ({ ?x rdf:type/rdfs:subClassOf* ?type }). I have two problems:

  1. On a smaller [sample] dataset, the DESCRIBE query completes, but I can't figure out how to write the resulting Model back into the TDB store (I want to throw away all the other data). I tried calling tdbModel.setDefaultModel(), but it throws an exception. So what I am doing now is creating a second TDB location, getting its default model, and adding the result of the DESCRIBE query to that model. Is there a better way?

  2. On the full dataset, I expect the DESCRIBE query to return over 500k triples, and it has been running for a couple of hours without completing. Is there a more efficient way to do this?
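
For concreteness, the DESCRIBE query described above might look roughly like the sketch below. The ex: prefix and ex:Root are placeholders for the real namespace and "root" concept, which the question does not name; the rdf: and rdfs: prefixes are the standard W3C namespaces.

```sparql
PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
# Hypothetical namespace; replace with the namespace of the actual root concept.
PREFIX ex:   <http://example.org/>

# Describe every resource whose rdf:type is ex:Root
# or any direct or indirect subclass of ex:Root.
DESCRIBE ?x
WHERE {
  ?x rdf:type/rdfs:subClassOf* ex:Root .
}
```

With ?type fixed to a single root like this, the property path matches instances of the root and of its subclasses. If the subclasses themselves should also be described, an additional pattern such as ?x rdfs:subClassOf* ex:Root would presumably be needed, for example in a UNION.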
