Which dump format is fastest to import into Virtuoso Open Source: N3, NT, or RDF/XML?


I am importing some RDF dumps into Virtuoso Open Source edition (6.x). I was wondering if there is a performance difference between importing different serializations of the same data (I have NT/N3/XML available).

While I'm at it: has anyone seen import performance differences when using striping on a single disk?

Answer by TallTed (best answer)

Questions specifically regarding Virtuoso are generally best raised on the public OpenLink Discussion Forums, the Virtuoso Users mailing list, or through a confidential Support Case.

That said, there will be some performance difference between imports of the different serializations, and the difference becomes more obvious as the size of the load increases.

RDF/XML will almost always be relatively slow -- because the entire XML tree must be parsed before any triples can be written to the graph store.

NT leaves out much of the syntactic sugar that must be handled in N3 (prefix declarations and abbreviated statements, for example), so NT will probably be the fastest to load. However, N3 files will typically be smaller (sometimes significantly so) than NT files of the same data set, and this may be a significant consideration in some cases.
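To make that trade-off concrete, here is a minimal Python sketch. It is not Virtuoso's loader and says nothing about Virtuoso's actual timings; it only illustrates why NT can be processed one line at a time with no parser state, and why a prefixed N3-style rendering of the same triples is shorter. The names `iter_triples`, `to_prefixed`, and `NT_SAMPLE` are made up for this illustration, and the regular expression covers only IRIs and plain literals (no blank nodes, language tags, datatypes, or escapes).

```python
import re
from typing import Iterable, Iterator, List, Tuple

Triple = Tuple[str, str, str]

# Simplified N-Triples line: IRI subject, IRI predicate, IRI or plain-literal object.
# This is an assumption-laden subset of the real grammar, for illustration only.
_NT_LINE = re.compile(r'^\s*(<[^>]*>)\s+(<[^>]*>)\s+(<[^>]*>|"[^"]*")\s*\.\s*$')

# Hypothetical sample data.
NT_SAMPLE = (
    '<http://example.org/a> <http://example.org/knows> <http://example.org/b> .\n'
    '<http://example.org/b> <http://example.org/name> "Bee" .\n'
    '<http://example.org/b> <http://example.org/knows> <http://example.org/a> .\n'
)


def iter_triples(lines: Iterable[str]) -> Iterator[Triple]:
    """Yield one triple per line; each line is complete on its own."""
    for line in lines:
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and comments
        match = _NT_LINE.match(line)
        if match is None:
            raise ValueError(f"unsupported N-Triples line: {line!r}")
        yield match.group(1), match.group(2), match.group(3)


def to_prefixed(triples: Iterable[Triple], prefix: str, namespace: str) -> str:
    """Render triples N3-style, abbreviating IRIs in `namespace` with `prefix`.

    Assumes the local part of each IRI is a simple name that is valid
    after the colon.
    """
    def short(term: str) -> str:
        if term.startswith("<" + namespace):
            return f"{prefix}:{term[len(namespace) + 1:-1]}"
        return term

    out: List[str] = [f"@prefix {prefix}: <{namespace}> ."]
    out += [f"{short(s)} {short(p)} {short(o)} ." for s, p, o in triples]
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    triples = list(iter_triples(NT_SAMPLE.splitlines()))
    prefixed = to_prefixed(triples, "ex", "http://example.org/")
    print(prefixed)
    print(len(NT_SAMPLE), "bytes as NT vs", len(prefixed), "bytes prefixed")
```

The prefixed form is shorter, but a reader of it has to remember the `@prefix` declaration across lines, which is a small taste of the extra work an N3 parser does. The only way to know which format loads faster on a given setup is to time the real loads.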

Bottom line -- this question is full of nuance, and there is no universally true answer.

Each post here should cover only one question, but I'll also say -- striping does not usually deliver much benefit on a single device (whether HDD or SSD). This feature delivers most performance benefits when splitting storage over multiple devices, each on its own controller, etc.