I have a file containing N-Quads (using the schema.org vocabulary) and I want to load it into a TDB RDF-store, using Apache Jena's command line tools. The command that I'm using is:
tdbloader --loc <rdf_store_location> <file_to_load>
But during loading, I get this error:
[line: 769293, col: 154] Illegal unicode escape sequence value: \" (0x22)
I also ran the validation tool from Jena command line tools:
riot --validate <file_to_load>
and indeed, there are at least 30 errors/warnings similar to this:
Bad IRI
The path contains a segment /../ not at the beginning of a relative reference, or it contains a /./ These should be removed
Is there a way to ignore or delete the invalid N-Quads using command line tools (Jena's, or any others you know of)?
Otherwise, the only option would be to write a script that removes the invalid characters. But the file is huge (60 GB), and I suspect that approach would be very error-prone.
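For reference, the kind of script I have in mind is roughly the sketch below. It is only a sketch under some assumptions: that the escape error comes from a backslash inside an <IRI> that is not a \uXXXX or \UXXXXXXXX escape, that every quad sits on its own line, and that dropping the whole offending line is acceptable. The function and file names are made up for illustration. It works on bytes and reads line by line, so the 60 GB file never has to fit in memory, but it does not validate the rest of the N-Quads grammar and does nothing about the "Bad IRI" messages.

```python
import re

# Matches either a quoted literal (honouring backslash escapes) or an <IRI>.
# Group 1 is set only for IRIs, so "<...>" text inside a literal is skipped.
TOKEN = re.compile(rb'"(?:[^"\\]|\\.)*"|<([^>]*)>')

# Assumption: inside an IRI, a backslash is only legal as \uXXXX or \UXXXXXXXX.
BAD_IRI_ESCAPE = re.compile(rb'\\(?!u[0-9A-Fa-f]{4}|U[0-9A-Fa-f]{8})')


def has_bad_iri_escape(line):
    """Return True if any <IRI> on the line contains an illegal backslash."""
    for match in TOKEN.finditer(line):
        iri = match.group(1)
        if iri is not None and BAD_IRI_ESCAPE.search(iri):
            return True
    return False


def keep_valid(lines):
    """Yield only the lines that pass the IRI escape check."""
    for line in lines:
        if not has_bad_iri_escape(line):
            yield line


def filter_file(src_path, dst_path, rejected_path):
    """Stream src_path, writing good lines to dst_path and the rest to rejected_path."""
    with open(src_path, "rb") as src, \
         open(dst_path, "wb") as dst, \
         open(rejected_path, "wb") as rejected:
        for line in src:
            target = rejected if has_bad_iri_escape(line) else dst
            target.write(line)
```

It would be called as something like filter_file("data.nq", "clean.nq", "rejected.nq") (hypothetical file names), after which clean.nq could be passed to riot --validate again before loading. Keeping the rejected lines in a separate file at least makes it possible to check what was thrown away.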