How do I parse the following format string?


I have data in the following format:

<foo bar> <property abc> <this foo bar> .

There are essentially four parts in this string: "foo bar", "property abc", "this foo bar", and the trailing ".". How do I tokenize the string into these four parts?


There are 2 answers

Brinnis On BEST ANSWER
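String string = "<foo bar> <property abc> <this foo bar> .";  // sample input from the question
// Splitting on "> " strips the closing bracket from every token except the last.
String[] array = string.split("> ");

for (int i = 0; i < array.length - 1; i++) {
    // Re-append the ">" that split() consumed.
    System.out.println(array[i] + ">");
}
// The last element is the terminating "." and needs no bracket.
System.out.println(array[array.length - 1]);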
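
The split above keeps the angle brackets and depends on exactly one space following each ">". If the bare contents are wanted instead ("foo bar" rather than "<foo bar>"), a regular expression is one alternative. The following is only a minimal sketch: the class name TripleTokenizer and the method tokenize are made up for illustration, and it assumes the input contains nothing but bracketed terms and a terminating ".". It is not a real N-Triples parser and does not handle literals, escapes, or blank nodes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of any library.
class TripleTokenizer {
    // Matches either the text between '<' and '>' (captured as group 1)
    // or a standalone '.' outside of any brackets.
    private static final Pattern TOKEN = Pattern.compile("<([^>]*)>|\\.");

    static List<String> tokenize(String line) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(line);
        while (m.find()) {
            // group(1) is null when the match was the terminating '.'.
            tokens.add(m.group(1) != null ? m.group(1) : m.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Sample input taken from the question.
        System.out.println(tokenize("<foo bar> <property abc> <this foo bar> ."));
    }
}
```

Because the pattern looks for the brackets themselves, it does not care how much whitespace separates the terms, which the split on "> " does.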
RobV On

As others have suggested, if you want to parse RDF graphs, just use a library like Apache Jena (disclaimer: I am one of the developers).

If your problem is rather that you need direct control over the parsing process, there are several options:

  • Jena has a TokenizerText class that can tokenize N-Triples/Turtle/SPARQL-like data, if you want to work with the data at the textual level
  • You can implement the StreamRDF interface and use it with the built-in parsers to control what happens to the data as it is parsed at the triple/quad level