Java / Android HTML custom tag parser

961 views Asked by derpyderp At 07 February 2015 at 01:28

I'm trying to figure out a way to parse an HTML file with custom tags of the form:

[custom tag="id"]

Here's an example of a file I'm working with:

<p>This is an <em>amazing</em> example. </p>
<p>Such amazement, <span>many wow.</span> </p>
<p>Oh look, a wild [custom tag="amaze"] appears.</p>
We need maor embeds <a href="http://youtu.be/F5nLu232KRo"> bro

What I would like (in an ideal world) is to get back a list of elements:

List foundElements = [text, custom tag, text, link, text]

Where the elements in the above list contain:

Text:

<p>This is an <em>amazing</em> example. </p>
<p>Such amazement, <span>many wow.</span> </p>
<p>Oh look, a wild

Custom tag:

[custom tag="amaze"]

Text:

 appears.</p>
We need maor embeds

Link:

<a href="http://youtu.be/F5nLu232KRo">

Text:

 bro
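
One way the list described above could be produced is with a small hand-written, single-pass scanner that only looks for the two markers and treats everything else as text. The following is a minimal sketch, not a definitive implementation: the names `SegmentScanner`, `Kind`, and `Segment` are hypothetical, and it assumes custom tags always have the exact shape `[custom tag="..."]` and that links start with `<a ` and contain no `>` inside attribute values.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical names throughout: SegmentScanner, Kind and Segment are illustrative only.
final class SegmentScanner {

    enum Kind { TEXT, CUSTOM_TAG, LINK }

    static final class Segment {
        final Kind kind;
        final String raw; // exact substring of the input

        Segment(Kind kind, String raw) {
            this.kind = kind;
            this.raw = raw;
        }

        @Override
        public String toString() {
            return kind + ": " + raw;
        }
    }

    // Assumed marker shapes; anything else is passed through as text.
    private static final String CUSTOM_OPEN = "[custom tag=\"";
    private static final String CUSTOM_CLOSE = "\"]";
    private static final String LINK_OPEN = "<a ";
    private static final String LINK_CLOSE = ">";

    /** Splits the input into text, custom-tag and link segments, left to right. */
    static List<Segment> scan(String input) {
        List<Segment> out = new ArrayList<>();
        int pos = 0; // start of the text that has not been emitted yet
        while (true) {
            int custom = input.indexOf(CUSTOM_OPEN, pos);
            int link = input.indexOf(LINK_OPEN, pos);

            // Pick whichever marker comes first; stop when neither is left.
            boolean isCustom = custom >= 0 && (link < 0 || custom < link);
            if (!isCustom && link < 0) {
                break;
            }
            int start = isCustom ? custom : link;
            String open = isCustom ? CUSTOM_OPEN : LINK_OPEN;
            String close = isCustom ? CUSTOM_CLOSE : LINK_CLOSE;

            int end = input.indexOf(close, start + open.length());
            if (end < 0) {
                break; // unterminated marker: the remainder is emitted as text
            }
            end += close.length();

            if (start > pos) {
                out.add(new Segment(Kind.TEXT, input.substring(pos, start)));
            }
            out.add(new Segment(isCustom ? Kind.CUSTOM_TAG : Kind.LINK,
                    input.substring(start, end)));
            pos = end;
        }
        if (pos < input.length()) {
            out.add(new Segment(Kind.TEXT, input.substring(pos)));
        }
        return out;
    }

    public static void main(String[] args) {
        String html = "<p>Oh look, a wild [custom tag=\"amaze\"] appears.</p>\n"
                + "We need maor embeds <a href=\"http://youtu.be/F5nLu232KRo\"> bro";
        for (Segment s : scan(html)) {
            System.out.println(s);
        }
    }
}
```

Because every segment is an exact substring of the input, concatenating the `raw` fields reproduces the original document, so each text segment can still be handed to an HTML parser such as Jsoup afterwards. The sketch uses only `String.indexOf` and `substring`, which keeps allocations low on an Android client.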

What I've tried:

Jsoup
Jsoup is great and works perfectly for HTML. The issue is that I can't define custom tags that open with "[" and close with "]". Correct me if I'm wrong.
Jericho
Again, like Jsoup, Jericho works great, except for defining custom tags: you're required to use "<".
Java Regex
This is the option I really don't want to go for. It's unreliable, and it involves a lot of brittle string manipulation, especially when you're matching against many regexes.
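
For comparison, the cost of matching against many regexes can be reduced by combining the markers into one precompiled pattern and scanning the input once. The following is a minimal sketch under stated assumptions, not a recommendation: `MarkerFinder` and `findMarkers` are hypothetical names, the pattern assumes double-quoted attribute values with no `>` inside them, and it is still a regex rather than a real HTML parser.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: MarkerFinder and findMarkers are illustrative names only.
final class MarkerFinder {

    // One alternation, compiled once and reused for every document.
    // Group 1 captures the id of a custom tag; group 2 captures the href of a link.
    private static final Pattern MARKER = Pattern.compile(
            "\\[custom tag=\"([^\"]*)\"\\]|<a\\s[^>]*?href=\"([^\"]*)\"[^>]*>");

    /** Returns "custom:<id>" or "link:<href>" for each marker, in document order. */
    static List<String> findMarkers(String input) {
        List<String> found = new ArrayList<>();
        Matcher m = MARKER.matcher(input);
        while (m.find()) {
            if (m.group(1) != null) {
                found.add("custom:" + m.group(1));
            } else {
                found.add("link:" + m.group(2));
            }
        }
        return found;
    }

    public static void main(String[] args) {
        String html = "<p>Oh look, a wild [custom tag=\"amaze\"] appears.</p>\n"
                + "We need maor embeds <a href=\"http://youtu.be/F5nLu232KRo\"> bro";
        for (String marker : findMarkers(html)) {
            System.out.println(marker);
        }
    }
}
```

`Matcher.start()` and `Matcher.end()` give the offsets of each match, so the text between markers can be sliced out of the input in the same loop if the surrounding text segments are needed as well.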

Last but not least, I'm looking for a performance-oriented solution, as this runs on an Android client.

All suggestions welcome!
