How to configure nekohtml parser to properly close the anchor tag?

I'm using the NekoHTML parser to parse my HTML. Sometimes, by mistake, anchor tags end up nested like this:

<a href="http://abc.com"><a href="http://abc.com">abc</a></a>

After parsing through NekoHTML, I want the content to be corrected like this:

<a href="http://abc.com"></a><a href="http://abc.com">abc</a>

Please help me configure NekoHTML parsing to achieve this.

Update:

I tried the following setting:

parser.setFeature( "http://cyberneko.org/html/features/balance-tags", true );

It made no difference. It doesn't give the result I expected; it returns the same HTML content I gave it.

There is 1 answer

tolitius

You need to set the balance-tags feature, which specifies whether the NekoHTML parser should attempt to balance the tags in the parsed document.

config.setFeature( "http://cyberneko.org/html/features/balance-tags", true );

From the docs:

  • Balancing the tags fixes up many common mistakes by adding missing parent elements, automatically closing elements with optional end tags, and correcting unbalanced inline element tags. In order to process HTML documents as XML, this feature should not be turned off. This feature is provided as a performance enhancement for applications that only care about the appearance of specific elements, attributes, and/or content regardless of the document's ill-formed structure.
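
Since the update in the question reports that balance-tags alone returned the nested anchors unchanged, one possible workaround is to normalize anchors in a small pre-processing pass before handing the markup to the parser. The sketch below is not part of NekoHTML and is not the library's method; the class name AnchorFixer and the method closeNestedAnchors are hypothetical names chosen for illustration. It assumes anchor tags are simple enough to match with a regular expression (no ">" inside attribute values, no anchors inside comments or scripts).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of NekoHTML: a plain-Java pre-pass over the raw markup.
final class AnchorFixer {

    // Matches an opening <a ...> tag or a closing </a> tag, case-insensitively.
    // Requiring whitespace or '>' right after "a" avoids matching tags such as <abbr>.
    private static final Pattern ANCHOR_TAG =
            Pattern.compile("<(/?)a(\\s[^>]*)?>", Pattern.CASE_INSENSITIVE);

    static String closeNestedAnchors(String html) {
        StringBuilder out = new StringBuilder();
        Matcher m = ANCHOR_TAG.matcher(html);
        boolean anchorOpen = false; // true while we are inside an <a> element
        int last = 0;               // end of the previously copied region

        while (m.find()) {
            // Copy everything between the previous tag and this one unchanged.
            out.append(html, last, m.start());
            boolean isClosingTag = !m.group(1).isEmpty();

            if (!isClosingTag) {
                // A new anchor starts while one is still open: close the old one first.
                if (anchorOpen) {
                    out.append("</a>");
                }
                out.append(m.group());
                anchorOpen = true;
            } else if (anchorOpen) {
                // Normal closing tag for the currently open anchor.
                out.append(m.group());
                anchorOpen = false;
            }
            // A </a> with no open anchor is a leftover from the nesting and is dropped.
            last = m.end();
        }

        // Copy the tail after the last anchor tag. An anchor that is never closed
        // in the input is left as-is.
        out.append(html, last, html.length());
        return out.toString();
    }
}
```

Calling AnchorFixer.closeNestedAnchors on the nested input from the question yields the corrected form the question asks for, and the result can then be passed to the parser as usual. A regex pass like this is only a stopgap for this one pattern; it does not replace the parser's own tag balancing.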