I am trying to use JTidy to extract data from real-world HTML, but JTidy doesn't parse custom tags. For example:
<html>
<body>
<myCustomTag>some text</myCustomTag>
<anotherCustom>more text</anotherCustom>
</body>
</html>
I can't get the text between the custom tags. I have to use JTidy because I'll be using XPath.
I tried HtmlCleaner, but it doesn't support the full set of XPath functions.
You can also set the properties using a Java Properties object, for example:
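A minimal sketch of that approach, assuming the two custom tags from the question should be treated as block-level elements. The class name `TidyCustomTags` and the helper `customTagProps` are made up for illustration, and the tag names are written in lowercase on the assumption that the parser normalizes element names. The JTidy calls are left as comments so the snippet compiles without the library on the classpath.

```java
import java.util.Properties;

public class TidyCustomTags {
    // Builds the JTidy options that declare the custom tags from the question.
    static Properties customTagProps() {
        Properties props = new Properties();
        // "new-blocklevel-tags" is the Tidy option for declaring extra block-level tags;
        // use "new-inline-tags" instead if the tags should behave like inline elements.
        // Assumption: lowercase names match what the parser sees.
        props.setProperty("new-blocklevel-tags", "mycustomtag, anothercustom");
        return props;
    }

    public static void main(String[] args) {
        Properties props = customTagProps();
        // With JTidy on the classpath, the options would be applied like this:
        //   org.w3c.tidy.Tidy tidy = new org.w3c.tidy.Tidy();
        //   tidy.setConfigurationFromProps(props);
        //   org.w3c.dom.Document doc = tidy.parseDOM(inputStream, null);
        System.out.println(props.getProperty("new-blocklevel-tags"));
    }
}
```

Once the tags are declared, the DOM returned by `parseDOM` should contain them as ordinary elements, so they can be queried with XPath like any other node.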
This should save you having to create and load a configuration file.