Cast JerichoHTML Parser Elements to String

I am parsing HTML with the Jericho HTML Parser, using the getAllElements method, which returns a List<Element>. However, I need to store the data in a String[]. Whatever I try (even nested casting like (String) ((Object) theList)), it always fails. Any idea how this can be resolved? I couldn't find any help on this in the Jericho docs.

    public static String[] htmlParser(String htmlText) {
        Source source = new Source(htmlText);
        List<Element> filteredList = source.getAllElements("p");
        String[] filteredArray = new String[filteredList.size()];
        for (int i = 0; i < filteredList.size(); i++) {
            filteredArray[i] = (String) ((Object) filteredList.get(i));
        }
        return filteredArray;
    }

The error is the following:

Exception in thread "main" java.lang.ClassCastException: net.htmlparser.jericho.Element cannot be cast to java.lang.String
at InternalLinking.InputKeywordsLinksAlternative.htmlParser(InputKeywordsLinksAlternative.java:156)
at InternalLinking.InputKeywordsLinksAlternative.inputLinksCountLess150(InputKeywordsLinksAlternative.java:70)
at InternalLinking.InputKeywordsLinksAlternative.applyWordCountFilters(InputKeywordsLinksAlternative.java:61)
at InternalLinking.InputKeywordsLinksAlternative.main(InputKeywordsLinksAlternative.java:21)

There is 1 answer

Bernd Ebertz On

In Java, a cast never changes the type of an object; it only tells the compiler that you know the object's type more precisely, and the JVM checks that claim at runtime. That is not the case here: an Element is not a String, hence the ClassCastException. What you want is a conversion. There is no general mechanism for conversions in Java, but converting to String can be done via the object's toString() method or, in a null-safe manner, via String.valueOf().

for (int i = 0; i < filteredList.size(); i++) {
    filteredArray[i] = String.valueOf(filteredList.get(i));
}
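
To make the difference between a cast and a conversion concrete, here is a minimal sketch that runs without Jericho on the classpath. The Tag class is a hypothetical stand-in for a parser element (it is not Jericho's Element), and toStrings is just an illustrative helper name. It converts a whole list using String.valueOf, the same idea as the loop above.

```java
import java.util.Arrays;
import java.util.List;

class CastVsConvert {

    // Hypothetical stand-in for a parser element; NOT Jericho's Element class.
    static final class Tag {
        private final String name;
        private final String text;

        Tag(String name, String text) {
            this.name = name;
            this.text = text;
        }

        @Override
        public String toString() {
            return "<" + name + ">" + text + "</" + name + ">";
        }
    }

    // Conversion: String.valueOf calls toString() on each item and
    // returns the string "null" for a null item instead of throwing.
    static String[] toStrings(List<?> items) {
        return items.stream()
                .map(item -> String.valueOf(item))
                .toArray(String[]::new);
    }

    public static void main(String[] args) {
        Object tag = new Tag("p", "hello");

        try {
            // Cast: compiles because tag is declared as Object,
            // but fails at runtime because the object is a Tag, not a String.
            String s = (String) tag;
            System.out.println(s);
        } catch (ClassCastException e) {
            System.out.println("cast failed");
        }

        // prints [<p>hello</p>, null]
        System.out.println(Arrays.toString(toStrings(Arrays.asList(tag, null))));
    }
}
```

With Jericho itself, String.valueOf(element) should give you whatever Element.toString() returns, which as far as I know is the element's source markup including the tags. If you only want the text inside each p, something like element.getTextExtractor().toString() may be a better fit, but check that against the Jericho docs for your version.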