Get evaluated cell value of Excel with SAX-Parser of Apache POI

1.3k views Asked by At

I read an excel-sheet with the EventUserModel of Apache POI which operates with a SAX-Parser.

The problem is I have some normal Cells with content like "hello" and Cells with content like this: =IF(SUM(P254;R254;N254)=0;V254;VLOOKUP(Z254;Cirteria!$Y$2:$Z$4;2;TRUE))

I don't want these formulas but their evaluated value e.g. "hello". I know it's saved in the XML but I don't know how to access it.

<Cell ss:StyleID="s168"
    ss:Formula="=IF(SUM(RC[-11],RC[-9],RC[-13])=0,RC[-5],VLOOKUP(RC[-1],Kriterien!R2C25:R4C26,2,TRUE))">
    <Data ss:Type="String">hello</Data>
</Cell>
<Cell ss:StyleID="s167"><Data ss:Type="String">hello</Data></Cell>

It seems as if there is a "formula" tag SAX is always returning this value instead of the Data value. Otherwise the normal "hello" is returned.

My Code looks like this:

public void startElement(String uri, String localName, String name,
        Attributes attributes) throws SAXException {
    if (name.equals("c")) {
        String cellType = attributes.getValue("t");
        if (cellType != null && cellType.equals("s")) {
            nextIsString = true;
        } else {
            nextIsString = false;
        }
    }
    lastContents = "";
}
public void characters(char[] ch, int start, int length)
        throws SAXException {
    lastContents = new String(ch, start, length);
}
public void endElement(String uri, String localName, String name)
        throws SAXException {
    if (nextIsString) {
        int idx = Integer.parseInt(lastContents);
        lastContents = new XSSFRichTextString(sst.getEntryAt(idx))
                .toString();
        System.out.println(lastContents);
        nextIsString = false;
    }
}

I get all the data I need except the data generated by formulas.

1

There are 1 answers

0
Jacob On

This is an old thread, but I was trying to figure out a similar problem, and thought I would post this in hopes it will help someone in the future. The tag structure for a formula within the cell is different than your typical cell / value structure.

Therefore you have to be cognizant of when your formula tag opens and closes within the cell, as well as when the value opens and closes within the cell. The tag structures is similar to (of course without attributes and only one cell in the row for simplicity) ...

`<row> <c> <f> IF(SUM(RC[-11],RC[-9],RC[-13])=0,RC[-5],VLOOKUP(RC[-1],Kriterien!R2C25:R4C26,2,TRUE))</f> <v> hello </v> </c> </row> `

Note that the characters(...) method will store the value of the formula, then the concluded value in between two distinct tags.

So, when evaluating the assigned value, you have to be cognizant of the formula tag's impact on your value.