Jackson CSV Parser | Handle invalid values

I am trying to parse a CSV file that is fed from an upstream system.

The CSV file contains some Date and Number fields whose format is agreed in advance. In a few instances the value in a field does not match the expected format. We want to read those values as null, but the Jackson CSV parser throws an exception instead.

Below is my exception:

at com.fasterxml.jackson.databind.exc.InvalidFormatException.from(InvalidFormatException.java:67)
at com.fasterxml.jackson.databind.DeserializationContext.weirdStringException(DeserializationContext.java:1535)
at com.fasterxml.jackson.databind.DeserializationContext.handleWeirdStringValue(DeserializationContext.java:910)
at com.fasterxml.jackson.databind.deser.std.StdDeserializer._parseDate(StdDeserializer.java:523)
at com.fasterxml.jackson.databind.deser.std.StdDeserializer._parseDate(StdDeserializer.java:466)
at com.fasterxml.jackson.databind.deser.std.DateDeserializers$DateBasedDeserializer._parseDate(DateDeserializers.java:195)
at com.fasterxml.jackson.databind.deser.std.DateDeserializers$DateDeserializer.deserialize(DateDeserializers.java:285)
at com.fasterxml.jackson.databind.deser.std.DateDeserializers$DateDeserializer.deserialize(DateDeserializers.java:268)
at com.fasterxml.jackson.databind.deser.impl.MethodProperty.deserializeAndSet(MethodProperty.java:127)
at com.fasterxml.jackson.databind.deser.BeanDeserializer.vanillaDeserialize(BeanDeserializer.java:287)
at com.fasterxml.jackson.databind.deser.BeanDeserializer.deserialize(BeanDeserializer.java:151)
at com.fasterxml.jackson.databind.MappingIterator.nextValue(MappingIterator.java:277)

I have also reported this on the Jackson CSV GitHub page: https://github.com/FasterXML/jackson-dataformat-csv/issues/153

There is 1 answer

Jeronimo Backes

You could try univocity-parsers as it can handle multiple formats in your date/number fields. For example:

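import java.math.BigDecimal;
import java.util.Date;

import com.univocity.parsers.annotations.Format;
import com.univocity.parsers.annotations.Parsed;

public class MyClass {
    // Each format is tried in turn until one matches the input.
    @Format(formats = {"dd-MMM-yyyy", "yyyy-MM-dd"})
    @Parsed
    private Date date;

    @Format(formats = {"$###,###.###", "#0.00"})
    @Parsed
    private BigDecimal amount;
}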

Now, if none of the formats matches a given input, you can handle the error like this:

CsvParserSettings parserSettings = new CsvParserSettings();
parserSettings.detectFormatAutomatically(); // detects line separators and delimiters, so no manual format configuration is needed.

parserSettings.setProcessorErrorHandler(new RetryableErrorHandler<ParsingContext>() {
    @Override
    public void handleError(DataProcessingException error, Object[] inputRow, ParsingContext context) {
        if ("date".equals(error.getColumnName())) { // constant first, so a missing column name cannot cause a NullPointerException
            //if there's an error in the date column, assign a default and proceed with the record.
            setDefaultValue(new Date());
        } else { 
            //else keep the record anyway. Null will be used instead of the value you can't process.
            keepRecord(); //if you don't call keepRecord() the entire row is discarded.
        }
    }
});

Finally, you can parse your input with this:

List<MyClass> myClassList = new CsvRoutines(parserSettings).parseAll(MyClass.class, input);
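
For comparison, the same "try each format, fall back to null" behavior can be sketched with nothing but the JDK. This is only an illustration of the idea, not how univocity-parsers or Jackson implement it: the class name LenientDates and the method parseOrNull are made up for this example, and it uses java.time.LocalDate rather than java.util.Date.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of any library mentioned in this thread.
final class LenientDates {

    // Assumed pattern list, mirroring the @Format annotation on the date field.
    private static final List<DateTimeFormatter> FORMATS = Arrays.asList(
            DateTimeFormatter.ofPattern("dd-MMM-yyyy", Locale.ENGLISH),
            DateTimeFormatter.ofPattern("yyyy-MM-dd", Locale.ENGLISH));

    // Returns the parsed date, or null when no configured format matches.
    static LocalDate parseOrNull(String raw) {
        if (raw == null) {
            return null;
        }
        String text = raw.trim();
        for (DateTimeFormatter format : FORMATS) {
            try {
                return LocalDate.parse(text, format);
            } catch (DateTimeParseException e) {
                // This format did not match; fall through to the next one.
            }
        }
        return null; // unparseable value is read as null instead of throwing
    }

    public static void main(String[] args) {
        System.out.println(parseOrNull("05-Mar-2017"));
        System.out.println(parseOrNull("2017-03-05"));
        System.out.println(parseOrNull("not-a-date"));
    }
}
```

The trade-off is the same as with the error handler above: bad values silently become null, so you may want to log them if the upstream feed is supposed to be clean.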

Hope it helps.

Disclaimer: I'm the author of this library. It's open source and free (Apache 2.0 license).