OpenCSV does not comply with the CSV standard (RFC 4180)


I use openCSV to parse a CSV file (the separator is ';' and the quote character is '"'). Consider parsing a malformed row like the one below:
column1;"column2";column""3
The result is an array of values: a[0] = column1, a[1] = column2, a[2] = column"3

But I think that result is wrong, because the input (the string column""3) violates rule 5 of RFC 4180 (https://www.rfc-editor.org/rfc/rfc4180):
Each field may or may not be enclosed in double quotes (however some programs, such as Microsoft Excel, do not use double quotes at all). If fields are not enclosed with double quotes, then double quotes may not appear inside the fields.

Does anyone know how to detect this violation in openCSV?


There is 1 answer

Scott Conway On

OpenCSV merely parses the file or strings; it does no validation. As long as it can parse the input with the given parameters, it throws no errors. It basically assumes that the input is valid.
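Because the parser itself does not validate, one option is to check each raw line before handing it to openCSV. The following is only a minimal sketch of such a pre-check in plain Java, not part of openCSV: the class name Rfc4180QuoteCheck and the method isValidLine are hypothetical, and it assumes one record per line (no line breaks inside quoted fields). It rejects a line when a quote character appears inside an unquoted field, when a quoted field is never closed, or when text follows a closing quote.

```java
/**
 * Hypothetical helper (not part of openCSV): checks one raw CSV line
 * against the quoting rules of RFC 4180 before it is parsed.
 * Assumption: each record fits on a single line.
 */
public final class Rfc4180QuoteCheck {

    private Rfc4180QuoteCheck() {
    }

    /**
     * Returns true if every quoted field is properly closed and no
     * quote character appears inside an unquoted field.
     */
    public static boolean isValidLine(String line, char separator, char quote) {
        int n = line.length();
        int i = 0;
        // "i <= n" so that a trailing empty field is also visited.
        while (i <= n) {
            if (i < n && line.charAt(i) == quote) {
                // Quoted field: scan to the closing quote; a doubled
                // quote inside the field is an escaped quote.
                i++;
                boolean closed = false;
                while (i < n) {
                    if (line.charAt(i) == quote) {
                        if (i + 1 < n && line.charAt(i + 1) == quote) {
                            i += 2; // escaped quote, keep scanning
                            continue;
                        }
                        closed = true;
                        i++; // step past the closing quote
                        break;
                    }
                    i++;
                }
                if (!closed) {
                    return false; // quoted field never closed
                }
                if (i < n && line.charAt(i) != separator) {
                    return false; // text after the closing quote
                }
            } else {
                // Unquoted field: a quote anywhere in it is a violation.
                while (i < n && line.charAt(i) != separator) {
                    if (line.charAt(i) == quote) {
                        return false;
                    }
                    i++;
                }
            }
            i++; // step past the separator (or past the end of the line)
        }
        return true;
    }
}
```

With the row from the question, Rfc4180QuoteCheck.isValidLine("column1;\"column2\";column\"\"3", ';', '"') returns false, so the line could be rejected or logged before it reaches the parser.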

Are you using version 3.9 of opencsv with the RFC4180Parser? That should give you a different answer :)