I am parsing a .csv file that has two columns. I am trying to parse each row with the Boost tokenizer, and one of the fields in a row is in double quotes (e.g. 1,"test"). After tokenizing, I get the field without the double quotes in tok (1,test).
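#include <boost/tokenizer.hpp>
#include <string>
#include <vector>
using namespace std;
using namespace boost;

typedef tokenizer<escaped_list_separator<char>> Tokenizer;

// inputFile (an input stream) and line (a string) are declared elsewhere
if (getline(inputFile, line))
{
    Tokenizer tok(line);
    vector<string> vec;
    vec.assign(tok.begin(), tok.end());
    // Here *(vec.begin() + 1) prints the string test, without the double quotes
}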
Is there any way to get this second field with the double quotes?
The quotes are a presentation thing. Once you parse/tokenize the data, you want the unescaped data back.
The quoted/escaped representation is to protect special characters in your data in transit only (to prevent them from interfering with your protocol¹).
Once you read it back, it is no longer in transit, and "keeping" the escapes or quotes (or whatever other artefacts come with your protocol¹) would be an error. In fact, it is a frequent source of bugs and, not infrequently, of security vulnerabilities².
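If the quotes are needed again, for example when writing the row back out as CSV, they can be re-added at output time instead of being kept in the parsed data. A minimal sketch, assuming the default `escaped_list_separator` conventions (backslash as the escape character, double quote as the quote character); `quote_field` is a hypothetical helper, not part of Boost:

```cpp
#include <string>

// Hypothetical helper (not a Boost function): wrap an already-parsed value
// in double quotes for output, backslash-escaping the characters that the
// default escaped_list_separator settings treat as special.
std::string quote_field(const std::string& value)
{
    std::string out = "\"";
    for (char c : value) {
        if (c == '\n') {
            out += "\\n";        // a newline is written as the two characters \n
            continue;
        }
        if (c == '"' || c == '\\')
            out += '\\';         // escape the quote and the escape character
        out += c;
    }
    out += '"';
    return out;
}
```

With the `vec` from the question, `cout << vec[0] << ',' << quote_field(vec[1])` would print `1,"test"`, while the value stored in `vec` stays unquoted.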
Samples
- `a` or `"a"` corresponds to a value of `a`
- `"\""` corresponds to `"`
- `"\\\""` corresponds to `\"`
- `"\"` is incomplete (the quoted construct is not closed)

The important thing is that your values roundtrip without loss of information. So, parsing `"a"` as the value `"a"` creates the conceptual error that converting it back to quoted-escaped format would suddenly look like `"\"a\""`, which is an entirely different thing!

¹ presentation format or transport protocol
² most commonly, code injection
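The sample mappings can be checked with a small decoder. This is only a sketch that follows the conventions described in the samples (double quotes delimit, backslash escapes the next character); `parse_field` is a hypothetical helper and does not reproduce the exact behaviour of Boost or any other library, for instance it accepts any character after a backslash:

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: decode one field written in quoted-escaped form.
// Returns false when the input is incomplete (unclosed quote or a trailing
// escape character); otherwise stores the unescaped value and returns true.
bool parse_field(const std::string& text, std::string& value)
{
    value.clear();
    bool in_quotes = false;
    for (std::size_t i = 0; i < text.size(); ++i) {
        char c = text[i];
        if (c == '\\') {
            if (++i == text.size())
                return false;          // dangling escape character
            value += (text[i] == 'n') ? '\n' : text[i];
        } else if (c == '"') {
            in_quotes = !in_quotes;    // quotes delimit; they are not data
        } else {
            value += c;
        }
    }
    return !in_quotes;                 // false if the quote was never closed
}
```

Feeding it the four samples yields `a`, `"`, `\"`, and a failure for the unclosed `"\"`, and none of the decoded values carries the surrounding quotes.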