How to keep quotes when parsing a CSV file?


I am using Microsoft.VisualBasic.FileIO.TextFieldParser to read a CSV file, edit it, and then write it back out.

The problem is the quotes are not being kept after parsing.

I tried using parser.HasFieldsEnclosedInQuotes = true; but it does not seem to keep the quotes for some reason.

This breaks the output when a field is enclosed in quotes and contains a comma, for example:

Before

 "some, field" 

After

 some, field 

The result is then read as two separate fields.

Here is my method:

public static void CleanStaffFile()
    {
        String path = @"C:\file.csv";
        String dpath = String.Format(@"C:\file_{0}.csv",DateTime.Now.ToString("MMddyyHHmmss"));
        List<String> lines = new List<String>();

        if (File.Exists(path))
        {
            using (TextFieldParser parser = new TextFieldParser(path))
            {
                parser.HasFieldsEnclosedInQuotes = true;
                parser.Delimiters = new string[] { "," };

                while (!parser.EndOfData)
                {
                    string[] parts = parser.ReadFields();

                    if (parts == null)
                    {
                        break;
                    }

                    if ((parts[12] != "") && (parts[12] != "*,116"))
                    {
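                        // Guard against values shorter than 3 characters, which would make Substring throw.
                        parts[12] = parts[12].Substring(0, Math.Min(3, parts[12].Length));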
                    }
                    else
                    {
                        parts[12] = "0";
                    }

                    lines.Add(string.Join(",", parts));
                }
            }

            using (StreamWriter writer = new StreamWriter(dpath, false))
            {
                foreach (String line in lines)
                    writer.WriteLine(line);
            }

        }

        MessageBox.Show("CSV file successfully processed :\n");
    }

There are 2 answers

3
Tim Schmelter (best answer)

So you want to have quotes after you have modified it at string.Join(",", parts)? Then it's easy since only fields which contain the separator were wrapped in quotes before. Just add them again before the String.Join.

So before (and desired):

"some, field" 

after (not desired):

some, field 

This should work:

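// Select requires: using System.Linq;
string[] fields = parser.ReadFields();
// insert your logic here ....
// Re-wrap every field that contains the separator in quotes.
var newFields = fields
    .Select(f => f.Contains(",") ? string.Format("\"{0}\"", f) : f);
lines.Add(string.Join(",", newFields));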

Edit

I would like to keep quotes even if doesn't contain a comma

Then it's even easier:

var newFields = fields.Select(f => string.Format("\"{0}\"", f));
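
Both variants above assume that a field never contains a quote character itself. As a minimal sketch of a more defensive version, the helper below also doubles embedded quotes, which is the convention described in RFC 4180. The names CsvQuoting, QuoteField and JoinFields are hypothetical and not part of TextFieldParser or the original code.

```csharp
using System;
using System.Linq;

public static class CsvQuoting
{
    // Hypothetical helper: wraps a field in quotes when it needs them
    // (or always, if requested) and doubles any embedded quotes.
    public static string QuoteField(string field, bool always = false)
    {
        bool needsQuotes = always
            || field.Contains(",")
            || field.Contains("\"")
            || field.Contains("\n")
            || field.Contains("\r");

        return needsQuotes
            ? "\"" + field.Replace("\"", "\"\"") + "\""
            : field;
    }

    // Joins the fields of one record into a single CSV line.
    public static string JoinFields(string[] fields, bool always = false)
    {
        return string.Join(",", fields.Select(f => QuoteField(f, always)));
    }
}
```

In the question's loop this would replace the plain join, for example lines.Add(CsvQuoting.JoinFields(parts)); passing always: true gives the "quote everything" behaviour.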
0
w5l

The TextFieldParser.HasFieldsEnclosedInQuotes property is used as follows, from the MSDN page:

If the property is True, the parser assumes that fields are enclosed in quotation marks (" ") and may contain line endings.

If a field is enclosed in quotation marks, for example, abc, "field2a,field2b", field3 and this property is True, then all text enclosed in quotation marks will be returned as is; this example would return abc|field2a,field2b|field3. Setting this property to False would make this example return abc|"field2a|field2b"|field3.

The quotes will indicate the start and end of a field, which may then contain the character(s) used to normally separate fields. If your data itself has quotes, you need to set HasFieldsEnclosedInQuotes to false.
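
To see the difference for yourself, a small sketch along these lines parses the documentation's sample line with the property set both ways. It assumes the TextFieldParser constructor that accepts a TextReader; the QuoteFlagDemo class and its Parse method are made up for illustration.

```csharp
using System;
using System.IO;
using Microsoft.VisualBasic.FileIO;

public static class QuoteFlagDemo
{
    // Parses a single line and returns its fields, with quote handling on or off.
    public static string[] Parse(string line, bool quoted)
    {
        using (var parser = new TextFieldParser(new StringReader(line)))
        {
            parser.TextFieldType = FieldType.Delimited;
            parser.SetDelimiters(",");
            parser.HasFieldsEnclosedInQuotes = quoted;
            return parser.ReadFields();
        }
    }

    public static void Main()
    {
        // Sample line taken from the documentation excerpt quoted above.
        string line = "abc, \"field2a,field2b\", field3";

        Console.WriteLine(string.Join("|", Parse(line, true)));
        Console.WriteLine(string.Join("|", Parse(line, false)));
    }
}
```

The two printed lines should correspond to the two results described in the documentation excerpt: three fields with the quotes consumed in the first case, four fields with the quotes left in the text in the second.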

If your data fields can contain both separators and quotes, you will need to start escaping quotes before parsing, which is a problem. Basically, you're going beyond the capabilities of a simple CSV file.