Remove '�' from files with different encodings when reading in C#


I can't control which encoding our clients use when they save a file, and when it's ASCII the file may contain characters that fail to decode and then show up as '�'. How can I remove these '�' characters after the file is read?

I am reading the file with the line below, and for each column I would like to replace that character with a space in C# .NET.

   using (var parser = new TextFieldParser("", Encoding.UTF8))

There is 1 answer

Ry- (best answer)

Looks like you can create a UTF-8 Encoding with a custom error replacement:

var encoding = Encoding.GetEncoding(
    "UTF-8",
    null,
    new DecoderReplacementFallback(string.Empty));

using (var parser = new TextFieldParser("", encoding)) {
    ⋮
}
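
To see what the decoder fallback does in isolation, here is a minimal sketch that decodes a byte array directly instead of going through TextFieldParser. The class and method names (FallbackDemo, DecodeLenient) are made up for illustration, and it passes an EncoderReplacementFallback rather than null as an assumption, so the example does not depend on whether null is accepted.

```csharp
using System;
using System.Text;

public static class FallbackDemo
{
    // Hypothetical helper: decodes bytes as UTF-8 and substitutes `replacement`
    // for every invalid byte sequence instead of the default U+FFFD ('�').
    public static string DecodeLenient(byte[] bytes, string replacement)
    {
        var encoding = Encoding.GetEncoding(
            "UTF-8",
            // Assumption: a non-null encoder fallback; it only matters when writing.
            new EncoderReplacementFallback(replacement),
            new DecoderReplacementFallback(replacement));

        return encoding.GetString(bytes);
    }
}
```

For example, the bytes { 0x63, 0x61, 0x66, 0xE9, 0x73 } are "cafés" saved as Windows-1252, where the lone 0xE9 is not valid UTF-8. FallbackDemo.DecodeLenient(bytes, string.Empty) returns "cafs", and FallbackDemo.DecodeLenient(bytes, " ") returns "caf s", which is closer to the whitespace replacement the question asks for.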

I don’t know whether the encoder fallback is allowed to be null. If it isn’t, replace it with new EncoderReplacementFallback(string.Empty)!
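
If you would rather keep Encoding.UTF8 and clean up after the read, as the question literally asks, another option is to replace U+FFFD in each field once it has been parsed. This is only a sketch: FieldCleaner and ReplaceInvalid are hypothetical names, and it assumes the fields arrive as a string[] such as the one returned by parser.ReadFields().

```csharp
using System;
using System.Linq;

public static class FieldCleaner
{
    // U+FFFD is the character the decoder emits for bytes it cannot decode ('�').
    private const char ReplacementChar = '\uFFFD';

    // Hypothetical helper: returns a copy of the fields with every U+FFFD
    // replaced by `substitute` (a space by default).
    public static string[] ReplaceInvalid(string[] fields, char substitute = ' ')
    {
        return fields
            .Select(field => field.Replace(ReplacementChar, substitute))
            .ToArray();
    }
}
```

Inside the read loop this would look like string[] fields = FieldCleaner.ReplaceInvalid(parser.ReadFields());. Note that this cannot tell a decoding failure apart from a U+FFFD that was genuinely present in the file, which the custom decoder fallback avoids.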