How to read a file that contains both ANSI and UTF-8 encoded characters


I get a file from a third party. The file seems to contain both ANSI and UTF-8 encoded characters (not sure if my terminology is correct).

Changing the encoding in Notepad++ yields the following:

Notepad++ screenshot

So when using ANSI encoding, Employee2 is incorrect. And when using UTF-8 encoding, Employee1 is incorrect.

Is there a way in C# to specify two encodings for a single file?

Whichever encoding I set in C#, one of the two employees is incorrect:

using System.IO;
using System.Text;

string filetext = "";

// Employee1 is correct, Employee2 is wrong
filetext = File.ReadAllText(@"C:\TESTFILE.txt", Encoding.GetEncoding("ISO-8859-1"));
filetext = File.ReadAllText(@"C:\TESTFILE.txt", Encoding.GetEncoding("Windows-1252"));
filetext = File.ReadAllText(@"C:\TESTFILE.txt", Encoding.UTF7); // obsolete in .NET 5 and later
filetext = File.ReadAllText(@"C:\TESTFILE.txt", Encoding.Default);

// Employee1 is wrong, Employee2 is correct
filetext = File.ReadAllText(@"C:\TESTFILE.txt", Encoding.UTF8);

Has anyone else encountered this and found a solution?
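
File.ReadAllText accepts only one encoding per call, so one possible workaround is to read the raw bytes and choose the encoding per line. The following is a minimal sketch, not a confirmed solution. It assumes that each line is written entirely in one encoding, and that the lines that are not UTF-8 are ISO-8859-1 (which decoded Employee1 correctly above). The names MixedEncodingReader, DecodeMixed, and ReadFile are hypothetical helpers made up for this illustration.

```csharp
using System;
using System.IO;
using System.Text;

public static class MixedEncodingReader
{
    // Strict UTF-8 decoder: throws DecoderFallbackException on invalid
    // byte sequences instead of silently substituting U+FFFD.
    private static readonly Encoding StrictUtf8 =
        new UTF8Encoding(encoderShouldEmitUTF8Identifier: false, throwOnInvalidBytes: true);

    // Assumption: lines that are not valid UTF-8 are ISO-8859-1.
    // Swap in Windows-1252 here if that matches the file better.
    private static readonly Encoding Fallback = Encoding.GetEncoding("ISO-8859-1");

    // Decodes one slice of the buffer: UTF-8 if it is valid, else the fallback.
    private static string DecodeSlice(byte[] data, int offset, int count)
    {
        try
        {
            return StrictUtf8.GetString(data, offset, count);
        }
        catch (DecoderFallbackException)
        {
            return Fallback.GetString(data, offset, count);
        }
    }

    // Splits the buffer on the byte 0x0A ('\n') and decodes each line separately.
    // 0x0A never occurs inside a multi-byte UTF-8 sequence, so the split is safe
    // for both encodings.
    public static string DecodeMixed(byte[] data)
    {
        var result = new StringBuilder();
        int start = 0;
        for (int i = 0; i <= data.Length; i++)
        {
            if (i == data.Length || data[i] == (byte)'\n')
            {
                int end = Math.Min(i + 1, data.Length); // keep the '\n' with its line
                if (end > start)
                {
                    result.Append(DecodeSlice(data, start, end - start));
                }
                start = end;
            }
        }
        return result.ToString();
    }

    // Convenience wrapper that reads the whole file as bytes first.
    public static string ReadFile(string path)
    {
        return DecodeMixed(File.ReadAllBytes(path));
    }
}
```

With this sketch, the call would be `string filetext = MixedEncodingReader.ReadFile(@"C:\TESTFILE.txt");`. Two caveats: an ISO-8859-1 line whose bytes happen to form valid UTF-8 would be decoded as UTF-8, and if both encodings are mixed within a single line, the split would have to be finer than one line.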
