File.WriteAllText is inserting a space after every character


File.WriteAllText is inserting a space after every letter and quotation mark.

Example:

Original File

"JobID" "ParentJobID"

New File

" J o b I D "    " P a r e n t J o b I D "

CODE

using System;
using System.IO;
using System.Collections.Generic;
using System.Linq;
using System.Text;

namespace ProcessOutputLogTransfer
{
    class Program
    {
        static void Main(string[] args)
        {
            string content = File.ReadAllText(@"C:\Documents and Settings\All Users\Application Data\Microsoft\Windows NT\MSFax\ActivityLog\OutboxLOG.txt");

            File.WriteAllText(@"C:\FAXLOG\OutboxLOG.txt", content, Encoding.UTF8);
        }
    }
}

There are 6 answers

Jon Skeet (best answer)

I don't think it's WriteAllText that's doing this. I believe it's ReadAllText, which defaults to reading using UTF-8 - I suspect your OutboxLOG.txt file is actually written in UTF-16, instead. Try this:

string inputPath = @"C:\Documents and Settings\All Users\Application Data\"
                 + @"Microsoft\Windows NT\MSFax\ActivityLog\OutboxLOG.txt";
string outputPath = @"C:\FAXLOG\OutboxLOG.txt";

string content = File.ReadAllText(inputPath, Encoding.Unicode);
File.WriteAllText(outputPath, content, Encoding.UTF8);

Philippe Leybaert

The original file is probably encoded as 16-bit Unicode (UTF-16), which .NET exposes as Encoding.Unicode.

Try reading it like this:

string content = File.ReadAllText(@"C:\Documents and Settings\All Users\Application Data\Microsoft\Windows NT\MSFax\ActivityLog\OutboxLOG.txt", Encoding.Unicode);

djdanlib

If you're just copying a file, use File.Copy instead.

That being said, this sounds like an encoding issue. Try using the File.ReadAllText method overload that includes the second argument, which specifies encoding. Make sure you're using the same encoding all the way through your process.
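
To illustrate the File.Copy suggestion, here is a minimal sketch. The class and method names (LogCopy, CopyVerbatim) are hypothetical, and passing true for the overwrite argument is an assumption about what the asker wants; the paths are the ones from the question.

```csharp
using System.IO;

static class LogCopy
{
    // Copies the file byte for byte; no decoding or re-encoding takes place.
    public static void CopyVerbatim(string sourcePath, string destinationPath)
    {
        // Assumption: overwrite = true, so an existing destination file
        // is replaced instead of causing an IOException.
        File.Copy(sourcePath, destinationPath, true);
    }

    static void Main()
    {
        // Paths taken from the question.
        CopyVerbatim(
            @"C:\Documents and Settings\All Users\Application Data\Microsoft\Windows NT\MSFax\ActivityLog\OutboxLOG.txt",
            @"C:\FAXLOG\OutboxLOG.txt");
    }
}
```

Note that a plain copy keeps the source file's encoding. If the output really has to be UTF-8, the file still needs to be read with the correct encoding and written back out.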

Jon

File.WriteAllText is certainly not so egregiously buggy; if it were, people would have already noticed.

The immediate problem here is that ReadAllText does not correctly detect the encoding of your input file. This method is documented to detect encodings based on the presence of BOMs, and the documentation says that encoding formats UTF-8 and UTF-32 (both big-endian and little-endian) can be detected.

The underlying issue is that you cannot treat a file as plain "text" without knowing its encoding. Automatic detection is unreliable, so for guaranteed results you need to know the encoding that was used. Call the overload of ReadAllText that takes an encoding parameter, pass the correct encoding, and the problem will be solved.
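
The effect can be reproduced without touching the file system. The sketch below assumes the log is UTF-16 little-endian without a BOM (the question does not confirm this); the EncodingDemo class and its Decode helper are hypothetical names. Under that assumption, the "spaces" in the output are most likely NUL characters that some editors display as blanks.

```csharp
using System;
using System.Text;

static class EncodingDemo
{
    // Decodes raw bytes with the given encoding.
    public static string Decode(byte[] bytes, Encoding encoding)
    {
        return encoding.GetString(bytes);
    }

    static void Main()
    {
        // UTF-16 LE bytes for "JobID"; GetBytes does not emit a BOM.
        byte[] utf16 = Encoding.Unicode.GetBytes("JobID");

        // Wrong: every 0x00 high byte becomes its own U+0000 character.
        string wrong = Decode(utf16, Encoding.UTF8);
        // Right: each two-byte code unit becomes one character.
        string right = Decode(utf16, Encoding.Unicode);

        Console.WriteLine(wrong.Length);              // 10
        Console.WriteLine(wrong.Replace('\0', ' '));  // J o b I D (plus a trailing space)
        Console.WriteLine(right);                     // JobID
    }
}
```

Decoding the same bytes with Encoding.Unicode restores the original string, which is why passing the encoding to ReadAllText fixes the output.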

MethodMan

Why not use ReadAllLines instead of ReadAllText? Would that work for you?

dgvid

Try this:

string content = File.ReadAllText(@"C:\Documents and Settings\All Users\Application Data\Microsoft\Windows NT\MSFax\ActivityLog\OutboxLOG.txt",
                                  System.Text.Encoding.Unicode);