Why am I getting these invalid characters before my file data?


[Image: program output showing invalid characters before the file data]

I am trying to read a file into a string, either with the getline function or with fileContents.assign( (istreambuf_iterator<char>(myFile)), (istreambuf_iterator<char>()));. Either way gives me the output shown in the image above.

First way:

    string fileContents;
    ifstream myFile("textFile.txt");
    while (getline(myFile, fileContents))
        cout << fileContents << endl;

Alternate way:

    string fileContents;
    ifstream myFile(fileName.c_str());
    if (myFile.is_open())
    {
        fileContents.assign((istreambuf_iterator<char>(myFile)),
                            (istreambuf_iterator<char>()));
        cout << fileContents;
    }

There are 2 answers

7
TheCppZoo On

The file itself begins with those characters. They are most likely a byte order mark (BOM), which tells you what the encoding of the file is.

You probably cannot see them in Windows Notepad because Notepad hides the encoding bytes. Open the file in a hex editor, or a text editor with a hex view, and you will see those bytes at the start of the file.
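If no such editor is at hand, a few lines of C++ can show the same thing. The sketch below is only an illustration: hexPrefix is a hypothetical helper name, not something from the question, and it assumes the file's bytes have already been read into a std::string (for example with the istreambuf_iterator approach from the question).

```cpp
#include <cstddef>
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: formats the first `count` bytes of `bytes`
// as space-separated, two-digit uppercase hex values.
std::string hexPrefix(const std::string& bytes, std::size_t count) {
    std::ostringstream out;
    out << std::hex << std::uppercase << std::setfill('0');
    for (std::size_t i = 0; i < bytes.size() && i < count; ++i) {
        if (i > 0) out << ' ';
        // Go through unsigned char so bytes >= 0x80 are not sign-extended.
        out << std::setw(2)
            << static_cast<int>(static_cast<unsigned char>(bytes[i]));
    }
    return out.str();
}
```

Calling hexPrefix(fileContents, 3) right after reading the file and printing the result would show whether the data really starts with encoding bytes rather than text.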

0
Remy Lebeau On

Your file starts with a UTF-8 BOM (bytes 0xEF 0xBB 0xBF). You are reading the file's raw bytes as-is and outputting them to a display that is using an OEM font for codepage 437, in which those three bytes are drawn as ∩╗┐. To handle text files properly, especially Unicode-encoded text files, you need to read the first few bytes and check for a BOM (there are several you can look for). If one is detected, seek past it and interpret the remaining bytes of the file in the specified encoding, in this case UTF-8.
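As a rough sketch of that check, limited to the UTF-8 case only: the helper names stripUtf8Bom and readTextFile below are made up for illustration, and the code assumes the whole file fits comfortably in memory. It does not look for the UTF-16 or UTF-32 BOMs and does not convert the text for the console.

```cpp
#include <fstream>
#include <iterator>
#include <string>
#include <utility>

// Hypothetical helper: drops a leading UTF-8 BOM (EF BB BF) if present.
std::string stripUtf8Bom(std::string bytes) {
    static const std::string bom = "\xEF\xBB\xBF";
    if (bytes.compare(0, bom.size(), bom) == 0) {
        bytes.erase(0, bom.size());
    }
    return bytes;
}

// Hypothetical helper: reads the whole file as raw bytes, then removes
// the BOM. Returns an empty string if the file cannot be opened.
std::string readTextFile(const std::string& path) {
    std::ifstream in(path.c_str(), std::ios::binary);
    std::string bytes((std::istreambuf_iterator<char>(in)),
                      std::istreambuf_iterator<char>());
    return stripUtf8Bom(std::move(bytes));
}
```

With this, cout << readTextFile("textFile.txt"); would print the file without the three leading bytes. Any non-ASCII UTF-8 text later in the file would still be shown incorrectly on a codepage 437 console, since this only skips the BOM.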