What's the point of using std::ios_base::binary?


I had an issue reading a Linux file under Windows. Here is the discussion of that issue: Using fstream::seekg under windows on a file created under Unix.

The issue was worked around by opening the text file with std::ios_base::binary specified.

But what is the actual point of this mode? Even when it is specified, you can still work with your file as a text file (writing with mystream << "Hello World" << std::endl and reading with std::getline).

Under Windows, the only difference I could notice is that mystream << "Hello World" << std::endl uses:

  • 0x0D 0x0A as line separator if std::ios_base::binary was not specified (carriage return followed by line feed)
  • 0x0A as line separator if std::ios_base::binary was specified (line feed only)
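A minimal sketch for checking this observation: it writes the same line in either mode and reads the stored bytes back in binary mode, so nothing is translated on the way in. The helper names write_line and raw_bytes are made up for illustration; on Unix-like systems both files are expected to come out identical.

```cpp
#include <fstream>
#include <iterator>
#include <string>

// Hypothetical helper: writes one line to `path`, optionally in binary mode.
void write_line(const std::string& path, bool binary)
{
    std::ios_base::openmode mode = std::ios_base::out | std::ios_base::trunc;
    if (binary)
        mode |= std::ios_base::binary;
    std::ofstream out(path, mode);
    out << "Hello World" << std::endl;
}

// Hypothetical helper: returns the file's bytes exactly as stored on disk.
std::string raw_bytes(const std::string& path)
{
    std::ifstream in(path, std::ios_base::in | std::ios_base::binary);
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}
```

Comparing raw_bytes for a file written with binary set and one written without it should show whether the runtime inserted a 0x0D before the 0x0A.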

Notepad does not display the line breaks correctly when opening files generated with std::ios_base::binary. Better editors like vi or WordPad do show them.

Is that really the only difference between files generated with and without std::ios_base::binary? The documentation says "Consider stream as binary rather than text."; what does this mean in the end?

Is it safe to always set std::ios_base::binary if I don't care about opening the file in Notepad and want fstream::seekg to always work?
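To make the seekg concern concrete, here is a small sketch of the usage in question. It assumes the file is opened in binary mode, so an offset is a plain byte count and a position computed by hand lands where expected. The function name read_line_at is hypothetical.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: seeks to a byte offset and reads the rest of that line.
// The file is opened in binary mode so the offset is an exact byte count.
std::string read_line_at(const std::string& path, std::streamoff offset)
{
    std::ifstream in(path, std::ios_base::in | std::ios_base::binary);
    in.seekg(offset, std::ios_base::beg);
    std::string line;
    std::getline(in, line);
    return line;
}
```

For a file whose bytes are "first\nsecond\n", offset 6 is the start of the second line regardless of the platform the program runs on.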


There are 3 answers

3
James Kanze On BEST ANSWER

The differences between binary and text modes are implementation-defined, but they only concern the lowest level: they do not change the meaning of things like << and >> (which insert and extract textual data). Also, formally, writing non-printable characters other than a few (such as '\n') is undefined behavior if the file is in text mode.

For the most common OSs: under Unix, there is no distinction; both modes are identical. Under Windows, an internal '\n' is mapped to the two-character sequence CR, LF (0x0D, 0x0A) externally, and 0x1A is interpreted as end of file when reading. In more exotic (and mostly extinct) OSs, however, text and binary files could be represented by entirely different file types at the OS level, and it could be impossible to read a file in text mode if it was written in binary mode, and vice versa. Or you could see something different: extra white space at the end of a line, or no '\n' in binary mode.
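The 0x1A behavior can be probed with a sketch like the following. The helper names count_chars and write_sample are invented for this example, and the sample file is written in binary mode so that its three bytes reach the disk untouched.

```cpp
#include <cstddef>
#include <fstream>
#include <string>

// Hypothetical helper: counts how many characters can be read from `path`
// when it is opened with the given mode.
std::size_t count_chars(const std::string& path, std::ios_base::openmode mode)
{
    std::ifstream in(path, mode);
    std::size_t n = 0;
    char c;
    while (in.get(c))
        ++n;
    return n;
}

// Hypothetical helper: writes 'a', 0x1A, 'b' without any translation.
void write_sample(const std::string& path)
{
    std::ofstream out(path, std::ios_base::out | std::ios_base::binary);
    out.put('a').put('\x1A').put('b');
}
```

With std::ios_base::in | std::ios_base::binary the count should be 3 on any platform. With plain std::ios_base::in, a Windows runtime that treats 0x1A as end of file would be expected to stop early, while Unix should still report 3.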

With regard to always setting std::ios_base::binary: my policy for portable files is to decide exactly how I want them formatted, set binary, and output what I want. That is often CR, LF rather than just LF, since that is the network standard. On the other hand, most Windows programs have no problem with just LF, whereas I've encountered more than a few Unix programs which have problems with CR, LF; this argues for systematically using just LF (which is easier, too). Doing things this way means that I get the same results regardless of whether I'm running under Unix or under Windows.
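That policy can be sketched roughly as follows. This is an illustration only, not code from the answer: the function name write_lines and its eol parameter are assumptions made for the example.

```cpp
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: writes each line followed by an explicitly chosen
// terminator. Binary mode keeps the runtime from altering the terminator.
void write_lines(const std::string& path,
                 const std::vector<std::string>& lines,
                 const std::string& eol = "\n")
{
    std::ofstream out(path, std::ios_base::out | std::ios_base::binary);
    for (const std::string& line : lines)
        out << line << eol;
}
```

Passing "\r\n" as eol gives CR, LF on every platform, and the default gives plain LF on every platform, because the line ending is chosen by the caller rather than by the runtime. Note that std::endl is avoided here since it always inserts '\n'.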

1
Sebastian Redl On

The meaning of text stream vs binary stream is platform-specific and somewhat unpredictable.

But as far as popular platforms go, it's easy: On Linux and MacOS X, there is no difference. On Windows, the only difference is that internal \n is translated to \r\n in the external stream.

2
jpo38 On

I found (by losing two hours of work trying to understand what was going on) a situation where specifying std::ios_base::binary does make a huge difference.

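#include <fstream>
#include <vector>

int main()
{
    std::vector<char> data{ 0x01, 0x02, 0x0A, 0x0B };
    {
        // Binary mode: the bytes are written unchanged.
        std::fstream tfat;
        tfat.open( "binary", std::ios_base::out | std::ios_base::binary );
        tfat.write( data.data(), data.size() );
        tfat.close();
    }
    {
        // Text mode: under Windows, 0x0A is written as 0x0D 0x0A.
        std::fstream tfat;
        tfat.open( "not_binary", std::ios_base::out );
        tfat.write( data.data(), data.size() );
        tfat.close();
    }
}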

The "binary" file then contains 4 bytes: 0x01, 0x02, 0x0A, 0x0B. But the "not_binary" file contains 5 bytes: 0x01, 0x02, 0x0D, 0x0A, 0x0B.

0x0D (\r) was inserted before 0x0A (\n). Since I wrote 4 bytes, I expected the file to contain 4 bytes in the end.

This made me realize why std::ios_base::binary must be used when writing raw data to a file, even when not using the << operator.