Wide strings with accents are not written to the output


When I write a string containing accented characters, it doesn't show up in the file "FAKE.txt" (UTF-16 encoding):

      std::wifstream ifFake("FAKE.txt", std::ios::binary);
      ifFake.imbue(std::locale(ifFake.getloc(),
         new std::codecvt_utf16<wchar_t, 0x10ffff, std::consume_header>));
      if (!ifFake)
      {
         std::wofstream ofFake("FAKE.txt", std::ios::binary);
         ofFake << L"toc" << std::endl;
         ofFake << L"salut" << std::endl;
         ofFake << L"autre" << std::endl;
         ofFake << L"êtres" << std::endl;
         ofFake << L"âpres" << std::endl;
         ofFake << L"bêtes" << std::endl;
      }

Result (FAKE.txt):

      toc
      salut
      autre

The accented words are not written (a stream error, I guess).

The program was compiled with g++ and the source file encoding is UTF-8.

I noticed the same behavior with console output.

How can I fix that?

Danh On BEST ANSWER

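Because you didn't imbue a locale on ofFake. Only ifFake gets the UTF-16 facet, so ofFake still uses the default locale, which apparently cannot convert the accented characters. The stream then enters an error state and every later write is ignored.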

The code below should work:

  // needs <fstream>, <locale> and <codecvt>
  std::wofstream ofFake("FAKE.txt", std::ios::binary);
  // imbue the UTF-16 facet on the output stream as well, before writing
  ofFake.imbue(std::locale(ofFake.getloc(),
               new std::codecvt_utf16<wchar_t, 0x10ffff, std::generate_header>));
  ofFake << L"toc" << std::endl;
  ofFake << L"salut" << std::endl;
  ofFake << L"autre" << std::endl;
  ofFake << L"êtres" << std::endl;
  ofFake << L"âpres" << std::endl;
  ofFake << L"bêtes" << std::endl;
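To check the "stream error" guess, here is a minimal sketch that counts how many lines a wide stream accepts before it goes into an error state. The helper names write_lines and write_lines_utf16 are made up for this illustration, and the accented characters are written as \u escapes so the example does not depend on the source file encoding. Note that <codecvt> is deprecated since C++17, so expect compiler warnings.

```cpp
#include <codecvt>
#include <cstddef>
#include <fstream>
#include <locale>
#include <ostream>
#include <string>
#include <vector>

// Hypothetical helper: writes each line followed by std::endl and returns
// how many lines were accepted before the stream entered an error state.
std::size_t write_lines(std::wostream& out, const std::vector<std::wstring>& lines)
{
    std::size_t written = 0;
    for (const std::wstring& line : lines) {
        out << line << std::endl;  // std::endl flushes, so a conversion error surfaces here
        if (!out)                  // failbit or badbit set, e.g. a character could not be converted
            break;
        ++written;
    }
    return written;
}

// Hypothetical helper: opens `path` and imbues the UTF-16 facet used in the
// answer before any output is done.
std::size_t write_lines_utf16(const std::string& path, const std::vector<std::wstring>& lines)
{
    std::wofstream out(path, std::ios::binary);
    out.imbue(std::locale(out.getloc(),
        new std::codecvt_utf16<wchar_t, 0x10ffff, std::generate_header>));
    return write_lines(out, lines);
}
```

Calling write_lines_utf16 with the six words from the question should report all six lines as written. Passing a std::wofstream with no imbued facet to write_lines would, on a setup like the asker's, presumably stop after the third line, which matches the truncated FAKE.txt. The exact behavior of the default locale depends on the platform and standard library.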

Note, however, that with the codecvt_utf16 facet only the MSVC++ build seems to produce a proper UTF-16 encoded file. The g++ build seems to produce a UTF-8 encoded file with a useless BOM.

Hence, I recommend using UTF-8 instead:

  std::wofstream ofFake("FAKE.txt", std::ios::binary);
  ofFake.imbue(std::locale(ofFake.getloc(), new std::codecvt_utf8<wchar_t>));
  ofFake << L"toc" << std::endl;
  ofFake << L"salut" << std::endl;
  ofFake << L"autre" << std::endl;
  ofFake << L"êtres" << std::endl;
  ofFake << L"âpres" << std::endl;
  ofFake << L"bêtes" << std::endl;
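To verify what actually ends up in the file, it can help to read it back as raw bytes. The following is a small sketch under the same assumptions as the snippet above. The helpers write_utf8 and read_bytes are hypothetical names, not part of any library, and the accented character is spelled as a \u escape to stay independent of the source encoding.

```cpp
#include <codecvt>
#include <fstream>
#include <iterator>
#include <locale>
#include <string>

// Hypothetical helper: writes `text` to `path` through a wide stream that
// converts to UTF-8 with std::codecvt_utf8. Returns true if the stream is
// still in a good state after flushing.
bool write_utf8(const std::string& path, const std::wstring& text)
{
    std::wofstream out(path, std::ios::binary);
    out.imbue(std::locale(out.getloc(), new std::codecvt_utf8<wchar_t>));
    out << text;
    out.flush();  // force the conversion now so errors show up in the state
    return out.good();
}

// Hypothetical helper: reads the whole file as raw bytes, with no character
// conversion, so the encoding on disk can be inspected.
std::string read_bytes(const std::string& path)
{
    std::ifstream in(path, std::ios::binary);
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}
```

For example, after write_utf8("FAKE.txt", L"\u00EAtres\n"), read_bytes("FAKE.txt") should start with the two bytes 0xC3 0xAA, which is the UTF-8 encoding of ê, followed by the plain ASCII bytes of "tres".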