I've been implementing a codecvt facet for handling indentation of output streams. It can be used like this and works fine:
    std::cout << indenter::push << "I'm indented" << indenter::pop << "\n I'm not...";
However, while I can imbue a std::codecvt into any std::ostream, I was very confused when I found out that my code worked with std::cout as well as std::ofstream, but not, for example, with std::ostringstream, even though all of them inherit from the base class std::ostream.
The facet is constructed normally, the code compiles, and it doesn't throw any exceptions... It's just that none of the member functions of the std::codecvt are called.
For me that is very confusing, and I had to spend a lot of time figuring out that std::codecvt won't do anything on streams that are not file I/O streams.
Is there any reason std::codecvt is not used by all classes that inherit from std::ostream?
Furthermore does anyone have an idea on which structs I could fall back on to implement the indenter?
Edit: this is the part of the language I'm referring to:
All file I/O operations performed through std::basic_fstream use the std::codecvt<CharT, char, std::mbstate_t> facet of the locale imbued in the stream.
Source: https://en.cppreference.com/w/cpp/locale/codecvt
Update 1:
I've made a small example illustrating my problem:
    #include <iostream>
    #include <locale>
    #include <fstream>
    #include <sstream>

    // Counts how often the facet's do_out() is invoked.
    static auto invocation_counter = 0u;

    struct custom_facet : std::codecvt<char, char, std::mbstate_t>
    {
        using parent_t = std::codecvt<char, char, std::mbstate_t>;
        custom_facet() : parent_t(std::size_t { 0u }) {}
        using parent_t::intern_type;
        using parent_t::extern_type;
        using parent_t::state_type;

        // Copies the input to the output unchanged and records the call.
        virtual std::codecvt_base::result do_out (state_type& state, const intern_type* from, const intern_type* from_end, const intern_type*& from_next,
                                                   extern_type* to, extern_type* to_end, extern_type*& to_next) const override
        {
            while (from < from_end && to < to_end)
            {
                *to = *from;
                to++;
                from++;
            }
            invocation_counter++;
            from_next = from;
            to_next = to;
            return std::codecvt_base::noconv;
        }

        // Returning false tells the stream buffer that do_out() must be called.
        virtual bool do_always_noconv() const noexcept override
        {
            return false;
        }
    };

    // Manipulator that imbues the stream with a locale containing the facet.
    std::ostream& imbueFacet (std::ostream& ostream)
    {
        ostream.imbue(std::locale { ostream.getloc(), new custom_facet{} });
        return ostream;
    }

    int main()
    {
        std::ios::sync_with_stdio(false);
        std::cout << "invocation_counter = " << invocation_counter << "\n";
        {
            auto ofstream = std::ofstream { "testFile.txt" };
            ofstream << imbueFacet << "test\n";
        }
        std::cout << "invocation_counter = " << invocation_counter << "\n";
        {
            auto osstream = std::ostringstream {};
            osstream << imbueFacet << "test\n";
        }
        std::cout << "invocation_counter = " << invocation_counter << "\n";
    }
I would expect invocation_counter to increase after streaming into the std::ostringstream, but it doesn't.
Update 2:
After more research I found out that I could use std::wbuffer_convert. To quote https://en.cppreference.com/w/cpp/locale/wbuffer_convert
std::wbuffer_convert is a wrapper over stream buffer of type std::basic_streambuf<char> which gives it the appearance of std::basic_streambuf<Elem>. All I/O performed through std::wbuffer_convert undergoes character conversion as defined by the facet Codecvt. [...] This class template makes the implicit character conversion functionality of std::basic_filebuf available for any std::basic_streambuf.
This way I can apply a facet to a std::ostringstream:
    auto osstream = std::ostringstream {};
    osstream << "test\n";
    auto facet = custom_facet{};
    std::wstring_convert<custom_facet, char> conv;
    auto str = conv.to_bytes(osstream.str());
However, I lose the ability to chain facets using the streaming operator <<.
This confuses me even more as to why std::codecvt is not implicitly used by ALL output streams. All output streams write through a std::basic_streambuf, whose interface seems suitable for using std::codecvt, which just works on an input and an output character sequence, both fully available in std::basic_streambuf.
So why is the handling of std::codecvt implemented in std::basic_filebuf instead of std::basic_streambuf? std::basic_filebuf inherits from std::basic_streambuf after all...
Either I have some fundamental misunderstanding of how streams work in C++, or std::codecvt is poorly integrated in the standard. Maybe this is why it is marked as deprecated?
The std::codecvt facet was originally intended to handle I/O conversions between disk and memory character representation. Quoted from paragraph 39.4.6 of Bjarne Stroustrup's The C++ Programming Language, fourth edition:
The intended purpose was thus to use std::codecvt only for adapting characters between file (disk) and memory, which partly answers your question:
From the docs we see that:
Which then answers the question why std::ofstream (uses a file-based stream buffer) and std::cout (linked to the standard output FILE stream) invoke std::codecvt.
Now, to use the high-level std::ostream interface you need to provide an underlying streambuf. The std::ofstream provides a filebuf and the std::ostringstream provides a stringbuf (which is not linked to the use of std::codecvt). See this post on streams, which also highlights the following:
But, to invoke the character conversion functionality of a std::codecvt when you have a std::ostringstream, which is a std::ostream with an underlying std::basic_streambuf, you can use, as indicated in your post, the std::wbuffer_convert.
You have only used the std::wstring_convert in your second update and not the std::wbuffer_convert.
When using the std::wbuffer_convert you can wrap the original std::ostringstream with a std::ostream as follows:
Together with the complete example here, the output would be:
Conclusion
std::codecvt was intended for converting between disk and memory representation. That is why the std::codecvt implementation is only called with streams using an underlying filebuf, such as std::ofstream and std::cout. However, a stream using an underlying stringbuf can be wrapped using std::wbuffer_convert into a std::ostream instance, which would then invoke the underlying std::codecvt.