Is there a codecvt decoding scenario where the destination buffer needs space for more than one internal character?


When using std::codecvt's in method to decode an external byte sequence to an internal char sequence, is there a situation where the destination buffer of internal chars needs space for more than one internal char?

Here is some code for reference:

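// Assumed to be declared and initialized elsewhere:
// const std::locale& loc;
// mbstate_t state;
// const char *extern_buf_ptr;   // start of the undecoded external bytes
// const char *extern_buf_eptr;  // one past the last byte read so far
const std::codecvt<wchar_t, char, mbstate_t> *pcodecvt = &std::use_facet<std::codecvt<wchar_t, char, mbstate_t> >(loc);

wchar_t intern_char;  // destination buffer with room for exactly one internal character
wchar_t *tmp;         // receives to_next; not otherwise used
std::codecvt_base::result in_res = pcodecvt->in(state,
        extern_buf_ptr, extern_buf_eptr, extern_buf_ptr,
        &intern_char, &intern_char + 1, tmp);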

This is a simplification of some template code that I have written to decode bytes read individually from a Winsock SOCKET, where the user desires "unbuffered" input. Basically, with each iteration of a loop, a byte is read into the external buffer. The loop terminates when in_res is not std::codecvt_base::partial.

What I am wondering is: Is there a scenario where a call to in() would require space in the destination buffer for more than one internal character? I.e., is there a scenario that would make the above-described loop an infinite loop?
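
For concreteness, the byte-at-a-time loop described above might look roughly like the following. This is only a sketch under assumptions: `decode_one` and the `Codecvt` alias are hypothetical names, and a `std::string` stands in for the bytes that would be read one at a time from the socket. It stops when a character is produced, when `in()` reports `error` or `noconv`, or when the input runs out.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>
#include <locale>
#include <string>

using Codecvt = std::codecvt<wchar_t, char, std::mbstate_t>;

// Hypothetical helper, not part of the original code. Feeds bytes[pos...]
// to in() one byte at a time, as an unbuffered reader would, until one
// wchar_t has been produced, an error occurs, or the input runs out.
std::codecvt_base::result decode_one(const Codecvt& cvt, std::mbstate_t& state,
                                     const std::string& bytes, std::size_t& pos,
                                     wchar_t& out)
{
    assert(pos <= bytes.size());
    std::string pending;  // bytes read so far that in() has not consumed
    while (pos < bytes.size()) {
        pending.push_back(bytes[pos++]);  // stands in for a one-byte recv()

        const char* from = pending.data();
        const char* from_next = from;
        wchar_t* to_next = &out;
        const std::codecvt_base::result res =
            cvt.in(state, from, from + pending.size(), from_next,
                   &out, &out + 1, to_next);

        // Drop whatever in() consumed. A facet may keep partial input in
        // `state`, or leave from_next at the start of the incomplete sequence.
        pending.erase(0, static_cast<std::size_t>(from_next - from));

        if (to_next != &out)
            return std::codecvt_base::ok;  // one internal character produced
        if (res == std::codecvt_base::error || res == std::codecvt_base::noconv)
            return res;                    // nothing more to gain by looping
    }
    return std::codecvt_base::partial;     // input ended mid-character
}
```

Called repeatedly with the same `state` and `pos`, this yields one `wchar_t` per call. The open question is whether the `ok` branch is always reachable with a one-element destination.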

Best answer, by Cubbi:

There's a note in §22.4.1.4.2/3 of the C++ standard to that effect:

basic_filebuf assumes that the mappings from internal to external characters is 1 to N: a codecvt facet that is used by basic_filebuf must be able to translate characters one internal character at a time

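Sounds like any locale that's good for IO streams is good for your use as well: a facet that meets this requirement can always make progress with room for just one internal character in the destination buffer, so your loop shouldn't get stuck on partial for that reason.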
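
As a rough illustration of that property, here is a sketch that decodes a whole byte string while giving `in()` room for only one `wchar_t` per call. The `decode_all` helper and the `Codecvt` alias are hypothetical names, and the sketch assumes a facet that behaves the way the quoted note requires. It is not a definitive implementation.

```cpp
#include <cassert>
#include <cwchar>
#include <locale>
#include <string>

using Codecvt = std::codecvt<wchar_t, char, std::mbstate_t>;

// Hypothetical helper: decodes as much of `bytes` as it can, offering in()
// a destination buffer of exactly one wchar_t on every call.
std::wstring decode_all(const Codecvt& cvt, const std::string& bytes)
{
    std::mbstate_t state = std::mbstate_t();
    std::wstring result;
    const char* from = bytes.data();
    const char* const from_end = from + bytes.size();

    while (from != from_end) {
        wchar_t slot = L'\0';
        const char* from_next = from;
        wchar_t* to_next = &slot;
        const std::codecvt_base::result res =
            cvt.in(state, from, from_end, from_next, &slot, &slot + 1, to_next);

        if (res == std::codecvt_base::error || res == std::codecvt_base::noconv)
            break;                      // cannot decode any further
        assert(to_next == &slot || to_next == &slot + 1);

        if (to_next != &slot)
            result.push_back(slot);     // one character fit in the one-slot buffer
        else if (from_next == from)
            break;                      // no progress: incomplete trailing sequence
        from = from_next;               // continue after the consumed bytes
    }
    return result;
}
```

With the classic locale's facet, obtained via `std::use_facet<Codecvt>(std::locale::classic())`, passing `"abc"` should give back `L"abc"`. Each call returns `partial` while source bytes remain and `ok` on the last one.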