Behavior of `swprintf` when passed a `char const*` matching a `L"%s"` specifier


I'm writing an Excel plugin, and need to generate wchar_t output for Excel (although internally, we are 100% char, and in fact limit char to plain ASCII). At one point, I'm using swprintf to do the conversion:

static wchar_t buffer[ 32369 ];
buffer[0] = swprintf( buffer + 1, sizeof(buffer) - 1, L"#%s!", message );

Excel displays some sort of CJK characters, although message (type char const*) is a null terminated character string with no characters outside of printable ASCII (hex values 0x20-0x7E).

I've tried this in a small test program, dumping the generated string in hex, and it looks like VC++ is treating message as if it were a wchar_t const* (although it seems to recognize the '\0' correctly, even though it occupies only a single byte); this results in wchar_t values like 0x6568 (rather than the 0x0068, 0x0065 that I was expecting).
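
To illustrate where a value like 0x6568 comes from, here is a minimal sketch of the arithmetic. It assumes a 16-bit, little-endian wchar_t (as on Windows), and the helper name read_as_utf16_le is made up for this example; it is not part of any library.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (name invented for illustration): combine two
// consecutive bytes into one 16-bit unit, low byte first, which is how a
// little-endian machine reads a 16-bit wchar_t from memory.
std::uint16_t read_as_utf16_le(const char* bytes)
{
    assert(bytes != nullptr);
    const auto lo = static_cast<unsigned char>(bytes[0]);
    const auto hi = static_cast<unsigned char>(bytes[1]);
    return static_cast<std::uint16_t>(lo | (hi << 8));
}
```

For the narrow string "he" ('h' is 0x68, 'e' is 0x65), this yields 0x6568, which falls in the CJK Unified Ideographs range. That matches the symptom: two ASCII bytes are being consumed as one wide character.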

According to the C99 standard, for a "%s" specifier, swprintf should convert the characters from the char const* "as if by repeated calls to the mbrtowc function[...]". Is the behavior I am seeing an error in the Visual C++ library, or is there something in the global locale that I have to change?
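
For reference, the conversion the standard describes can be sketched roughly as follows. This is only an approximation of the "repeated calls to mbrtowc" wording, not what any particular library does internally; the helper name widen_with_mbrtowc is hypothetical, and the sketch assumes the current locale's multibyte encoding is ASCII-compatible.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <cwchar>
#include <string>

// Hypothetical helper (name invented for illustration): widen a multibyte
// string one character at a time with mbrtowc, starting from the initial
// conversion state.
std::wstring widen_with_mbrtowc(const char* s)
{
    assert(s != nullptr);
    std::wstring out;
    std::mbstate_t state = std::mbstate_t();
    std::size_t remaining = std::strlen(s);
    while (remaining > 0) {
        wchar_t wc = L'\0';
        const std::size_t n = std::mbrtowc(&wc, s, remaining, &state);
        if (n == static_cast<std::size_t>(-1) ||
            n == static_cast<std::size_t>(-2) || n == 0) {
            break;  // invalid, incomplete, or null character: stop converting
        }
        out.push_back(wc);
        s += n;          // advance past the bytes just consumed
        remaining -= n;
    }
    return out;
}
```

With plain ASCII input such as "he", each byte becomes its own wide character (0x0068, then 0x0065), which is the output I was expecting from swprintf.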

(FWIW: when I compile and run my small test program with g++, I get the behavior I expect. G++ is not, however, an option for our Excel plugins, at least not at present.)

2

There are 2 answers

7
xanatos On BEST ANSWER

Note this from the MSDN documentation of swprintf:

swprintf is a wide-character version of sprintf; the pointer arguments to swprintf are wide-character strings.

and then in the example:

wchar_t buf[100];
int len = swprintf( buf, 100, L"%s", L"Hello world" );

so at least Microsoft documented this.

And then, on the page about format specifiers:

s: String. When used with printf functions, specifies a single-byte–character string; when used with wprintf functions, specifies a wide-character string. Characters are printed up to the first null character or until the precision value is reached.

And then

S: String. When used with printf functions, specifies a wide-character string; when used with wprintf functions, specifies a single-byte–character string. Characters are printed up to the first null character or until the precision value is reached.

So what you want is upper-case %S.

See also this similar question: "visual studio swprintf is making all my %s formatters want wchar_t * instead of char *", where they suggest using %ls (the argument is always treated as wchar_t*) and %hs (the argument is always treated as char*).
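
A minimal sketch of how this could look in practice. The macro NARROW_IN_WIDE_FMT and the function format_error are hypothetical names invented for this example. The sketch assumes that Visual C++ (with the specifier semantics quoted above) needs the h prefix to read a char const*, while compilers that follow the C standard already treat plain %s that way in the wide printf family.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>

// Narrow-string specifier for the wide printf family.
// Assumption: Visual C++ needs the h prefix (%hs); elsewhere the standard
// %s already means char const* in swprintf.
#ifdef _MSC_VER
#define NARROW_IN_WIDE_FMT L"#%hs!"
#else
#define NARROW_IN_WIDE_FMT L"#%s!"
#endif

// Hypothetical helper: writes L"#<message>!" into out and returns the number
// of wide characters written, or a negative value on failure.
// capacity is a count of wchar_t elements, not a size in bytes.
int format_error(wchar_t* out, std::size_t capacity, const char* message)
{
    assert(out != nullptr && message != nullptr);
    return std::swprintf(out, capacity, NARROW_IN_WIDE_FMT, message);
}
```

Called as format_error(buf, sizeof buf / sizeof buf[0], "N/A") on a wchar_t array buf, this should produce L"#N/A!" and return 5. The second argument of swprintf counts wide characters, so with the asker's buffer the capacity would be sizeof(buffer) / sizeof(buffer[0]) - 1 rather than sizeof(buffer) - 1.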

0
krisku On

When calling swprintf the specifier %s is interpreted as pointing to a wide string, i.e. a wchar_t pointer. Instead use the %S (uppercase S) format specifier, as that will correctly use the char* message you are passing.

From Microsoft's documentation on printf Type Field Characters:

  • s, String, When used with printf functions, specifies a single-byte–character string; when used with wprintf functions, specifies a wide-character string. Characters are printed up to the first null character or until the precision value is reached.
  • S, String, When used with printf functions, specifies a wide-character string; when used with wprintf functions, specifies a single-byte–character string. Characters are printed up to the first null character or until the precision value is reached.