C++: Native to Managed String Conversion Problem (Maybe Character Set)?

I'm having a problem returning a native string in the correct character set. I convert from a std::string to a std::wstring to an LPCWSTR to pass back to managed code. For the string-to-wide-string step, the s2ws method produces a very short string because it seems to stop at my first would-be terminator (in managed), which is ';'. So, before you mention s2ws, I've already tried it to no avail.

String stuff:

    char target[1024];
    sprintf_s(target, 1024, "%s %s%s%s",
            mac,
            " (",
            pWLanBssList->wlanBssEntries[t].dot11Ssid.ucSSID,
            ");");
    std::string targetString = std::string(target);
    targetWString.append(targetString.begin(), targetString.end());

Later string stuff:

    std::wstring returnWString = L"";
    returnWString.append(SomeMthod().c_str());
    //wprintf_s(returnWString.c_str()); // Works - Data is in the string.
    LPCWSTR returnLpcuwstr = returnWString.c_str();
    return returnLpcuwstr;

How do I know it's a character set/encoding problem? Well, when the LPCWSTR is returned to managed and I use Marshal to Unicode string, I get a wall of null/empty characters. When I try it in ANSI, this is what I get (reduced in size/scale for readability):

ÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝ

The s2ws method is supposed to address the ANSI/Unicode nightmare that is std::string -> std::wstring, but it makes the return far shorter than it should be and doesn't address the actual charset problem.

Result (to ANSI, again - no reducing done on my part): ÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝ

When I check in native, wprintf_s shows me that the string is valid/good before the LPCWSTR conversion happens in the export method; so, I need to understand:

  1. Is there a way for me to tell what the size of each character actually is? (I'm thinking that this is an 8-bit versus 16-bit scenario?)
  2. Since wprintf_s works on the wide string, I checked it against the LPCWSTR and it printed the same (expected) data; so, the issue doesn't appear to be in the .ctor() of the LPCWSTR. Yet, I want to double-check my maths: Am I LPCWSTR'ing correctly?
  3. Since everything in native is telling me that the string is good, how can I check its character set (in native)?

The return, itself, is about 8 lines of text, with a delimiter ';' used so that I can split the string in managed and do magic with it. The only issue is getting the string to render as a valid string in managed, with the correct characters in it.

I feel like, maybe, I'm missing something obvious here but I cannot figure out what it is and just need a fresh pair of eyes to tell me where and how I'm failing at life.

There is 1 answer

Remy Lebeau (best answer):

    LPCWSTR returnLpcuwstr = returnWString.c_str();
    return returnLpcuwstr;

This is returning a pointer to data that gets freed immediately after the return, when returnWString goes out of scope. The returned pointer is invalid before the receiver can even use it. This is undefined behavior.
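
To make the lifetime problem visible without actually reading freed memory, here is a minimal sketch. TrackedBuffer, g_destroyed, and BrokenReturn are hypothetical names invented for illustration; the wrapper only exists so its destructor can record the moment the local string dies. As a side note, the wall of Ý characters in the question is consistent with the 0xDD fill that MSVC debug builds write over freed heap memory, although that depends on the build configuration.

```cpp
#include <string>

// Set by the destructor so a caller can observe when the local dies.
bool g_destroyed = false;

// Hypothetical wrapper around std::wstring, used only to track destruction.
struct TrackedBuffer {
    std::wstring text;
    ~TrackedBuffer() { g_destroyed = true; }
    const wchar_t* c_str() const { return text.c_str(); }
};

// Same shape as the code in the question: the pointer refers to storage
// owned by a local object, and that object is destroyed during the return.
const wchar_t* BrokenReturn()
{
    TrackedBuffer local;
    local.text = L"00:11:22:33:44:55 (HomeNet);";
    return local.c_str(); // dangling once the function has returned
}
```

By the time the caller receives the pointer, g_destroyed is already true, so any read through that pointer touches memory the string no longer owns.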

To do what you are attempting, you will have to return a pointer to dynamically allocated memory, and then the receiver will have to free that memory when done using it.

Assuming that by "managed" you mean .NET: the .NET marshaller frees unmanaged memory using CoTaskMemFree(), so if you are using default marshaling, the returned pointer must point at memory that is allocated using CoTaskMemAlloc() or an equivalent (a BSTR from SysAllocString...(), for instance, provided the return value is marshaled as a BSTR).
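
A minimal sketch of the default-marshaling route, under some assumptions: CopyForMarshaller and GetNetworkList are hypothetical names, the string literal stands in for the real data, and the __declspec(dllexport) decoration is omitted for brevity. On Windows the code uses the real CoTaskMemAlloc() (link against ole32.lib); the malloc-based stand-ins exist only so the sketch compiles elsewhere.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

#ifdef _WIN32
#include <objbase.h> // CoTaskMemAlloc, CoTaskMemFree
#else
#include <cstdlib>
// Stand-ins for non-Windows builds only; real code uses the COM allocator.
inline void* CoTaskMemAlloc(std::size_t bytes) { return std::malloc(bytes); }
inline void CoTaskMemFree(void* p) { std::free(p); }
#endif

// Hypothetical helper: copies the text, including its terminator, into a
// buffer that the .NET marshaller is able to release with CoTaskMemFree().
wchar_t* CopyForMarshaller(const std::wstring& text)
{
    const std::size_t bytes = (text.size() + 1) * sizeof(wchar_t);
    wchar_t* buffer = static_cast<wchar_t*>(CoTaskMemAlloc(bytes));
    if (buffer == nullptr) {
        return nullptr; // allocation failed; managed side sees a null string
    }
    std::memcpy(buffer, text.c_str(), bytes);
    return buffer;
}

// Hypothetical export: the local wstring may go out of scope safely,
// because the caller receives an independent copy.
extern "C" wchar_t* GetNetworkList()
{
    std::wstring result = L"00:11:22:33:44:55 (HomeNet);";
    return CopyForMarshaller(result);
}
```

On the managed side, declaring the P/Invoke return type as a string marshaled as LPWStr should let the runtime copy the text and free the buffer itself, so no explicit cleanup call is needed there.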

Otherwise, if you are not using default marshaling (i.e., you are calling Marshal.PtrToStringUni() manually instead), then you will have to make the .NET code pass the memory pointer back to your C++ code so it can free the memory properly. In that case your C++ code can allocate the memory however you want, as long as it is still allocated dynamically so that it survives past the function return.
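
The manual route could be sketched as a pair of exports. GetNetworkListRaw and FreeNetworkList are hypothetical names, the literal is placeholder data, and new[]/delete[] is just one possible allocator; any dynamic allocation works as long as the matching free lives in the same module.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical export: returns a heap copy that outlives the local wstring.
// The caller owns the buffer and must hand it back to FreeNetworkList().
extern "C" const wchar_t* GetNetworkListRaw()
{
    std::wstring result = L"00:11:22:33:44:55 (HomeNet);";
    const std::size_t count = result.size() + 1; // include the terminator
    wchar_t* buffer = new wchar_t[count];
    std::memcpy(buffer, result.c_str(), count * sizeof(wchar_t));
    return buffer;
}

// Hypothetical export: releases a buffer returned by GetNetworkListRaw().
// Passing a null pointer is harmless.
extern "C" void FreeNetworkList(const wchar_t* text)
{
    delete[] text;
}
```

The managed caller would receive the pointer as an IntPtr, read it with Marshal.PtrToStringUni(), and then pass the same IntPtr to FreeNetworkList so the memory is released by the module that allocated it.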