Size needed to construct a char* out of wchar_t* in C


I'm trying to convert a wchar_t* to a char*, and my memory-wasting solution was:

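#include <stdlib.h>
#include <wchar.h>

char *wstrtostr(const wchar_t *text) {
    size_t size = wcslen(text)*sizeof(wchar_t)+1;
    char *sa = malloc(size);
    if (sa != NULL && wcstombs(sa,text,size) == (size_t)-1) {
        free(sa);   /* text contains a character the locale cannot encode */
        sa = NULL;
    }
    return sa;
}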

A wide character may convert to a single byte or to several, but wcslen counts wide characters regardless of how many chars each one needs after conversion.

The question is: how can we determine how many chars each wchar_t needs, so that we can build an alternative to wcslen for this specific problem and thereby compute the size required for our char buffer?


There is 1 answer

user133831 On

To answer what you asked: you can call wcstombs repeatedly with the byte limit slowly increasing until something gets stored, which tells you how many bytes that character needs. I am not sure how efficient that is for what you seem to want to do, though, so you may want a different approach:
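The probing idea could be sketched roughly as follows for one wide character at a time. The helper name wchar_byte_len is made up for illustration, and the result depends on the current locale (set with setlocale); treat it as a sketch rather than a recommended implementation.

```c
#include <limits.h>
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical helper: number of bytes wc needs in the current locale's
 * multibyte encoding, or (size_t)-1 if it cannot be converted. */
size_t wchar_byte_len(wchar_t wc) {
    wchar_t one[2] = { wc, L'\0' };     /* one-character wide string */
    char buf[MB_LEN_MAX + 1];

    if (wc == L'\0')
        return 0;                       /* only the terminator would be stored */

    /* Raise the byte limit until wcstombs manages to store the character. */
    for (size_t n = 1; n <= (size_t)MB_LEN_MAX; n++) {
        size_t r = wcstombs(buf, one, n);
        if (r == (size_t)-1)
            return (size_t)-1;          /* not representable in this locale */
        if (r > 0)
            return r;                   /* first limit that was large enough */
    }
    return (size_t)-1;
}
```

Summing this over every character of the string, plus one byte for the terminator, would give the buffer size, at the cost of several wcstombs calls per character.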

Allocate some memory and call wcsrtombs. If src does not end up being NULL, the buffer was too small, so realloc it and call wcsrtombs again; it will pick up from where it left off last time.
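A minimal sketch of that grow-and-resume loop, assuming a doubling strategy. The function name wstrtostr_grow and the initial_cap parameter are hypothetical; initial_cap stands in for whatever first guess you choose.

```c
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: converts text, enlarging the buffer whenever
 * wcsrtombs stops early. Returns a malloc'd string or NULL on failure. */
char *wstrtostr_grow(const wchar_t *text, size_t initial_cap) {
    size_t cap = initial_cap ? initial_cap : 16;
    size_t used = 0;
    const wchar_t *src = text;
    mbstate_t state;
    char *buf = malloc(cap);

    if (buf == NULL)
        return NULL;
    memset(&state, 0, sizeof state);    /* initial conversion state */

    for (;;) {
        size_t r = wcsrtombs(buf + used, &src, cap - used, &state);
        if (r == (size_t)-1) {          /* unconvertible character */
            free(buf);
            return NULL;
        }
        used += r;
        if (src == NULL)                /* terminator stored: finished */
            return buf;

        /* Buffer full: double it and resume from the updated src. */
        char *bigger = realloc(buf, cap * 2);
        if (bigger == NULL) {
            free(buf);
            return NULL;
        }
        buf = bigger;
        cap *= 2;
    }
}
```

The caller frees the result. Doubling is only one possible growth policy; it keeps the number of realloc calls small even when the first guess is poor.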

Depending on your data, you can build a heuristic for how much memory to allocate in the first place, so that reallocating is rare.

Update: It turns out that there is another method if you are running under Linux and don't require strict ISO C portability. If you call wcstombs with NULL as the destination, it returns the number of bytes that would have been required, not counting the terminating null byte. You can then allocate that many bytes plus one and call wcstombs again. Which approach is better will depend on your circumstances, specifically, I imagine, the length of the string and how good your heuristic is at guessing the correct length on the first try. Also, just to reiterate: ISO C does not define this behavior (POSIX specifies it as an extension), so be careful if your code needs to be portable. Thanks to melpomene for the pointer.

Second update: according to C99, wcsrtombs does support having its dst pointer set to NULL to get the length required for the output buffer (excluding the terminating null byte). Thanks to Story Teller for that. So you could call it once with NULL, and then a second time with a buffer of that length plus one.
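The two-call approach might look something like this. The name wstrtostr_exact is invented for the example, and the conversion uses whatever locale is currently set.

```c
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: measures first, then converts into an exactly
 * sized buffer. Returns a malloc'd string or NULL on failure. */
char *wstrtostr_exact(const wchar_t *text) {
    const wchar_t *src = text;
    mbstate_t state;

    /* Pass 1: with dst == NULL nothing is stored and len is ignored;
     * the return value is the byte count without the terminator. */
    memset(&state, 0, sizeof state);
    size_t len = wcsrtombs(NULL, &src, 0, &state);
    if (len == (size_t)-1)
        return NULL;                    /* unconvertible character */

    char *out = malloc(len + 1);        /* +1 for the terminating null */
    if (out == NULL)
        return NULL;

    /* Pass 2: convert for real, starting again from a fresh state. */
    src = text;
    memset(&state, 0, sizeof state);
    if (wcsrtombs(out, &src, len + 1, &state) == (size_t)-1) {
        free(out);
        return NULL;
    }
    return out;
}
```

This walks the string twice but never over-allocates and never needs realloc, which tends to suit long strings or cases where a good size guess is hard to make.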