BSTR to CString conversion for Arabic text


My VC++ (VS2008) project uses the Multi-Byte Character Set.

I have the following code to convert a date string to a COleDateTime:

_bstr_t bstr_tDate = bstrDate; //bstrDate is populated by a COM function

const CString szStartDateTime = bstr_tDate.operator const char *();

bool bParseOK = oleDateTime.ParseDateTime(szStartDateTime);

This code works well in all regional settings but fails in Arabic regional settings, where the input date is in this format: 21/05/2012 11:50:31م

After conversion, the CString contains junk characters and parsing fails: 01/05/2012 11:50:28ã

Is there a BSTR to CString conversion that works in Arabic settings?


There are 2 answers

Answer by LihO

A BSTR is a string of UTF-16-encoded Unicode code points (wide "chars", 16 bits each):

typedef WCHAR OLECHAR;
typedef OLECHAR* BSTR;

which means that a character such as 'م' is represented by a single WCHAR. In a multi-byte string (a C-style char* or std::string), such a character takes one or more char values, depending on the code page (hence the name "multi-byte").
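To make the difference concrete, here is a minimal sketch (not from the original answer) that encodes a single code point as UTF-8, one common multi-byte encoding, and reports how many UTF-16 code units the same code point needs. EncodeUtf8 and Utf16UnitCount are hypothetical helper names chosen for this illustration.

```cpp
#include <string>

// Hypothetical helper: encode one Unicode code point as UTF-8.
// Returns an empty string for surrogates and for values above U+10FFFF.
std::string EncodeUtf8(unsigned long cp)
{
    std::string out;
    if (cp < 0x80) {
        // ASCII range: one byte.
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        // Two bytes: 110xxxxx 10xxxxxx.
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return out;  // surrogates are not valid code points
        // Three bytes: 1110xxxx 10xxxxxx 10xxxxxx.
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp <= 0x10FFFF) {
        // Four bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx.
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}

// Hypothetical helper: number of UTF-16 code units (WCHARs) for a code point.
// One inside the Basic Multilingual Plane, two (a surrogate pair) above it.
int Utf16UnitCount(unsigned long cp)
{
    return cp < 0x10000 ? 1 : 2;
}
```

For 'م' (U+0645), Utf16UnitCount returns 1 and EncodeUtf8 returns the two bytes 0xD9 0x85. In the Arabic ANSI code page (Windows-1256) the same letter is the single byte 0xE3, which Windows-1252 renders as 'ã'; that would be consistent with the junk character shown in the question.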

The CString contains junk characters because the char* you take from _bstr_t is a narrow string, and its bytes only make sense in the code page that was used to produce them. Convert the wide-char string to a multi-byte string explicitly, so that you control the code page. There are several ways to do that; one of them is the WideCharToMultiByte function.

This question will also help you: How do you properly use WideCharToMultiByte
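A minimal sketch of such an explicit conversion, assuming a hypothetical helper named WideToNarrow (it is not part of the original answer). On Windows it calls WideCharToMultiByte twice: once to size the buffer and once to convert. The #else branch, which uses std::wcsrtombs and ignores the code page, is only there so that the sketch also builds outside Windows.

```cpp
#include <string>
#include <cstddef>
#ifdef _WIN32
#include <windows.h>
#else
#include <cwchar>
#endif

// Hypothetical helper: convert a wide string to a narrow string.
// On Windows, codePage is passed to WideCharToMultiByte (for example
// CP_ACP, CP_UTF8, or 1256 for the Arabic ANSI code page).
// Elsewhere, codePage is ignored and the current C locale decides.
// Returns an empty string if the conversion fails.
std::string WideToNarrow(const std::wstring& wide, unsigned int codePage)
{
    if (wide.empty())
        return std::string();
#ifdef _WIN32
    // First call: ask how many bytes the converted text needs.
    const int needed = ::WideCharToMultiByte(codePage, 0, wide.c_str(),
                                             static_cast<int>(wide.size()),
                                             NULL, 0, NULL, NULL);
    if (needed <= 0)
        return std::string();  // conversion failed; see GetLastError()
    std::string narrow(static_cast<std::size_t>(needed), '\0');
    // Second call: convert into the buffer that was just sized.
    ::WideCharToMultiByte(codePage, 0, wide.c_str(),
                          static_cast<int>(wide.size()),
                          &narrow[0], needed, NULL, NULL);
    return narrow;
#else
    (void)codePage;  // no code page concept in this branch
    std::mbstate_t state = std::mbstate_t();
    const wchar_t* src = wide.c_str();
    // With a null destination, wcsrtombs only reports the required size.
    const std::size_t needed = std::wcsrtombs(NULL, &src, 0, &state);
    if (needed == static_cast<std::size_t>(-1))
        return std::string();  // a character has no multi-byte form
    std::string narrow(needed, '\0');
    src = wide.c_str();
    state = std::mbstate_t();
    std::wcsrtombs(&narrow[0], &src, needed, &state);
    return narrow;
#endif
}
```

With a BSTR, a call such as WideToNarrow(std::wstring(bstrDate, SysStringLen(bstrDate)), 1256) would request the Arabic ANSI code page, and CP_UTF8 would request UTF-8. Whether ParseDateTime then accepts the result still depends on the locale used for parsing.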

Answer by Pavel Radzivilovsky

What you are trying to do is possible with CString despite the MBCS setting, but it will only support Arabic.

It is probably much easier to start supporting all of Unicode. This can be done without much damage to existing code (you can keep std::string and char*) if you follow the instructions in the Windows section of utf8everywhere.org.