Using Visual C++/MSXML, how do I convert XML from ISO-8859-1 to UTF-8?

In a C++ application, I have an XML document, document A, that is retrieved from a database and loaded into MSXML 4 as a DOM document. The document is in ISO-8859-1 encoding and contains non-ASCII characters, such as é (0xE9 in ISO-8859-1). Some of document A's nodes are copied into a newly created MSXML document, document B, which we want in UTF-8 because that is what the recipient expects. Creating document B, adding a processing instruction that sets the encoding to UTF-8, and then copying the nodes from document A does not produce é in UTF-8 form (0xC3 0xA9).

Is there another way to have MSXML do the conversion without using stylesheets? Some of the documents are several megabytes in size, and a stylesheet transform may add processing time. Alternatively, is there a way to do it by manipulating the XML as a flat string? We work with wchar_t-based strings (we don't use MFC). I have been looking into some Windows API functions, but they seem to take regular char strings, and I am not yet sure whether we would lose anything in the conversion. That is what I will be testing.

Thanks, Niraj
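
For reference, here is a minimal sketch of the flat-string conversion the question asks about. It is not an MSXML technique: latin1_to_utf8 and wide_to_utf8 are hypothetical helper names, written in portable C++ only to show the byte arithmetic. It assumes the input really is ISO-8859-1 (or, for the wide version, UTF-16 or UTF-32 held in wchar_t), and it does not rewrite the encoding named in the XML declaration. On Windows, WideCharToMultiByte with CP_UTF8 performs the wide-string conversion natively.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: convert ISO-8859-1 bytes to UTF-8.
// Every ISO-8859-1 byte maps to the Unicode code point of the same value,
// so bytes >= 0x80 always become exactly two UTF-8 bytes.
std::string latin1_to_utf8(const std::string& in) {
    std::string out;
    out.reserve(in.size() * 2);  // worst case: every byte doubles
    for (unsigned char c : in) {
        if (c < 0x80) {
            out.push_back(static_cast<char>(c));
        } else {
            out.push_back(static_cast<char>(0xC0 | (c >> 6)));
            out.push_back(static_cast<char>(0x80 | (c & 0x3F)));
        }
    }
    return out;
}

// Hypothetical helper: convert a wchar_t string to UTF-8.
// Handles UTF-16 surrogate pairs (Windows, 16-bit wchar_t) as well as
// whole code points (platforms with 32-bit wchar_t).
// Unpaired surrogates are not validated in this sketch.
std::string wide_to_utf8(const std::wstring& in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        unsigned long cp = static_cast<unsigned long>(in[i]);
        // Combine a high/low surrogate pair into one code point.
        if (cp >= 0xD800 && cp <= 0xDBFF && i + 1 < in.size()) {
            unsigned long lo = static_cast<unsigned long>(in[i + 1]);
            if (lo >= 0xDC00 && lo <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (lo - 0xDC00);
                ++i;
            }
        }
        if (cp < 0x80) {
            out.push_back(static_cast<char>(cp));
        } else if (cp < 0x800) {
            out.push_back(static_cast<char>(0xC0 | (cp >> 6)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        } else if (cp < 0x10000) {
            out.push_back(static_cast<char>(0xE0 | (cp >> 12)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        } else {
            out.push_back(static_cast<char>(0xF0 | (cp >> 18)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 12) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        }
    }
    return out;
}
```

With these helpers, the single byte 0xE9 (é) from the question becomes the two bytes 0xC3 0xA9, which is the output the recipient expects. Both functions make one linear pass over the input, so the cost for a multi-megabyte document should be small compared with a stylesheet transform, though that would need measuring.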
