I'm writing a plugin for Autodesk Navisworks and trying to pass a C# Unicode string to a property on a COM object. However, the string gets encoded incorrectly somewhere in the process.
var property = ...;
property.Name = "中文"; // becomes "??"
property.Value = "中文"; // OK
"中文" comes out as "??" in the user interface, whereas ASCII-only strings (e.g. "abcd") work just fine. Furthermore, setting the Value property (a VARIANT) on the same object to the same string works, but setting Name does not.
Further exploration led me to try encoding the string "ä" as UTF-8:
C3 A4
and then somehow "encoding" these bytes into a (Unicode) string, one character per byte:
property.Name = "\u00c3\u00a4"; // shows up as "ä"
Surprisingly this seemed to work.
This led me to try the following:
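// Needs "using System.Text;". Encode to UTF-8, then widen each byte to one char.
var bytes = Encoding.UTF8.GetBytes("中文abcd");
char[] chars = new char[bytes.Length];
for (int i = 0; i < chars.Length; i++)
{
    chars[i] = (char)bytes[i]; // every char ends up in the range U+0000..U+00FF
}
string s = new string(chars);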
However, when I use this to encode "中文abcd", I only get the first character "中" in the GUI. Yet with "äabcd" I get more than one character again...
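To see what the loop above actually hands to the property, it can help to dump the resulting UTF-16 code units. This is only an inspection sketch; the names Utf8AsChars, Widen and Dump are made up for illustration and have nothing to do with the Navisworks API.

```csharp
using System;
using System.Linq;
using System.Text;

static class Utf8AsChars
{
    // Widens each UTF-8 byte to one UTF-16 code unit, like the loop above.
    public static string Widen(string text)
    {
        byte[] bytes = Encoding.UTF8.GetBytes(text);
        return new string(bytes.Select(b => (char)b).ToArray());
    }

    // Formats every code unit of s as U+XXXX so it can be inspected.
    public static string Dump(string s)
    {
        return string.Join(" ", s.Select(c => "U+" + ((int)c).ToString("X4")));
    }

    static void Main()
    {
        Console.WriteLine(Dump(Widen("\u00e4")));       // "ä"
        Console.WriteLine(Dump(Widen("\u4e2d\u6587"))); // "中文"
    }
}
```

For "ä" this prints U+00C3 U+00A4, and for "中文" it prints U+00E4 U+00B8 U+00AD U+00E6 U+0096 U+0087. The last two values are C1 control characters (U+0080 to U+009F). One possible, unconfirmed explanation for the truncation is that whatever conversion the application performs handles that range differently from the rest of U+00A0 to U+00FF, which would fit "中" surviving and "文" not.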
What is happening here? How can I get around the problem? Is it a marshalling problem (e.g. incorrectly specified encoding in the COM Interop)? Or perhaps some weird code inside the application? If it's a marshalling problem, can I modify it for this property only?
Turns out that Name was an "internal" string, and I should have used the property UserName for text displayed in the GUI. I.e. I changed the assignment to Name into an assignment to UserName, which worked. Presumably UserName is implicitly set from Name internally in some way that ignores or mishandles the encoding.
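A minimal sketch of the resulting pattern, with Name kept as an internal identifier and the display text going to UserName. IDisplayProperty and FakeProperty are hypothetical stand-ins so the example runs without Navisworks; they are not the real COM types. The ASCII check on the internal name is my own assumption, based only on the observation that ASCII names came through intact.

```csharp
using System;
using System.Linq;

// Hypothetical stand-in for the COM property object (NOT the Navisworks type).
interface IDisplayProperty
{
    string Name { get; set; }      // internal identifier
    string UserName { get; set; }  // text displayed in the GUI
    object Value { get; set; }
}

// In-memory fake used only to make the sketch runnable.
sealed class FakeProperty : IDisplayProperty
{
    public string Name { get; set; }
    public string UserName { get; set; }
    public object Value { get; set; }
}

static class PropertyNaming
{
    // True when every character is 7-bit ASCII.
    public static bool IsAscii(string s)
    {
        return s.All(c => c <= 0x7F);
    }

    // Assumption: keep the internal name ASCII-only, put display text in UserName.
    public static void SetNames(IDisplayProperty p, string internalName, string displayName)
    {
        if (!IsAscii(internalName))
            throw new ArgumentException("Internal name should be ASCII-only.", nameof(internalName));
        p.Name = internalName;
        p.UserName = displayName;
    }

    static void Main()
    {
        var p = new FakeProperty();
        SetNames(p, "chinese_text", "\u4e2d\u6587"); // display name "中文"
        p.Value = "\u4e2d\u6587";
        Console.WriteLine(p.Name);
    }
}
```

With the real object, the equivalent would simply be assigning the display string to UserName instead of (or in addition to) Name.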