Why can't I marshal UCS-4 strings properly in C#?

I am trying to marshal a hid_device_info struct in C#, but I can't figure out how to translate the wchar_t* strings to managed C# strings. I have tried all possible values in the MarshalAs attribute, but all of them returned the first character only and nothing else.

I have replaced all the wide strings with pointers so that I can inspect them manually. This is the struct that I have so far:

public struct HidDeviceInfo
{
    public IntPtr path; // This one marshals fine because it's just a regular char*
    public ushort vendor_id;
    public ushort product_id;
    public IntPtr serial_number; // wchar_t*
    public ushort release_number;
    public IntPtr manufacturer_string; // wchar_t*
    public IntPtr product_string; // wchar_t*
    public ushort usage_page;
    public ushort usage;
    public int interface_number;
    public IntPtr next;
}

When I manually iterate through one of the pointers (serial_number, for example), I can see that every character occupies 4 bytes (1 ASCII byte followed by 3 zero bytes). I have tried all the possible Marshal.PtrToString... methods, but none of them is able to retrieve the full string.

I have a suspicion that the strings are being treated as 2 byte characters since I can't specify the character width anywhere in C#, and this is why it stops after the first character. Of course, by knowing this, I could easily write my own string marshaler, but I feel like there must be an existing solution and I'm overlooking something obvious.
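The suspicion above can be checked without the native library. The following is a minimal sketch, assuming a little-endian machine; the class and method names (WideCharDemo, ReadAsUtf16) are made up for illustration. It lays out "AB" as 4-byte characters, exactly as described, and reads it back with Marshal.PtrToStringUni, which always treats the memory as 2-byte UTF-16 code units:

```csharp
using System;
using System.Runtime.InteropServices;

public static class WideCharDemo
{
    // Hypothetical helper: builds "AB" as 4-byte characters plus a 4-byte
    // terminator in unmanaged memory, then reads it with the UTF-16 marshaler.
    public static string ReadAsUtf16()
    {
        byte[] utf32 = { 0x41, 0, 0, 0, 0x42, 0, 0, 0, 0, 0, 0, 0 };
        IntPtr ptr = Marshal.AllocHGlobal(utf32.Length);
        try
        {
            Marshal.Copy(utf32, 0, ptr, utf32.Length);
            // PtrToStringUni reads 2-byte units: first 0x0041 ('A'), then
            // 0x0000, which it takes as the end of the string.
            return Marshal.PtrToStringUni(ptr);
        }
        finally
        {
            Marshal.FreeHGlobal(ptr);
        }
    }
}
```

On a little-endian machine this returns only the first character, because the upper half of each 4-byte character looks like a UTF-16 null terminator.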

This struct is coming from a P/Invoked function and Marshal.PtrToStructure:

[DllImport(LibUsbName, CharSet = CharSet.Unicode)]
public static extern IntPtr hid_enumerate(ushort vendorId, ushort productId);

I've also tried all the possible CharSet values.

This can't be a character type mismatch, as it was in this question, because I've tried all possible combinations of different character types.

There is 1 answer

Lázár Zsolt

I ended up writing this method, which works fine for me, but only if all characters are ASCII and the character width is guaranteed to be 4 bytes.

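// Extension method, so it has to be declared inside a static class.
private static string ToUcs4String(this IntPtr ptr)
{
    var builder = new StringBuilder();
    var buffer = new byte[4];
    while (true)
    {
        // Read one 4-byte character from unmanaged memory.
        Marshal.Copy(ptr, buffer, 0, 4);
        // ASCII-only shortcut: a zero low byte is treated as the terminator.
        if (buffer[0] == 0)
            break;
        builder.Append((char) buffer[0]);
        ptr += 4;
    }

    return builder.ToString();
}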
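If the ASCII-only restriction is a problem, the same loop can decode full code points. This is a sketch rather than a tested drop-in replacement: the names Utf32Marshal and PtrToStringUtf32 are hypothetical, and it still assumes that wchar_t is 4 bytes wide in native byte order (as it typically is on Linux and macOS; on Windows wchar_t is 2 bytes and Marshal.PtrToStringUni is enough). It reads one 32-bit value at a time with Marshal.ReadInt32 and converts it with char.ConvertFromUtf32:

```csharp
using System;
using System.Runtime.InteropServices;
using System.Text;

public static class Utf32Marshal
{
    // Hypothetical helper: decodes a null-terminated string of 4-byte
    // code points (native byte order) into a managed string.
    public static string PtrToStringUtf32(IntPtr ptr)
    {
        if (ptr == IntPtr.Zero)
            return null;

        var builder = new StringBuilder();
        for (int offset = 0; ; offset += 4)
        {
            // Read the whole 32-bit code point, not just its low byte.
            int codePoint = Marshal.ReadInt32(ptr, offset);
            if (codePoint == 0)
                break;
            // Yields one char, or a surrogate pair above U+FFFF.
            // Throws ArgumentOutOfRangeException for invalid code points.
            builder.Append(char.ConvertFromUtf32(codePoint));
        }

        return builder.ToString();
    }
}
```

With the struct from the question, this would be called as Utf32Marshal.PtrToStringUtf32(info.serial_number). Checking the full 32-bit value for the terminator avoids stopping early on characters whose low byte happens to be zero, such as U+0100.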