Byte array to struct


I'm having trouble converting the string parts of the byte array.

My struct looks like this:

[StructLayout(LayoutKind.Sequential, Pack = 1)]
struct Message
{
    public int id;

    [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 10)]
    public string text;
}

Creation of test byte array:

private static byte[] CreateMessageByteArray()
{
    int id = 69;
    byte[] intBytes = BitConverter.GetBytes(id);

    string text = "test";
    byte[] stringBytes = GetBytes(text);

    IEnumerable<byte> rv = intBytes.Concat(stringBytes);

    return rv.ToArray();
}

Method to convert my byte array to a struct:

static T ByteArrayToStructure<T>(byte[] bytes) where T : struct
{
    var handle = GCHandle.Alloc(bytes, GCHandleType.Pinned);
    var result = (T)Marshal.PtrToStructure(handle.AddrOfPinnedObject(), typeof(T));
    handle.Free();
    return result;
}

When I call ByteArrayToStructure with the result from CreateMessageByteArray(), I get a struct with id=69 and text="t".

Why don't I get the whole string, i.e. "test"?

Edit: This is the code I forgot to paste:

    static byte[] GetBytes(string str)
    {
        byte[] bytes = new byte[str.Length * sizeof(char)];
        System.Buffer.BlockCopy(str.ToCharArray(), 0, bytes, 0, bytes.Length);
        return bytes;
    }

There are 3 answers

Theodoros Chatzigiannakis (best answer)

The problem is at this line:

byte[] stringBytes = GetBytes(text);

How are you converting the string to a byte array? You are probably using a Unicode encoding, which will store each character as two bytes, and because your string is in the ASCII set, every other byte will be zero:

byte[] stringBytes = new UnicodeEncoding().GetBytes(text);
// will give you { 't', '\0', 'e', '\0', 's', '\0', 't', '\0' }

These zero bytes mislead the marshaller into treating them as null terminators, so the string ends just after the 't'.
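
As a quick check of the point above, the following sketch prints the bytes produced by the GetBytes helper from the question's edit next to the bytes from Encoding.Unicode and Encoding.ASCII. The class name EncodingBytesDemo is made up for illustration, and the first output line assumes a little-endian machine.

```csharp
using System;
using System.Text;

static class EncodingBytesDemo
{
    // Same approach as the GetBytes helper in the question:
    // copies the raw UTF-16 code units, two bytes per character.
    public static byte[] GetBytes(string str)
    {
        byte[] bytes = new byte[str.Length * sizeof(char)];
        Buffer.BlockCopy(str.ToCharArray(), 0, bytes, 0, bytes.Length);
        return bytes;
    }

    static void Main()
    {
        // Raw char copy (assumes little-endian): 74-00-65-00-73-00-74-00
        Console.WriteLine(BitConverter.ToString(GetBytes("test")));
        // UTF-16LE encoding: 74-00-65-00-73-00-74-00
        Console.WriteLine(BitConverter.ToString(Encoding.Unicode.GetBytes("test")));
        // ASCII encoding: 74-65-73-74
        Console.WriteLine(BitConverter.ToString(Encoding.ASCII.GetBytes("test")));
    }
}
```

The second byte of the UTF-16 output is already zero, which is where a one-byte-per-character string field stops reading.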

Instead, you can use an ASCII encoding (which stores one byte per character):

byte[] stringBytes = new ASCIIEncoding().GetBytes(text);
// will give you { 't', 'e', 's', 't' }
// but will lose non-ASCII character information

Or you can use a UTF-8 encoding (which is variable length):

byte[] stringBytes = new UTF8Encoding().GetBytes(text);
// will give you { 't', 'e', 's', 't' }    
// and retain non-ASCII character information, but it's somewhat
// trickier to rebuild the string correctly in case of non-ASCII
// information present
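
To put the ASCII variant in context, here is a minimal end-to-end sketch. It assumes the Message struct from the question and additionally pads the buffer to the full struct size (4 bytes for id plus 10 for text), so the text field always ends in a zero terminator. The names PaddedAsciiDemo, TextOffset, TextSize and Read are made up for illustration.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Text;

[StructLayout(LayoutKind.Sequential, Pack = 1)]
struct Message
{
    public int id;

    [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 10)]
    public string text;
}

static class PaddedAsciiDemo
{
    const int TextOffset = 4;  // sizeof(int), the id field
    const int TextSize = 10;   // SizeConst of the text field

    // Hypothetical helper: builds a zero-filled buffer as large as the struct.
    public static byte[] CreateMessageByteArray(int id, string text)
    {
        byte[] buffer = new byte[Marshal.SizeOf(typeof(Message))]; // 14 bytes with Pack = 1
        BitConverter.GetBytes(id).CopyTo(buffer, 0);

        byte[] stringBytes = Encoding.ASCII.GetBytes(text);
        // Copy at most TextSize - 1 bytes so at least one zero terminator remains.
        int count = Math.Min(stringBytes.Length, TextSize - 1);
        Array.Copy(stringBytes, 0, buffer, TextOffset, count);
        return buffer;
    }

    // Pins the array and lets the marshaller read the struct from it.
    public static Message Read(byte[] bytes)
    {
        GCHandle handle = GCHandle.Alloc(bytes, GCHandleType.Pinned);
        try
        {
            return (Message)Marshal.PtrToStructure(handle.AddrOfPinnedObject(), typeof(Message));
        }
        finally
        {
            handle.Free();
        }
    }

    static void Main()
    {
        Message m = Read(CreateMessageByteArray(69, "test"));
        Console.WriteLine(m.id + " " + m.text); // 69 test
    }
}
```

Text longer than nine characters is truncated in this sketch; whether truncating or throwing is preferable depends on the protocol being parsed.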

Andrei Tătar

Maybe the GetBytes method doesn't work as you expect. This LINQPad script works fine for me:

void Main()
{
    var result = ByteArrayToStructure<Message>(CreateMessageByteArray());
    result.Dump();
}

[StructLayout(LayoutKind.Sequential, Pack = 1)]
struct Message
{
    public int id;

    [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 10)]
    public string text;
}

private static byte[] CreateMessageByteArray()
{
    int id = 69;
    byte[] intBytes = BitConverter.GetBytes(id);

    string text = "test";
    byte[] stringBytes = Encoding.UTF8.GetBytes(text);

    IEnumerable<byte> rv = intBytes.Concat(stringBytes);

    return rv.ToArray();
}

static T ByteArrayToStructure<T>(byte[] bytes) where T : struct
{
    var handle = GCHandle.Alloc(bytes, GCHandleType.Pinned);
    var result = (T)Marshal.PtrToStructure(handle.AddrOfPinnedObject(), typeof(T));
    handle.Free();
    return result;
}

Output:

id    69 
text  test 
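
One caveat about the ByteArrayToStructure helper, offered as a side note: Marshal.PtrToStructure reads as many bytes as the struct occupies in unmanaged form (14 for Message), while the array built above holds only 8, so the result depends on whatever memory follows the array. A hedged variant that checks the length first and frees the handle in a finally block could look like this. SafeMarshal is a made-up name, not part of any library.

```csharp
using System;
using System.Runtime.InteropServices;

[StructLayout(LayoutKind.Sequential, Pack = 1)]
struct Message
{
    public int id;

    [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 10)]
    public string text;
}

static class SafeMarshal
{
    // Hypothetical variant of ByteArrayToStructure with a length check.
    public static T ByteArrayToStructure<T>(byte[] bytes) where T : struct
    {
        if (bytes == null) throw new ArgumentNullException(nameof(bytes));

        int size = Marshal.SizeOf(typeof(T));
        if (bytes.Length < size)
            throw new ArgumentException(
                "Need at least " + size + " bytes, got " + bytes.Length + ".", nameof(bytes));

        GCHandle handle = GCHandle.Alloc(bytes, GCHandleType.Pinned);
        try
        {
            return (T)Marshal.PtrToStructure(handle.AddrOfPinnedObject(), typeof(T));
        }
        finally
        {
            handle.Free(); // released even if marshalling throws
        }
    }

    static void Main()
    {
        try
        {
            ByteArrayToStructure<Message>(new byte[8]); // too short for a 14-byte struct
        }
        catch (ArgumentException ex)
        {
            Console.WriteLine(ex.Message);
        }
    }
}
```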

tjleigh

In addition to the other two answers, if you want the string in the text field to always be Unicode, you could include CharSet = CharSet.Unicode in your [StructLayout] attribute. The marshaller then reads the text field as two bytes per character.
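
A minimal sketch of that approach follows, assuming a copy of the struct named UnicodeMessage (a made-up name) and a buffer padded to the full struct size, which becomes 4 + 10 * 2 = 24 bytes. The UnicodeDemo class and its helpers are likewise illustrative.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Text;

[StructLayout(LayoutKind.Sequential, Pack = 1, CharSet = CharSet.Unicode)]
struct UnicodeMessage
{
    public int id;

    // With CharSet.Unicode, SizeConst counts 2-byte characters.
    [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 10)]
    public string text;
}

static class UnicodeDemo
{
    const int TextOffset = 4;  // sizeof(int), the id field
    const int TextChars = 10;  // SizeConst of the text field

    // Hypothetical helper: zero-filled buffer as large as the struct.
    public static byte[] CreateMessageByteArray(int id, string text)
    {
        byte[] buffer = new byte[Marshal.SizeOf(typeof(UnicodeMessage))]; // 24 bytes
        BitConverter.GetBytes(id).CopyTo(buffer, 0);

        byte[] stringBytes = Encoding.Unicode.GetBytes(text); // UTF-16LE
        // Leave room for one 2-byte zero terminator.
        int count = Math.Min(stringBytes.Length, (TextChars - 1) * 2);
        Array.Copy(stringBytes, 0, buffer, TextOffset, count);
        return buffer;
    }

    public static UnicodeMessage Read(byte[] bytes)
    {
        GCHandle handle = GCHandle.Alloc(bytes, GCHandleType.Pinned);
        try
        {
            return (UnicodeMessage)Marshal.PtrToStructure(
                handle.AddrOfPinnedObject(), typeof(UnicodeMessage));
        }
        finally
        {
            handle.Free();
        }
    }

    static void Main()
    {
        UnicodeMessage m = Read(CreateMessageByteArray(69, "test"));
        Console.WriteLine(m.id + " " + m.text); // 69 test
    }
}
```

With this layout, the two-bytes-per-character output of the question's GetBytes helper matches what the marshaller expects, so the interleaved zero bytes no longer end the string early.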