Writing a C# string to a preallocated unmanaged buffer using UTF8 encoding

I need to write a C# string to a preallocated unmanaged buffer, encoded as UTF-8. Before answering, please read the following requirements:

  • No new allocations (so please, don't direct me to answers involving creating byte arrays or other instantiations)
  • No transitions to unmanaged code (no P/Invoke or calli)

Currently, I'm using OpCodes.Cpblk to copy raw strings from C# to unmanaged buffers as 16-bit characters. This gives me roughly the same performance as an unmanaged memcpy on an x64 architecture, and I really need the throughput to stay close to that.

I am considering fixing the string as a char* and iterating over it, but implementing an encoder without jump tables would be both cumbersome and less than optimal when it comes to performance.

There is 1 answer

usr (best answer)

Use the unsafe overload

public override unsafe int GetBytes(char* chars, int charCount, byte* bytes, int byteCount)

of the UTF8Encoding class. Pass a pointer to the string's characters (obtained by fixing the string) and a pointer to the byte buffer that will receive the output. The method writes the UTF-8 encoded bytes into the buffer and returns the number of bytes written. No allocations occur, but it requires unsafe code.
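For encoding a string into bytes, the pointer overload to call is GetBytes(char*, int, byte*, int), which reads chars and writes UTF-8 bytes. Below is a minimal sketch of how that call might be wrapped. The names Utf8Writer and Write are hypothetical, and Marshal.AllocHGlobal only stands in for whatever preallocated unmanaged buffer you already have. It has to be compiled with unsafe code enabled.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Text;

public static class Utf8Writer
{
    // Hypothetical helper (not a framework API): encodes `value` as UTF-8
    // directly into `buffer` and returns the number of bytes written.
    // GetBytes throws ArgumentException if `bufferLength` is too small.
    // A null `value` is not handled here.
    public static unsafe int Write(string value, byte* buffer, int bufferLength)
    {
        // Pin the string so the GC cannot move it while we hold a char*.
        fixed (char* chars = value)
        {
            // Encoding.UTF8 is a cached instance, so nothing is created per call.
            return Encoding.UTF8.GetBytes(chars, value.Length, buffer, bufferLength);
        }
    }
}

public static class Program
{
    public static unsafe void Main()
    {
        // Stand-in for the caller's preallocated unmanaged buffer.
        const int capacity = 64;
        IntPtr memory = Marshal.AllocHGlobal(capacity);
        try
        {
            // 'h', 'l', 'l', 'o' take 1 byte each; U+00E9 takes 2 bytes.
            int written = Utf8Writer.Write("h\u00e9llo", (byte*)memory, capacity);
            Console.WriteLine(written); // prints 6
        }
        finally
        {
            Marshal.FreeHGlobal(memory);
        }
    }
}
```

If the buffer size is under your control, Encoding.UTF8.GetMaxByteCount(value.Length) gives a safe upper bound without inspecting the string contents. Whether the throughput gets close to a raw block copy is something you would need to measure on your own workload.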