Python-like Byte Array String representation in C#

Question

Python-like Byte Array String representation in C#

1.6k views Asked by Yunyu L. At 09 June 2015 at 01:04

Here is the method I am using to get this string representation:

    public static string ByteArrayToString(byte[] ba, string prefix)
    {
        StringBuilder hex = new StringBuilder(ba.Length * 2);
        foreach (byte b in ba)
        {
            if (prefix != null)
            {
                hex.Append(prefix);
            }
            hex.AppendFormat("{0:x2}", b);
        }
        return hex.ToString();
    }

Here is a sample string representation of a byte array (ByteArrayToString(arr, "\\x")):

\x00\x00\x00\x80\xca\x26\xff\x56\xbf\xbf\x49\x5b\x94\xed\x94\x6e\xbb\x7a\xd0\x9d
\xa0\x72\xe5\xd2\x96\x31\x85\x41\x78\x1c\xc9\x95\xaf\x79\x62\xc4\xc2\x8e\xa9\xaf
\x08\x22\xde\x22\x48\x65\xda\x1d\xca\x12\x99\x42\xb3\x56\xa7\x99\xca\x27\x7b\x2b
\x45\x77\x14\x5b\xe1\x75\x04\x3d\xdb\x68\x45\x46\x72\x61\x20\xa9\xa2\xd9\x50\xd0
\x63\x9b\x4e\x7b\xa4\xa4\x48\xd7\xa9\x01\xd1\x8a\x69\x78\x6c\x79\xa8\x84\x39\x42
\x32\xb3\xb1\x1f\x04\x4d\x06\xca\x2c\xd5\xa0\x45\x8d\x10\x44\xd5\x73\xdf\x89\x0c
\x25\x1d\xcf\xfc\xb8\x07\x6b\x1f\xfa\xae\x67\xf9\x00\x00\x00\x03\x01\x00\x01

Here is the representation I want (this is Python's, ignore different newline positions, this is all on a single line):

\x00\x00\x00\x80\xca&\xffV\xbf\xbfI[\x94\xed\x94n\xbbz\xd0\x9d\xa0r\xe5\xd2\x961
\x85Ax\x1c\xc9\x95\xafyb\xc4\xc2\x8e\xa9\xaf\x08"\xde"He\xda\x1d\xca\x12\x99B\xb
3V\xa7\x99\xca\'{+Ew\x14[\xe1u\x04=\xdbhEFra \xa9\xa2\xd9P\xd0c\x9bN{\xa4\xa4H\x
d7\xa9\x01\xd1\x8aixly\xa8\x849B2\xb3\xb1\x1f\x04M\x06\xca,\xd5\xa0E\x8d\x10D\xd
5s\xdf\x89\x0c%\x1d\xcf\xfc\xb8\x07k\x1f\xfa\xaeg\xf9\x00\x00\x00\x03\x01\x00\x0
1

The Python representation appears to convert bytes between (decimal) 32 and 126 to their ASCII representations instead of escaping all the bytes uniformly. How would I get the C# version to produce the same string output? I am reliant on a hash of this string output, so they need to be exactly identical.

Original Q&A

There are 2 answers

Michael B. On 15 November 2023 at 18:37

Matt's answer is correct and got me pointed in the right direction for my own usage, but there are a few cases that need to be accounted for that I had to figure out with trial and error.

Edit: This is for matching output from the str(byte) function in Python 3.12, it may not have been the case in 2015, but this was still the top answer when I was looking for help in 2023.

So far, 4 chars that need escaping as literals, not hex

The final format of your string may change based on the presence of single quotes and double quotes within the final hex string.

The Default output wraps the string like b'' BUT if the string contains a single quote, and no double quote, it is wrapped as b"" instead. If instead the string contains both ' and " , all single quotes in the string will be escaped and b"" is used.

examples

default: b'x\xc1o'

single quote/s present: b"x\xc1'o"

single quote/s and double quote/s present: b"x\xc1\'"o" Must escape single quote

foreach (byte b in ba)
{
    if (b == 92) hex.Append("\\\\");
    if (b == 10) hex.Append("\\n");
    else if (b == 13) hex.Append("\\r");
    else if (b == 9) hex.Append("\\t");
    else if (b >= 32 && b <= 126)
    {
        hex.Append((char) b);
        continue;
    }
    else
    {
        hex.Append("\\x");
        hex.AppendFormat("{0:x2}", b);
    }
}

string hexformat = hex.ToString();

if (hexformat.Contains("'") && !hexformat.Contains("\""))
{
     hexformat = "b\"" + hexformat + "\"";
}
else if (hexformat.Contains("'") && hexformat.Contains("\""))
{
    hexformat = hexformat.Replace("'", "\\'");
    hexformat = "b\'" + hexformat + "'";
}
else
{
     hexformat = "b\'" + hexformat + "'";
}

// Could be optimized by checking the bytearray first for the presence of ' 
// and " instead of doing a Replace at the end.

**Matt Johnson-Pint** · Accepted Answer · 2015-06-09T01:13:47+00:00

Well, if you're certain of the logic of the encoding, then you can just implement it:

foreach (byte b in ba)
{
    if (b >= 32 && b <= 126)
    {
        hex.Append((char) b);
        continue;
    }

    ...

If you're looking for performance though, you should check out this answer, and possibly make some adjustments to one of the methods listed there.

TechQA.

Python-like Byte Array String representation in C#

There are 2 answers

Related Questions in C#

Related Questions in ESCAPING

Related Questions in ARRAYS

Related Questions in ASCII

Popular Questions

Trending Questions