Binary to CSV record Conversion


Hi Folks
I have been working on a Python module which will convert a binary string into a CSV record. A 3rd party application usually does this; however, I'm trying to build this logic into my own code. The records before and after conversion are as follows:

CSV Record After Conversion

0029.6,000.87,002.06,0029.2,0010.6,0010.0,0002.1,0002.3,00120,00168,00054,00111,00130,00000,00034,00000,00000,00039,00000,0313.1,11:09:01,06-06-2015,00000169

I'm trying to figure out the conversion logic used by the 3rd party tool. If anyone can help me with some clues regarding this, it would be great!
One thing I have worked out is that each CSV value appears to correspond to an unsigned short in the byte stream.
TIA, cheers!


There are 3 answers

5
Martin Evans (BEST ANSWER)

As already mentioned, without knowing the binary protocol it will be difficult to guess the exact encoding that is being used. There may be special case logic that is not apparent in the given data.

It would be useful to know the name of the 3rd party application, or at least what field this relates to. Anything that gives an idea of what the encoding could be would help.

The following are clues as you requested on how to proceed:

  1. The end of the CSV shows a date (06-06-2015). This can be seen near the start of the data, in the bytes 06 06 15:

31 08 11 06 06 15 20 AA A8 00 00 00 28 01 57 00 CE 00 24 01 6A 00 64 00 15 00 17 00 78 00 A8 00 36 00 6F 00 82 00 00 00 22 00 00 00 00 00 27 00 00 00

  2. The end value 169 (hex A9) is suspiciously in between the next two hex values, AA and A8:

31 08 11 06 06 15 20 AA A8 00 00 00 28 01 57 00 CE 00 24 01 6A 00 64 00 15 00 17 00 78 00 A8 00 36 00 6F 00 82 00 00 00 22 00 00 00 00 00 27 00 00 00

  3. "00039" (hex 27) could refer to the last four bytes, 27 00 00 00:

31 08 11 06 06 15 20 AA A8 00 00 00 28 01 57 00 CE 00 24 01 6A 00 64 00 15 00 17 00 78 00 A8 00 36 00 6F 00 82 00 00 00 22 00 00 00 00 00 27 00 00 00

or to just the two bytes 27 00:

31 08 11 06 06 15 20 AA A8 00 00 00 28 01 57 00 CE 00 24 01 6A 00 64 00 15 00 17 00 78 00 A8 00 36 00 6F 00 82 00 00 00 22 00 00 00 00 00 27 00 00 00

You suspect that two bytes are used per value, so the remaining 00 00 is perhaps a separate zero-valued field.

  4. "00034" (hex 22) could refer to the bytes 22 00:

31 08 11 06 06 15 20 AA A8 00 00 00 28 01 57 00 CE 00 24 01 6A 00 64 00 15 00 17 00 78 00 A8 00 36 00 6F 00 82 00 00 00 22 00 00 00 00 00 27 00 00 00

And so on: simply convert some of the decimal numbers into hex and search for possible locations in the data. Consider that fields might be big-endian, little-endian, or a combination of the two.
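The search described above can be automated. The following is only a rough sketch: it assumes each value is a 16-bit unsigned integer (as the question suggests) and that decimal values are stored without the decimal point (so 0029.6 would be 296). The helper name find_uint16 is made up for illustration.

```python
import struct

# Sample record copied from the hex dump above.
RAW = bytes.fromhex(
    "31 08 11 06 06 15 20 AA A8 00 00 00 28 01 57 00 CE 00 24 01 "
    "6A 00 64 00 15 00 17 00 78 00 A8 00 36 00 6F 00 82 00 00 00 "
    "22 00 00 00 00 00 27 00 00 00"
)


def find_uint16(data, value):
    """Return (offset, byte_order) pairs where value occurs as a 16-bit unsigned int."""
    hits = []
    for order, fmt in (("little", "<H"), ("big", ">H")):
        needle = struct.pack(fmt, value)
        start = data.find(needle)
        while start != -1:
            hits.append((start, order))
            start = data.find(needle, start + 1)
    return hits


# Assumption: 0029.6 is stored as the integer 296.
print(find_uint16(RAW, 296))  # → [(12, 'little')]
# Small values next to zero bytes match in both byte orders, so they prove little on their own.
print(find_uint16(RAW, 39))
```

Values that only match in one byte order (such as 296 here) are the most useful clues. Small values surrounded by zero bytes tend to match in both orders.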

You should take a look at the Python struct library, which can be useful in dealing with such conversions once you know the formatting that is being used.
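As an illustration of how struct might be used here, the sketch below decodes the sample record under one guessed layout: a 12-byte header followed by 19 little-endian unsigned shorts. The header size, field count, divisors, and output formats are all assumptions inferred from this single sample, not a confirmed description of the 3rd party protocol. The names HEADER_SIZE, LAYOUT, and decode_record are made up.

```python
import struct

# Sample record copied from the hex dump in this answer.
RAW = bytes.fromhex(
    "31 08 11 06 06 15 20 AA A8 00 00 00 28 01 57 00 CE 00 24 01 "
    "6A 00 64 00 15 00 17 00 78 00 A8 00 36 00 6F 00 82 00 00 00 "
    "22 00 00 00 00 00 27 00 00 00"
)

HEADER_SIZE = 12  # assumption: time, date and counter bytes come first

# Assumption: (divisor, output format) per field, guessed from the CSV sample.
LAYOUT = (
    [(10, "%06.1f"), (100, "%06.2f"), (100, "%06.2f")]
    + [(10, "%06.1f")] * 5
    + [(1, "%05d")] * 11
)


def decode_record(data):
    """Decode the value fields of one record into a CSV fragment."""
    # "<19H" means 19 little-endian unsigned shorts.
    raw_values = struct.unpack_from("<%dH" % len(LAYOUT), data, HEADER_SIZE)
    fields = []
    for value, (divisor, fmt) in zip(raw_values, LAYOUT):
        # Integer fields are formatted as-is; scaled fields are divided first.
        fields.append(fmt % (value if divisor == 1 else value / divisor))
    return ",".join(fields)


print(decode_record(RAW))
```

Under these assumptions the output matches the first 19 fields of the CSV record in the question. The remaining fields (0313.1, the time, the date, and 00000169) are not covered by this sketch.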

With more data examples the above theories could then be tested.

1
Eason Ren

To turn the binary into meaningful strings, we must know the binary protocol. We can't decode the binary out of thin air.

0
SShabtai

Take a look at my Python script which converts a binary file to a CSV or BSV file given a C header file and a C struct name defining that binary record. https://github.com/SShabtai/MsgGini. Although not complete, it might give you some hints...