Read binary file of logging data and output to new file with int (python)

I've been working on an embedded software project that writes sensor data to an SD card using the FATFS module. The data type is uint32_t (4 bytes), and the output is a binary file.

I'm trying to write a Python script that reads the binary file, parses the data as integers, and writes them to a new file. My current code:

def read():
    with open("INPUT1.TXT", "rb") as binary_file:
        # Read the whole file at once
        data = binary_file.read()
        print(data)

gives me a chunk of bytes shown with hex escapes:

    b'    \x01   \x02   \x03   \x04   \x05   \x06   \x07   \x08   \t   \n   \x0b   \
x0c   \r   \x0e   \x0f   \x10   \x11   \x12   \x13   \x14   \x15   \x16   \x17
 \x18   \x19   \x1a   \x1b   \x1c   \x1d   \x1e   \x1f       \x01   \x02   \x03
  \x04   \x05   \x06   \x07   \x08   \t   \n   \x0b   \x0c   \r   \x0e   \x0f
\x10   \x11   \x12   \x13   \x14   \x15   \x16   \x17   \x18   \x19   \x1a   \x1
b   \x1c   \x1d   \x1e   \x1f      '

When I print 4 bytes at a time, some values are even missing:

f = open("INPUT2.TXT", "rb")
try:
    bytes_read = f.read(4)
    while bytes_read:
        print(bytes_read)
        bytes_read = f.read(4)
finally:
    f.close()

gives this result:

b'    '       #supposed to be \x00
b'\x01   '
b'\x02   '
b'\x03   '
b'\x04   '
b'\x05   '
b'\x06   '
b'\x07   '
b'\x08   '
b'\t   '      #supposed to be \x09
b'\n   '      #supposed to be \x0a
b'\x0b   '
b'\x0c   '
b'\r   '      #supposed to be \x0d
b'\x0e   '
b'\x0f   '
b'\x10   '
b'\x11   '
b'\x12   '
b'\x13   '
b'\x14   '
b'\x15   '
b'\x16   '
b'\x17   '
b'\x18   '
b'\x19   '
b'\x1a   '
b'\x1b   '
b'\x1c   '
b'\x1d   '
b'\x1e   '
b'\x1f   '

But when I open the binary file in a hex editor, all the bytes appear to be correct?!

If I want to read 4 bytes at a time and write them to a new file as integers, how can I achieve that?

Thanks,

Henry

There are 3 answers

2
jacoblaw On BEST ANSWER
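nums = []
with open("INPUT2.TXT", "rb") as file:
    byte = file.read(4)  # read the first 4-byte value before entering the loop
    while byte:  # an empty bytes object means end of file
        nums.append(int.from_bytes(byte, byteorder="little"))
        byte = file.read(4)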

This should do it for Python 3.

It looks like the bytes in your example are flipped (least significant byte first), so I set the byte order to little. If they aren't flipped, change it back to big.
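To make the byte order point concrete, here is a small sketch that decodes the same four bytes both ways. The `raw` value is made up for illustration and stands in for one 4-byte record from the file.

```python
# Four bytes as they might appear in the file for the value 1 (illustrative data)
raw = bytes([0x01, 0x00, 0x00, 0x00])

little = int.from_bytes(raw, byteorder="little")  # least significant byte first
big = int.from_bytes(raw, byteorder="big")        # most significant byte first

print(little)  # 1
print(big)     # 16777216, i.e. 0x01000000
```

If the decoded numbers come out as huge multiples of 16777216 instead of a small counter, the byte order is probably the wrong way round.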

Another weird thing: it looks like 0x00 is getting turned into b" " instead of b"\x00". If that's the case, do this instead. Be aware that this also rewrites any genuine 0x20 byte, so a real value such as 32 would come out as 0:

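nums = []
with open("INPUT2.TXT", "rb") as file:
    byte = file.read(4)  # read the first 4-byte value before entering the loop
    while byte:  # an empty bytes object means end of file
        # Treat space padding (0x20) as zero bytes before decoding
        nums.append(int.from_bytes(byte.replace(b" ", b"\x00"), byteorder="little"))
        byte = file.read(4)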
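Before relying on the replacement, it may be worth checking whether the file really contains spaces (0x20) or NUL bytes (0x00), since the hex editor and the printed output seem to disagree. The helper below is a rough sketch; the name `summarize` and the sample bytes are hypothetical and only illustrate the idea.

```python
def summarize(data: bytes) -> dict:
    """Return counts that help tell NUL padding from space padding."""
    return {
        "length": len(data),
        "nul_bytes": data.count(b"\x00"),
        "space_bytes": data.count(b" "),
        "hex_preview": data[:16].hex(),  # first 16 bytes as hex digits
    }

# Sample bytes standing in for the start of a file (hypothetical data):
# two values padded with NULs, then one padded with spaces
sample = b"\x00\x00\x00\x00\x01\x00\x00\x00\x02   "
print(summarize(sample))
```

On the real file you would pass in the result of `open("INPUT2.TXT", "rb").read()`. A high `space_bytes` count with no NULs would suggest the padding really is written as spaces.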

Here's an example with what you provided.

>>> test = [b'    ',
b'\x01   ',
b'\x02   ',
b'\x03   ',
b'\x04   ',
b'\x05   ',
b'\x06   ',
b'\x07   ',
b'\x08   ',
b'\t   ',
b'\n   ',
b'\x0b   ',
b'\x0c   ',
b'\r   ',
b'\x0e   ',
b'\x0f   ',
b'\x10   ',
b'\x11   ',
b'\x12   ',
b'\x13   ',
b'\x14   ',
b'\x15   ',
b'\x16   ',
b'\x17   ',
b'\x18   ',
b'\x19   ',
b'\x1a   ',
b'\x1b   ',
b'\x1c   ',
b'\x1d   ',
b'\x1e   ',
b'\x1f   ']

>>> for t in test:
...     print(int.from_bytes(t.replace(b" ", b"\x00"), byteorder="little"))
...
0
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
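The question also asks about writing the integers to a new file, which the snippets above don't show. A minimal sketch of the whole round trip might look like this, assuming little-endian values and one decimal number per line in the output; the function name `convert` and the output format are my assumptions, not something from the question.

```python
def convert(src_path, dst_path, byteorder="little"):
    """Read 4-byte unsigned integers from src_path and write one per line to dst_path.

    Returns the number of values written.
    """
    count = 0
    with open(src_path, "rb") as src, open(dst_path, "w") as dst:
        while True:
            chunk = src.read(4)
            if len(chunk) < 4:
                # End of file, or a trailing partial record, which is skipped
                break
            dst.write(f"{int.from_bytes(chunk, byteorder=byteorder)}\n")
            count += 1
    return count
```

It could then be called as `convert("INPUT2.TXT", "OUTPUT.TXT")`, where OUTPUT.TXT is a placeholder name. If the space padding issue applies, the `chunk.replace(b" ", b"\x00")` step would go just before the conversion.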

1
Gribouillis On

You could perhaps do it with

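import struct

# data is the bytes object read from the file, as in the question
for i in range(0, len(data), 4):
    (d,) = struct.unpack('<I', data[i:i+4])  # '<I' = little-endian unsigned 32-bit
    print(d)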
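As a variation on the same idea, `struct.iter_unpack` can walk the whole buffer without manual slicing. This is only a sketch: the function name `unpack_uint32` is made up, little-endian order is assumed, and any trailing bytes that don't fill a complete 4-byte record are dropped, because `iter_unpack` requires the buffer length to be a multiple of the record size.

```python
import struct

def unpack_uint32(data: bytes) -> list:
    """Decode a buffer of little-endian uint32 values into a list of ints."""
    usable = len(data) - (len(data) % 4)  # ignore any trailing partial record
    return [value for (value,) in struct.iter_unpack("<I", data[:usable])]

print(unpack_uint32(b"\x01\x00\x00\x00\x02\x00\x00\x00"))  # [1, 2]
```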

1
shiv On

If it's just uint32_t numbers packed into a binary file, I think you can use the read() function on the file:

num_list = []
with open("INPUT1.TXT", "rb") as binary_file:
    byte_data = binary_file.read(4)  # 4 being the number of bytes to read at a time
    while byte_data:
        # int(byte_data) would try to parse the bytes as text, so decode them instead
        num_list.append(int.from_bytes(byte_data, byteorder="little"))
        byte_data = binary_file.read(4)
#  Do something with num_list