Attempting to read a binary file in Python. From the dataset page:
The pixels are stored as unsigned chars (1 byte) and take values from 0 to 255
I have tried the following, which prints (0,) rather than a 784,000-element array.
# -*- coding: utf8 -*-
# Processed MNIST dataset (http://cis.jhu.edu/~sachin/digit/digit.html)
import struct
f = open('data/data0', mode='rb')
data = []
print struct.unpack('<i', f.read(4))
How can I read this binary file into either a 784,000-element array (28 x 28 pixels x 1,000 samples, one byte per pixel) or a 28x28x1000 3D array? I have never worked with binary files before, and am quite confused!
f.read() will get you an immutable array of 784,000 bytes (called a str in Python 2). If you need it to be mutable, you can use the array module and its array type, which is capable of storing various primitives, unsigned bytes (represented by the B type code) included. The resulting array can be sliced as necessary.
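A minimal sketch of the approach described above, not the answerer's original code. It assumes the file has no header and holds one unsigned byte per pixel, as the dataset page states. The helper names read_pixels and get_image are hypothetical, introduced only for illustration.

```python
import os
from array import array

IMAGE_SIDE = 28                       # assumed: each image is 28 x 28 pixels
IMAGE_SIZE = IMAGE_SIDE * IMAGE_SIDE  # 784 bytes per image


def read_pixels(path):
    """Read every byte of the file into a mutable array of unsigned bytes."""
    pixels = array('B')             # 'B' is the type code for unsigned char
    count = os.path.getsize(path)   # assumed: one byte per pixel, no header
    with open(path, 'rb') as f:
        pixels.fromfile(f, count)
    return pixels


def get_image(pixels, index):
    """Return image number `index` as a list of 28 rows of 28 pixel values."""
    start = index * IMAGE_SIZE
    flat = pixels[start:start + IMAGE_SIZE]   # slicing an array gives an array
    return [flat[r * IMAGE_SIDE:(r + 1) * IMAGE_SIDE].tolist()
            for r in range(IMAGE_SIDE)]
```

With the file from the question, read_pixels('data/data0') should return an array of 784,000 values if the file really contains 1,000 images, and get_image(pixels, 0) would give the first image as nested lists. If NumPy is available, numpy.fromfile with dtype=numpy.uint8 followed by a reshape is a common alternative for getting a true 3D array.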