I'm trying to convert the MNIST data to PNG format, following the file format described at http://yann.lecun.com/exdb/mnist/
Below is the format of the TRAINING SET IMAGE FILE (train-images-idx3-ubyte):
[offset] [type]          [value]          [description]
0000     32 bit integer  0x00000803(2051) magic number
0004     32 bit integer  60000            number of images
0008     32 bit integer  28               number of rows
0012     32 bit integer  28               number of columns
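As a quick check of the layout above, the 16-byte header can be packed and parsed with struct. This is only a minimal sketch: parse_idx3_header is a hypothetical helper name, and the sample bytes are synthetic values taken from the table, not real MNIST data. The '>' prefix assumes big-endian (MSB first) integers, as the format page states.

```python
import struct

HEADER_FORMAT = '>IIII'  # four big-endian unsigned 32-bit integers

def parse_idx3_header(buf):
    """Return (magic, images, rows, columns) from the first 16 bytes of buf."""
    magic, images, rows, columns = struct.unpack_from(HEADER_FORMAT, buf, 0)
    if magic != 0x00000803:
        # A different magic number usually means this is not a raw idx3 file.
        raise ValueError('unexpected magic number: 0x%08x' % magic)
    return magic, images, rows, columns

# Synthetic header built from the values in the table (not real MNIST data).
sample = struct.pack(HEADER_FORMAT, 2051, 60000, 28, 28)
print(parse_idx3_header(sample))  # (2051, 60000, 28, 28)
```

Checking the magic number up front makes a wrong input file fail loudly instead of printing meaningless sizes.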
And this is my code. I use struct to unpack the data set and try to print the first four 32-bit integers in the file.
from PIL import Image
import struct

def read_image(filename):
    f = open(filename, 'rb')
    index = 0
    buf = f.read()
    magic, images, rows, columns = struct.unpack_from('>IIII', buf, index)
    index += struct.calcsize('>IIII')
    print(magic, images, rows, columns)
    f.close()
    # for i in range(images):
    # #for i in xrange(2000):
    #     image = Image.new('L', (columns, rows))
    #     for x in range(rows):
    #         for y in range(columns):
    #             image.putpixel((y, x), int(struct.unpack_from('>B', buf, index)[0]))
    #             index += struct.calcsize('>B')
    #     print('save ' + str(i) + 'image')
    #     image.save('test/' + str(i) + '.png')

if __name__ == '__main__':
    read_image('train-images-idx3-ubyte.gz')
But the output is totally wrong:
529205256 2055376946 226418 1634299437
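The numbers themselves hint at the cause. As a rough sketch, assuming the four values printed above are the first 16 bytes of the file read as big-endian integers, packing them back into bytes shows the gzip signature (1f 8b) followed later by the start of a stored file name:

```python
import struct

# The four integers printed by the script above.
wrong_values = (529205256, 2055376946, 226418, 1634299437)
raw = struct.pack('>IIII', *wrong_values)

print(raw[:4].hex())            # 1f8b0808
print(raw[:2] == b'\x1f\x8b')   # True: gzip signature bytes
print(raw[10:])                 # b'train-': start of the original file name
```

So the script was parsing the header of a gzip archive, not the idx3 header.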
I realized that I had forgotten to extract "train-images-idx3-ubyte.gz". After extraction, I got a file named "train-images.idx3-ubyte"; once I replaced "train-images-idx3-ubyte.gz" with this new file name, it finally worked.
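An alternative to extracting the archive by hand is to let Python's gzip module decompress it while reading. The following is a minimal sketch, not a tested replacement for the script above: open_maybe_gzip and read_header are hypothetical helper names, and the demo runs on a synthetic compressed header rather than the real MNIST file.

```python
import gzip
import os
import struct
import tempfile

def open_maybe_gzip(filename):
    """Open filename for binary reading, decompressing it if it starts with the gzip signature."""
    with open(filename, 'rb') as f:
        is_gzip = f.read(2) == b'\x1f\x8b'
    return gzip.open(filename, 'rb') if is_gzip else open(filename, 'rb')

def read_header(filename):
    """Return (magic, images, rows, columns) from an idx3 file, compressed or not."""
    with open_maybe_gzip(filename) as f:
        return struct.unpack('>IIII', f.read(16))

# Demo on a synthetic gzip-compressed header (not the real MNIST file).
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'fake-images-idx3-ubyte.gz')
    with gzip.open(path, 'wb') as f:
        f.write(struct.pack('>IIII', 2051, 60000, 28, 28))
    print(read_header(path))  # (2051, 60000, 28, 28)
```

With the real data, read_header('train-images-idx3-ubyte.gz') would be expected to return the values from the format table, assuming the download is intact.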