Why can't I use Image.open on PIL files?

I have written some simple code to download an audio dataset from ccmusic on Hugging Face. The problem is that I can't open the PIL images from this dataset with Image.open(). Can somebody explain why that is, and how to fix it?

If I run my code:

import datasets
import PIL
from PIL import Image
from datasets import load_dataset

dataset = load_dataset("ccmusic-database/music_genre", split="test")
output = dataset
im = Image.open(output[0])
im.show()

I get the following error:

Traceback (most recent call last):
  File "/Users/abc/Desktop/Project Python Audio/.venv/lib/python3.11/site-packages/PIL/Image.py", line 3247, in open
    fp.seek(0)
    ^^^^^^^
AttributeError: 'dict' object has no attribute 'seek'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/abc/Desktop/Project Python Audio/loading_dataset.py", line 8, in <module>
    im = Image.open(output[0])
         ^^^^^^^^^^^^^^^^^^^^^
  File "/Users/abc/Desktop/Project Python Audio/.venv/lib/python3.11/site-packages/PIL/Image.py", line 3249, in open
    fp = io.BytesIO(fp.read())
                    ^^^^^^^
AttributeError: 'dict' object has no attribute 'read'

However, if I print the file with:

import datasets
import PIL
from PIL import Image
from datasets import load_dataset

dataset = load_dataset("ccmusic-database/music_genre", split="test")
output = dataset
print(output[0])

I get:

{'image': <PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=476x349 at 0x101BE6E10>, 'fst_level_label': 1, 'sec_level_label': 6, 'thr_level_label': 6, 'duration': 416.0533106575964}

So it seems the PIL JPEG image is sitting right there in output[0], but Image.open is unable to open it. What is going on here, and how can I view the image?

There are 2 answers

Mark Setchell On BEST ANSWER

If you look at the type of dataset[0] you will see it is a dict:

print(type(dataset[0]))
<class 'dict'>

If you print it:

print(dataset[0])
{'image': <PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=476x349 at 0x101BE6E10>,
'fst_level_label': 1,
'sec_level_label': 6,
'thr_level_label': 6,
'duration': 416.0533106575964}

you can see it is a dict because it is in curly braces. The "keys" are given in the left column and the corresponding "values" are on the right.

That means if you access dataset[0]["image"] the object you are looking at is already a PIL Image so you can copy it and use it exactly the same as if you'd created it with Image.open().
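As a minimal sketch of the point above, the snippet below builds a stand-in row in memory (the values are hypothetical, chosen to mirror the printed output, so nothing has to be downloaded) and shows why Image.open() rejects the dict while the value under "image" can be used directly. It assumes Pillow is installed; real rows would come from load_dataset().

```python
import io

from PIL import Image

# Stand-in for one dataset row (hypothetical values, built in memory);
# real rows come from load_dataset("ccmusic-database/music_genre", ...).
row = {"image": Image.new("RGB", (476, 349)), "fst_level_label": 1}

# Image.open() expects a path or a file-like object with seek()/read().
# A dict has neither, which is what the traceback complains about.
print(hasattr(row, "seek"), hasattr(row, "read"))

# The value stored under "image" is already a decoded PIL image.
im = row["image"]
print(isinstance(im, Image.Image), im.size)

# Image.open() is only needed for encoded data, e.g. PNG bytes in a buffer.
buf = io.BytesIO()
im.save(buf, format="PNG")
buf.seek(0)
reopened = Image.open(buf)
print(reopened.format, reopened.size)
```

In other words, Image.open() decodes a file or byte stream into an image; the dataset has already done that step for you, so indexing the row by its "image" key is all that is needed.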

Laulito On

Got the code working (see below) thanks to @slothrop

from datasets import load_dataset

# Each row of the dataset is a dict; the value under "image" is already a PIL image,
# so no call to Image.open() is needed.
dataset = load_dataset("ccmusic-database/music_genre", split="test")
im = dataset[0]['image']
im.show()