I am trying to read DICOM (.dcm) files which are located in the folder:
/rsna-pneumonia-detection-challenge/stage_2_train_images/
The files are readable and can be displayed using the pydicom library. However, when I use Keras ImageDataGenerator to read the files for a training pipeline, it raises the following error:
"UnidentifiedImageError: cannot identify image file <_io.BytesIO object at 0x7f90a47e3b80>"
As I understand from the detailed error, Python's PIL (Pillow) cannot identify the ".dcm" file format.
I have two questions:
1. Is it possible to use Keras ImageDataGenerator to read DICOM files for a training pipeline without first converting them to another format such as ".png"? I read the blog post at "https://medium.com/@rragundez/medical-images-now-supported-by-keras-imagedatagenerator-e67d1c2a1103", but it surprisingly uses ".PNG" files as its example.
2. Is it possible to write a custom generator that first extracts the pixel data from the DICOM files and then feeds the batches to the training pipeline?
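To make the second question concrete, here is a minimal sketch of the kind of generator I have in mind. It is not tested against the real dataset. The names read_dicom_pixels, resize_nearest and dicom_batch_generator are hypothetical helpers of my own, not Keras or pydicom APIs. The sketch assumes single-frame grayscale images with 8-bit pixel values (hence the division by 255), and it uses a crude nearest-neighbour resize in NumPy instead of a proper image library. The only pydicom call used is pydicom.dcmread(path).pixel_array.

```python
import numpy as np


def read_dicom_pixels(path):
    """Return the pixel data of one DICOM file as a 2-D float32 array."""
    import pydicom  # imported lazily so the rest can run without pydicom installed
    return pydicom.dcmread(path).pixel_array.astype(np.float32)


def resize_nearest(img, target_size):
    """Nearest-neighbour resize of a 2-D array using plain NumPy indexing."""
    rows = np.linspace(0, img.shape[0] - 1, target_size[0]).astype(int)
    cols = np.linspace(0, img.shape[1] - 1, target_size[1]).astype(int)
    return img[np.ix_(rows, cols)]


def dicom_batch_generator(paths, labels, batch_size=32, target_size=(32, 32),
                          reader=read_dicom_pixels, shuffle=True, seed=42):
    """Yield (images, labels) batches indefinitely.

    images has shape (batch, height, width, 1); labels has shape (batch,).
    """
    paths = np.asarray(paths)
    labels = np.asarray(labels, dtype=np.float32)
    rng = np.random.default_rng(seed)
    while True:
        # Reshuffle once per pass over the data.
        order = rng.permutation(len(paths)) if shuffle else np.arange(len(paths))
        for start in range(0, len(paths), batch_size):
            idx = order[start:start + batch_size]
            images = np.stack(
                [resize_nearest(reader(p), target_size) for p in paths[idx]]
            )
            # Assumption: 8-bit pixel data, so dividing by 255 maps to [0, 1].
            yield images[..., np.newaxis] / 255.0, labels[idx]
```

With my data I would presumably build full file paths by joining the image directory with the "path" column, pass the "Target" column as labels, and hand the generator to model.fit together with steps_per_epoch, since the generator never ends on its own. I do not know whether this is the recommended approach, which is part of what I am asking.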
Any help would be highly appreciated.
My code and the error message are given below.
import pandas as pd
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.layers import Dense, Activation, Flatten, Dropout, BatchNormalization
from tensorflow.keras.layers import Conv2D, MaxPooling2D
from tensorflow.keras import regularizers, optimizers
traindf=pd.read_csv('/rsna-pneumonia-detection-challenge/stage_2_train_labels.csv',dtype=str)
classdf=pd.read_csv('/rsna-pneumonia-detection-challenge/stage_2_detailed_class_info.csv',dtype=str)
tr100df = traindf[0:100].copy() # take first 100 samples; copy to avoid SettingWithCopyWarning
tr100df['path'] = tr100df['patientId'] + '.dcm' # file name of each DICOM image
datagen=ImageDataGenerator(rescale=1./255.,validation_split=0.25)
train_generator=datagen.flow_from_dataframe(
    dataframe=tr100df,
    directory="/rsna-pneumonia-detection-challenge/stage_2_train_images",
    x_col="path",
    y_col="Target",
    subset="training",
    batch_size=32,
    seed=42,
    shuffle=True,
    class_mode="binary",
    target_size=(32,32),
    color_mode='grayscale', # the keyword is color_mode, not mode
    validate_filenames=False)
for image_batch, labels_batch in train_generator:
    print(image_batch.shape)
    print(labels_batch.shape)
    image_np = image_batch.numpy()
    label_np = labels_batch.numpy()
    break
Error:
UnidentifiedImageError Traceback (most recent call last)
<ipython-input-66-9af954b10f7c> in <module>
----> 1 for image_batch, labels_batch in train_generator:
2 print(image_batch.shape)
3 print(labels_batch.shape)
4 image_np = image_batch.numpy()
5 label_np = labels_batch.numpy()
~/opt/anaconda3/lib/python3.8/site-packages/keras_preprocessing/image/iterator.py in __next__(self, *args, **kwargs)
102
103 def __next__(self, *args, **kwargs):
--> 104 return self.next(*args, **kwargs)
105
106 def next(self):
~/opt/anaconda3/lib/python3.8/site-packages/keras_preprocessing/image/iterator.py in next(self)
114 # The transformation of images is not under thread lock
115 # so it can be done in parallel
--> 116 return self._get_batches_of_transformed_samples(index_array)
117
118 def _get_batches_of_transformed_samples(self, index_array):
~/opt/anaconda3/lib/python3.8/site-packages/keras_preprocessing/image/iterator.py in _get_batches_of_transformed_samples(self, index_array)
225 filepaths = self.filepaths
226 for i, j in enumerate(index_array):
--> 227 img = load_img(filepaths[j],
228 color_mode=self.color_mode,
229 target_size=self.target_size,
~/opt/anaconda3/lib/python3.8/site-packages/keras_preprocessing/image/utils.py in load_img(path, grayscale, color_mode, target_size, interpolation)
112 'The use of `load_img` requires PIL.')
113 with open(path, 'rb') as f:
--> 114 img = pil_image.open(io.BytesIO(f.read()))
115 if color_mode == 'grayscale':
116 # if image is not already an 8-bit, 16-bit or 32-bit grayscale image
~/opt/anaconda3/lib/python3.8/site-packages/PIL/Image.py in open(fp, mode)
2928 for message in accept_warnings:
2929 warnings.warn(message)
-> 2930 raise UnidentifiedImageError(
2931 "cannot identify image file %r" % (filename if filename else fp)
2932 )
UnidentifiedImageError: cannot identify image file <_io.BytesIO object at 0x7f90a47e3b80>