DICOM files not recognized by Keras ImageDataGenerator in TensorFlow 2.3.1


I am trying to read DICOM (.dcm) files which are located in the folder:

/rsna-pneumonia-detection-challenge/stage_2_train_images/

The files are readable and can be displayed with the pydicom library. However, when I use Keras ImageDataGenerator to read them for a training pipeline, I get the following error:

"UnidentifiedImageError: cannot identify image file <_io.BytesIO object at 0x7f90a47e3b80>"

As I understand from the detailed error, Python's PIL (Pillow) cannot identify the ".dcm" file format.
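
To rule out corrupt files, a check along the following lines can confirm that a file carries the DICOM Part 10 marker (a 128-byte preamble followed by the bytes "DICM"), a format Pillow does not decode. This is only a minimal sketch: `looks_like_dicom` is a hypothetical helper name, and DICOM data saved without the Part 10 preamble would fail this check even though it is valid.

```python
def looks_like_dicom(path):
    """Return True if the file carries the DICOM Part 10 marker.

    Part 10 files begin with a 128-byte preamble followed by the
    four bytes b"DICM". Files written without that preamble are
    not detected by this check.
    """
    with open(path, "rb") as f:
        header = f.read(132)
    return len(header) == 132 and header[128:132] == b"DICM"
```

Called on one of the training files, for example `looks_like_dicom("/rsna-pneumonia-detection-challenge/stage_2_train_images/" + tr100df.path.iloc[0])`, it should return True if the file is an intact Part 10 DICOM file.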

I have two questions:

  1. Is it possible to use Keras ImageDataGenerator to read DICOM files for a training pipeline without first converting them to another format such as ".png"? I read the blog post at "https://medium.com/@rragundez/medical-images-now-supported-by-keras-imagedatagenerator-e67d1c2a1103" but, surprisingly, it uses ".PNG" files as its example.

  2. Is it possible to write a custom generator that first extracts the pixel data from the DICOM files and then feeds batches to the training pipeline?
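
To make the second question concrete, this is roughly the kind of generator I have in mind. It is a sketch under assumptions, not a known-good solution: `read_dicom_pixels` and `dicom_batches` are hypothetical names, it relies on pydicom's `dcmread` and the dataset's `pixel_array` attribute, it assumes every file has the same height and width, and scaling each image by its own maximum is just one possible normalization. It does not resize images, so it is not a drop-in replacement for `target_size`.

```python
import numpy as np


def read_dicom_pixels(path):
    """Read one DICOM file and return its pixels as float32 scaled to [0, 1]."""
    import pydicom  # imported lazily so the batching logic only needs NumPy

    pixels = pydicom.dcmread(path).pixel_array.astype(np.float32)
    peak = pixels.max()
    # Assumption: per-image max scaling; an all-zero image is returned as is.
    return pixels / peak if peak > 0 else pixels


def dicom_batches(paths, labels, batch_size=32, reader=read_dicom_pixels,
                  shuffle=True, seed=42):
    """Yield (images, labels) batches forever, reshuffling on every pass.

    images has shape (batch, height, width, 1). All files are assumed
    to share the same height and width.
    """
    paths = np.asarray(paths)
    labels = np.asarray(labels, dtype=np.float32)
    rng = np.random.default_rng(seed)
    while True:
        if shuffle:
            order = rng.permutation(len(paths))
        else:
            order = np.arange(len(paths))
        for start in range(0, len(paths), batch_size):
            idx = order[start:start + batch_size]
            images = np.stack([reader(str(p)) for p in paths[idx]])
            # Add a trailing channel axis for grayscale input to Conv2D.
            yield images[..., np.newaxis], labels[idx]
```

Because the generator never ends, a call such as `model.fit(dicom_batches(paths, labels), steps_per_epoch=...)` would need `steps_per_epoch` set explicitly, for example to the number of files divided by the batch size, rounded up.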

Any help would be highly appreciated.

My code and the error message are given below.

import pandas as pd
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.layers import Dense, Activation, Flatten, Dropout, BatchNormalization
from tensorflow.keras.layers import Conv2D, MaxPooling2D
from tensorflow.keras import regularizers, optimizers

traindf=pd.read_csv('/rsna-pneumonia-detection-challenge/stage_2_train_labels.csv',dtype=str)
classdf=pd.read_csv('/rsna-pneumonia-detection-challenge/stage_2_detailed_class_info.csv',dtype=str)

tr100df = traindf[0:100].copy() # take first 100 samples; copy so the new column is not set on a slice
tr100df.loc[:,'path'] = tr100df.patientId + '.dcm'

datagen=ImageDataGenerator(rescale=1./255.,validation_split=0.25)

train_generator=datagen.flow_from_dataframe(
    dataframe=tr100df,
    directory="/rsna-pneumonia-detection-challenge/stage_2_train_images",
    x_col="path",
    y_col="Target",
    subset="training",
    batch_size=32,
    seed=42,
    shuffle=True,
    class_mode="binary",
    target_size=(32,32),
    color_mode='grayscale',  # the parameter is color_mode, not mode
    validate_filenames=False)

for image_batch, labels_batch in train_generator:
  print(image_batch.shape)
  print(labels_batch.shape)
  image_np = image_batch.numpy()
  label_np = labels_batch.numpy()
  break

 

Error:

UnidentifiedImageError                    Traceback (most recent call last)
<ipython-input-66-9af954b10f7c> in <module>
----> 1 for image_batch, labels_batch in train_generator:
      2   print(image_batch.shape)
      3   print(labels_batch.shape)
      4   image_np = image_batch.numpy()
      5   label_np = labels_batch.numpy()

~/opt/anaconda3/lib/python3.8/site-packages/keras_preprocessing/image/iterator.py in __next__(self, *args, **kwargs)
    102 
    103     def __next__(self, *args, **kwargs):
--> 104         return self.next(*args, **kwargs)
    105 
    106     def next(self):

~/opt/anaconda3/lib/python3.8/site-packages/keras_preprocessing/image/iterator.py in next(self)
    114         # The transformation of images is not under thread lock
    115         # so it can be done in parallel
--> 116         return self._get_batches_of_transformed_samples(index_array)
    117 
    118     def _get_batches_of_transformed_samples(self, index_array):

~/opt/anaconda3/lib/python3.8/site-packages/keras_preprocessing/image/iterator.py in _get_batches_of_transformed_samples(self, index_array)
    225         filepaths = self.filepaths
    226         for i, j in enumerate(index_array):
--> 227             img = load_img(filepaths[j],
    228                            color_mode=self.color_mode,
    229                            target_size=self.target_size,

~/opt/anaconda3/lib/python3.8/site-packages/keras_preprocessing/image/utils.py in load_img(path, grayscale, color_mode, target_size, interpolation)
    112                           'The use of `load_img` requires PIL.')
    113     with open(path, 'rb') as f:
--> 114         img = pil_image.open(io.BytesIO(f.read()))
    115         if color_mode == 'grayscale':
    116             # if image is not already an 8-bit, 16-bit or 32-bit grayscale image

~/opt/anaconda3/lib/python3.8/site-packages/PIL/Image.py in open(fp, mode)
   2928     for message in accept_warnings:
   2929         warnings.warn(message)
-> 2930     raise UnidentifiedImageError(
   2931         "cannot identify image file %r" % (filename if filename else fp)
   2932     )

UnidentifiedImageError: cannot identify image file <_io.BytesIO object at 0x7f90a47e3b80>