I am trying to load raw (ARW) images into a TF dataset for training a model. I start with two lists:
im1: the file paths of the images that will be input to the model.
im2: the file paths of the expected output images.
I create the dataset as follows:
ds_train = tf.data.Dataset.from_tensor_slices((im1, im2))
Now, this dataset would contain all the paths. To load the raw images from the files, I am using a mapping function as follows:
def read_image(im1, im2):
    im1 = rawpy.imread(im1).raw_image_visible.astype(np.float32)
    im2 = rawpy.imread(im2).raw_image_visible.astype(np.float32)
    return im1, im2
ds_train = ds_train.map(read_image)
This is giving me an error that seems to be associated with the rawpy module:
AttributeError Traceback (most recent call last)
...
AttributeError: in user code:
File "/tmp/ipykernel_24/2296991765.py", line 6, in read_image *
short = rawpy.imread(short).raw_image_visible.astype(np.float32)
File "/opt/conda/lib/python3.7/site-packages/rawpy/__init__.py", line 20, in imread *
d.open_file(pathOrFile)
File "rawpy/_rawpy.pyx", line 408, in rawpy._rawpy.RawPy.open_file **
AttributeError: 'Tensor' object has no attribute 'encode'
When I try to extract the string value of the path from im1 and im2 using the .numpy() method, I get a new error that seems to suggest that the .numpy() method doesn't exist:
AttributeError Traceback (most recent call last)
...
AttributeError: in user code:
File "/tmp/ipykernel_24/2505255456.py", line 6, in read_image *
short = rawpy.imread(short.numpy()).raw_image_visible.astype(np.float32)
AttributeError: 'Tensor' object has no attribute 'numpy'
The modification I made to my code was:
def read_image(im1, im2):
    im1 = rawpy.imread(im1.numpy()).raw_image_visible.astype(np.float32)  # numpy method added
    im2 = rawpy.imread(im2.numpy()).raw_image_visible.astype(np.float32)  # numpy method added
    return im1, im2
ds_train = ds_train.map(read_image)
The full code may be seen in this notebook: https://www.kaggle.com/code/rohan843/learning-to-see-in-the-dark-tf2/notebook
Note: In the notebook above, I have used short and long instead of im1 and im2. The error-causing part is currently commented out.
You can't use arbitrary Python functions or modules inside a tf.data map function, because it runs in Graph mode. For example, the error 'Tensor' object has no attribute 'numpy' is related to running in Graph mode. You can use tf.py_function instead, but this can cause slowdowns. When you use tf.py_function, you can apply operations in a tf.data pipeline as if it were working in Eager mode.
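As a rough sketch of how the tf.py_function approach could look for this case (not necessarily the only or best way): the helper names load_raw and make_dataset and the loader parameter are made up for illustration, and the rank-2 shape hint assumes raw_image_visible returns a 2-D Bayer array, which may not hold for every camera.

```python
import numpy as np
import tensorflow as tf


def load_raw(path):
    """Hypothetical helper: read one raw file with rawpy as float32."""
    import rawpy  # imported lazily so the rest of the sketch runs without it
    with rawpy.imread(path) as raw:
        # astype() copies the data, so it stays valid after the file is closed
        return raw.raw_image_visible.astype(np.float32)


def make_dataset(im1, im2, loader=load_raw):
    """Hypothetical helper: build a dataset of (input, target) image pairs."""

    def _load_pair(p1, p2):
        # This body runs eagerly inside tf.py_function, so .numpy() exists.
        # String tensors come back as bytes and must be decoded to str.
        a = loader(p1.numpy().decode("utf-8"))
        b = loader(p2.numpy().decode("utf-8"))
        return a, b

    def read_image(p1, p2):
        a, b = tf.py_function(
            _load_pair, inp=[p1, p2], Tout=[tf.float32, tf.float32]
        )
        # py_function drops static shape information; rank 2 is an assumption
        a.set_shape([None, None])
        b.set_shape([None, None])
        return a, b

    ds = tf.data.Dataset.from_tensor_slices((im1, im2))
    return ds.map(read_image)
```

With the lists from the question, this would be used as ds_train = make_dataset(im1, im2). The Python code inside tf.py_function still runs under the interpreter lock, which is where the slowdown mentioned above comes from.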