I would like to set up a CNN with Keras and define a model for an image segmentation task on 2D grid maps.
My training data is represented as follows: the grid pattern as a 2D ndarray of shape 40 x 17 with values in {0, 1, 2}, and a corresponding grid mask as an ndarray of the same shape and value range representing the ground-truth mask. My main question is what shape the data should have so that I can use it with, e.g., a pretrained MobileNetV2.
I have read and tried out a couple of image segmentation examples (e.g. https://www.tensorflow.org/tutorials/images/segmentation) with TensorFlow and Keras, but all of these examples use RGB images as training data instead of grid maps. I know that the data has to be one-hot encoded, so I tried this:
train_images: (1100, 40, 17, 2)
train_masks: (1100, 40, 17, 3)
1100: number of training samples
40: grid size x-dimension
17: grid size y-dimension
2: number of classes of input images
3: number of classes of input masks
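To make the shapes above concrete, here is a minimal sketch of the one-hot encoding I mean, written in plain NumPy (equivalent to `tf.keras.utils.to_categorical`). The data here is random placeholder data, not my real grids, and I encode both grids and masks with 3 classes since the value range is {0, 1, 2}:

```python
import numpy as np

# Placeholder data: 1100 grid maps of shape 40 x 17, values in {0, 1, 2}.
rng = np.random.default_rng(0)
grids = rng.integers(0, 3, size=(1100, 40, 17))
masks = rng.integers(0, 3, size=(1100, 40, 17))

# One-hot encode: compare each cell against each class id,
# producing a trailing channel axis of length num_classes.
num_classes = 3
train_images = (grids[..., None] == np.arange(num_classes)).astype("float32")
train_masks = (masks[..., None] == np.arange(num_classes)).astype("float32")

print(train_images.shape)  # (1100, 40, 17, 3)
print(train_masks.shape)   # (1100, 40, 17, 3)
```

Each cell then has exactly one channel set to 1, so summing over the channel axis gives 1 everywhere.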
Can someone explain to me what the input data should look like? Can I stick with ndarrays, or should I use TensorFlow datasets (`tf.data.Dataset`)?
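For reference, this is the kind of `tf.data.Dataset` wrapping I have in mind as the alternative to passing ndarrays directly to `model.fit()`. The arrays are zero-filled stand-ins with the shapes from above; batch size 32 is just an example value:

```python
import numpy as np
import tensorflow as tf

# Stand-ins for the one-hot encoded arrays described above.
train_images = np.zeros((1100, 40, 17, 3), dtype="float32")
train_masks = np.zeros((1100, 40, 17, 3), dtype="float32")

# Plain ndarrays work directly with model.fit(); wrapping them in a
# tf.data.Dataset mainly adds shuffling/batching/prefetching conveniences.
ds = (tf.data.Dataset.from_tensor_slices((train_images, train_masks))
      .shuffle(1100)
      .batch(32)
      .prefetch(tf.data.AUTOTUNE))

images_batch, masks_batch = next(iter(ds))
print(images_batch.shape)  # (32, 40, 17, 3)
print(masks_batch.shape)   # (32, 40, 17, 3)
```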