Issue with presenting understandable predictions on a Python Keras CNN model


Apologies in advance if this is in the wrong place or the formatting is incorrect.

I'm having trouble finding an answer to this issue, possibly because I worded my searches poorly. I have a model built and working correctly, achieving 91.5% accuracy across 6 classes. To summarize my issue:

The goal is to classify waste images: the model has to predict what kind of waste it sees. The 6 classes are clear and coloured plastic bottles, clear and coloured plastic bags, cans, and glass bottles. I expect to get the model's confidence for each of the 6 classes, e.g. 67% sure it's a coloured bottle, 21% sure it's a can, and so on.

What I actually get is an array of 6 floating-point numbers in exponential notation, which is not ideal and doesn't indicate which class each number belongs to! I'm not getting any errors. Is there an issue with the way I've written the classification code that prevents more readable results, or am I missing something?

I'm using Google Colab as my IDE and my model is DenseNet-201.

Thanks in advance, Jack

Here is the code I'm using to classify my real-world collected data with my trained model. Below it is the code that assigns the labels to the array of wastes. My problem is that I cannot trust these labels to be in the same order as the floating-point numbers I'm receiving! Note also that the images are read in a loop from a folder on Google Drive. I have tried individual images but get the same results.

Code for classifying test images

# Morning Test
import cv2
import numpy as np
from keras.preprocessing import image

width = 100
height = 100

new_dimensions = (width, height)

counter = 0


print("Morning Test - Experiment 1 - Clear Bottle \n")

# Cycle through images (`model` is the trained model loaded earlier)
for x in range(0, 10):
  exp1_morning_waste = cv2.imread('/content/gdrive/My Drive/Rivers V2/Test Set/New Images/Exp 1/Morn/' + 'MorningBottleClExp1_' + str(x+1) + '.jpg')

  # Check for existence (cv2.imread returns None if the file cannot be read)
  if exp1_morning_waste is not None:

    # Count the classifications, add one
    counter += 1

    # Resize to the input size used in training
    exp1_morning_waste = cv2.resize(exp1_morning_waste, new_dimensions)

    # Convert image to array
    exp1_morning_waste = image.img_to_array(exp1_morning_waste)

    # Add a batch axis and scale pixel values to [0, 1]
    exp1_morning_waste = np.expand_dims(exp1_morning_waste, axis=0)
    exp1_morning_waste = exp1_morning_waste / 255

    # Predict image
    prediction_prob = model.predict(exp1_morning_waste)

    # Print predictions
    print(f'Probability that image is a: {prediction_prob} ')

    # Image number
    print("Waste Item No." + str(x+1) + "\n")

  # No directory or image present
  else:
    print("File not Contacted")
    break

Output>> Morning Test - Experiment 1 - Clear Bottle

Probability that image is a: [[9.9152815e-01 1.2046337e-03 1.4043533e-03 5.7380428e-03 6.7023984e-06 1.1799879e-04]] Waste Item No.1

and so on.....

Original Dataset labelling for training the model

import os
import random

import cv2
from keras.preprocessing.image import img_to_array

# Create dataset and label arrays
wastedata = []
labels = []

# Set random number generator seed
random.seed(42)

# Access waste images directory (one sub-folder per class)
dataset_dir = "/content/gdrive/My Drive/Rivers V2/Datasets/Waste Dataset - Pre-processing (image resizing 100x100 (Aspect Ratio + Augmentation)(V2))/"
wasteDirectory = sorted(os.listdir(dataset_dir))

# Shuffle the directory
random.shuffle(wasteDirectory)

# Print directory class names
print(wasteDirectory)

# Resize images in each class folder in case they haven't been already
for class_name in wasteDirectory:
    pathDir = sorted(os.listdir(os.path.join(dataset_dir, class_name)))
    for file_name in pathDir:
        imagewaste = cv2.imread(os.path.join(dataset_dir, class_name, file_name))
        imagewaste = cv2.resize(imagewaste, (100, 100))
        imagewaste = img_to_array(imagewaste)

        # Append image to data array
        wastedata.append(imagewaste)

        # Append the class folder name to labels array
        labels.append(class_name)

Output>> ['Clear Plastic Bottle', 'Clear Glass Bottle', 'Clear Plastic Bags', 'Coloured Plastic Bags', 'Cans', 'Coloured Plastic Bottle']
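The code above does not show how these string labels are turned into the numeric targets the model was trained on. That encoding step, rather than the shuffled directory list printed above, decides which output column belongs to which class. One way to make the order explicit is sketched below. It assumes the labels are encoded by hand with NumPy (`np.unique` returns the unique values in sorted order), and the short `labels` list is made-up illustration data. If an encoder such as scikit-learn's `LabelBinarizer` was used instead, its `classes_` attribute holds the equivalent ordering.

```python
import numpy as np

# Illustration data only: stands in for the `labels` list built above.
labels = ['Cans', 'Clear Plastic Bottle', 'Cans', 'Clear Glass Bottle']

# np.unique returns the class names sorted, plus each label's index into them.
class_names, label_indices = np.unique(labels, return_inverse=True)

# One-hot targets: column j corresponds to class_names[j].
one_hot = np.eye(len(class_names))[label_indices]

print(class_names.tolist())
print(one_hot)
```

Saving `class_names` alongside the model and reusing it at prediction time would keep the labels and the output columns aligned.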

There is 1 answer

Michael Hodel (best answer)

The model's predictions, i.e. these floating-point numbers, are the probabilities for the respective classes (e.g. a value of 6.734e-1 = 6.734 * 10 ** (-1) indicates a probability of 67.34%). Your prediction is then the element of your array of classes at the index of the maximum value in your array of probabilities. In other words, you predict whichever class your model assigns the highest probability. Example:

classes = ['Clear Plastic Bottle', 'Clear Glass Bottle', 'Clear Plastic Bags', 'Coloured Plastic Bags', 'Cans', 'Coloured Plastic Bottle']
probs = [9.9152815e-01, 1.2046337e-03, 1.4043533e-03, 5.7380428e-03, 6.7023984e-06, 1.1799879e-04]
max_prob = max(probs)
pred = classes[probs.index(max_prob)]
print(f'Model predicts a {max_prob*100:.2f}% probability of the item on the image being "{pred}".')

outputs

Model predicts a 99.15% probability of the item on the image being "Clear Plastic Bottle".
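Since the question asks for a per-class breakdown rather than only the top class, the same idea can be extended to list every class with its percentage. This is a minimal sketch, assuming `model.predict` returns a NumPy array of shape (1, 6) as in the question's output, and assuming `classes` is listed in the same order as the model's output columns (something the question itself is unsure of). `format_prediction` is a hypothetical helper name, and the hard-coded array stands in for a real `model.predict` call.

```python
import numpy as np

# Assumed to be in the same order as the model's output columns.
classes = ['Clear Plastic Bottle', 'Clear Glass Bottle', 'Clear Plastic Bags',
           'Coloured Plastic Bags', 'Cans', 'Coloured Plastic Bottle']

def format_prediction(prediction_prob, classes):
    """Return (label, percent) pairs sorted from most to least likely."""
    probs = np.asarray(prediction_prob, dtype=float)[0]  # drop the batch axis
    order = np.argsort(probs)[::-1]                      # indices, highest first
    return [(classes[i], float(probs[i]) * 100) for i in order]

# Stand-in for the output of model.predict(...) on one image.
prediction_prob = np.array([[9.9152815e-01, 1.2046337e-03, 1.4043533e-03,
                             5.7380428e-03, 6.7023984e-06, 1.1799879e-04]])

ranked = format_prediction(prediction_prob, classes)
for label, percent in ranked:
    print(f'{percent:6.2f}% {label}')
```

Inside the prediction loop, the same helper could be called on each `prediction_prob` in place of printing the raw array.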