I am currently implementing a facial recognition model using ResNet, applying the embedding and triplet loss concepts from the FaceNet paper. However, I am seeing fluctuating accuracy with a relatively constant loss (see image in link below). Specifically, I set the margin (α) to 0.2, and the loss sitting at roughly 0.2 suggests that my model has collapsed all embeddings to nearly the same point: if d_p ≈ d_n, then d_p - d_n + α ≈ α = 0.2. I am unsure why my model refuses to push the embedding of the negative image away from the anchor image. Instead, it places all embeddings very close to one another, although embeddings of images of the same person do end up slightly closer together than embeddings of images of different people. This has practical implications: the model would likely be unable to detect whether a person is an outsider, since all embeddings are so close to one another.
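To make the collapse concrete, here is a minimal NumPy sketch of the FaceNet triplet loss (squared L2 distances, margin α = 0.2). Note this is an illustrative reimplementation, not my training code: when all embeddings sit near the same point, d_p ≈ d_n and the loss gets stuck at the margin.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    # Squared L2 distances, as in the FaceNet formulation
    d_p = np.sum((anchor - positive) ** 2, axis=-1)
    d_n = np.sum((anchor - negative) ** 2, axis=-1)
    # Hinge: only triplets violating the margin contribute
    return np.maximum(d_p - d_n + margin, 0.0)

# Collapsed embeddings: every image maps to (almost) the same point,
# so d_p ≈ d_n ≈ 0 and the loss sits at the margin of 0.2.
rng = np.random.default_rng(0)
a = rng.normal(scale=1e-4, size=(8, 128))
p = rng.normal(scale=1e-4, size=(8, 128))
n = rng.normal(scale=1e-4, size=(8, 128))
print(triplet_loss(a, p, n).mean())  # ≈ 0.2
```

This matches the flat 0.2 loss curve I am observing.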
I am wondering if anyone here has faced a similar issue before and would be kind enough to advise me on how to overcome it.
For training, I used both semi-hard and hard triplets, and lowered the learning rate from 0.001 to 0.0001 from the 200th epoch onwards. I used around 3 images each from 100 people for training.
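For clarity on what I mean by semi-hard triplets, here is a small NumPy sketch (the `select_semi_hard` helper is hypothetical, not my actual mining code): for a given anchor, a semi-hard negative is farther from the anchor than the positive, but still within the margin (d_p < d_n < d_p + α), as described in the FaceNet paper.

```python
import numpy as np

def select_semi_hard(d_p, d_n_all, margin=0.2):
    """Pick a semi-hard negative for one anchor.

    d_p      -- squared distance from anchor to its positive
    d_n_all  -- squared distances from anchor to candidate negatives
    Returns the index of the closest negative satisfying
    d_p < d_n < d_p + margin, or None if no candidate qualifies
    (in which case I fall back to a hard negative).
    """
    mask = (d_n_all > d_p) & (d_n_all < d_p + margin)
    candidates = np.flatnonzero(mask)
    if candidates.size == 0:
        return None
    return candidates[np.argmin(d_n_all[candidates])]

d_p = 0.5
d_n_all = np.array([0.3, 0.55, 0.65, 0.9])
# Indices 1 (0.55) and 2 (0.65) fall inside (0.5, 0.7); the closer one wins.
print(select_semi_hard(d_p, d_n_all))  # → 1
```

I apply this per anchor within each mini-batch, mixing in hard negatives when no semi-hard candidate exists.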