How to improve stabilization of an image sequence of a face using Python and OpenCV


I have been capturing a photo of my face every day for the last couple of months, resulting in a sequence of images taken from the same spot, but with slight variations in the orientation of my face. I have tried several ways to stabilize this sequence using Python and OpenCV, with varying degrees of success. My question is: is the process I have now the best way to tackle this, or are there better techniques, or a better order to execute things in?

My process so far looks like this:

  • Collect the images, keeping the original image, a downscaled version, and a downscaled grayscale version
  • Using dlib.get_frontal_face_detector() on the grayscale image, get a rectangle containing my face.
  • Using the dlib shape predictor shape_predictor_68_face_landmarks.dat, obtain the coordinates of the 68 face landmarks, and extract the positions of the eyes, nose, chin and mouth (specifically landmarks 8, 30, 36, 45, 48 and 54)
  • Using a 3D representation of my face (i.e. a numpy array containing 3D coordinates of an approximation of these landmarks on my actual face, in an arbitrary reference frame) and cv2.solvePnP, calculate a perspective transform matrix M1 that aligns the face with my 3D representation
  • Using the transformed face landmarks (i.e. cv2.projectPoints(face_points_3D, rvec, tvec, ...) with _, rvec, tvec = cv2.solvePnP(...)), calculate the 2D rotation and translation required to align the eyes vertically, center them horizontally and place them at a fixed distance from each other, and obtain the transformation matrix M2.
  • Using M = np.matmul(M2, M1) and cv2.warpPerspective, warp the image.

Using this method I get okay-ish results, but the 68-landmark prediction is far from perfect, resulting in twitchy stabilization and sometimes in heavily skewed images (I can't remember having such a large forehead...). For example, the predicted position of an eye corner does not always align with the actual eye, resulting in a perspective transform that skews the actual eye some 20 px down.

In an attempt to fix this, I have tried using SIFT to find features in two different photos (aligned using the method above) and obtain another perspective transform. I then force the features to lie near my detected face landmarks so that the background is not aligned instead (using a mask in cv2.SIFT_create().detectAndCompute(...)), but this sometimes results in features being found only (or predominantly) around one of the eyes, or not around the mouth at all, again producing extremely skewed images.

What would be a good way to get a smooth sequence of images, stabilized around my face? For reference, see this video (not mine), which is stabilized around the eyes.
