I am having problems understanding the functionality of `cv2.recoverPose(points1, points2)`. As I understand the documentation, this function should return the rotation matrix from camera1 to camera2 as well as the translation from camera1 to camera2.

I am using an artificial set of points, `points_3d`, as well as the transformation between the two cameras, `T_1_0`, to recover the camera transformation in Python. To get the points on the cameras, I calculate the projections of the 3D points onto the camera sensors:
```python
import cv2
import numpy as np

def calc_projection(K_c, transform, pt_3d):
    # Project a 3D world point onto the image plane of a camera with
    # intrinsics K_c and pose given by the 3x4 matrix `transform`.
    p_projected = (
        np.hstack((K_c, np.zeros((3, 1))))
        @ np.linalg.inv(np.vstack((transform, [0, 0, 0, 1])))
        @ np.vstack((pt_3d.reshape(3, 1), 1))
    )
    p_projected = p_projected[:2, :] / p_projected[2, :]
    return p_projected.ravel()

points_3d = np.random.rand(100, 3)
K_c = np.eye(3)

T_0 = np.hstack((np.eye(3), np.zeros((3, 1))))
rot_vec = np.array([0.2, 0.1, 0.3])
R_1_0, _ = cv2.Rodrigues(rot_vec)
t_0_10 = np.array([0.2, 0.4, 0.1])
T_1_0 = np.hstack((R_1_0, t_0_10.reshape(3, 1)))

points1 = []
points2 = []
for pt_3d in points_3d:
    points1.append(calc_projection(K_c, T_0, pt_3d))
    points2.append(calc_projection(K_c, T_1_0, pt_3d))
points1 = np.array(points1)
points2 = np.array(points2)

E, mask = cv2.findEssentialMat(points1, points2, method=cv2.RANSAC)
# mask is an N x 1 array of 0/1 values, so convert it to a boolean index
inliers1 = points1[mask.ravel() == 1]
inliers2 = points2[mask.ravel() == 1]
_, R, t, _ = cv2.recoverPose(E, inliers1, inliers2)
r, _ = cv2.Rodrigues(R)
# ravel so the (3, 1) outputs compare element-wise with the (3,) inputs
assert np.allclose(r.ravel(), rot_vec)
assert np.allclose(t.ravel(), t_0_10)
```
I would expect the result to equal `T_1_0` (hence the assertions), but instead I get:

```
r = [[0.20329041]
     [0.15711541]
     [0.37188371]]

t = [[0.50969714]
     [0.79593836]
     [0.32663581]]
```

What am I missing here? Why is it not working as expected? Am I doing something wrong, or is this the expected behavior?
**Edit**

The formula I used for the projection wrongly introduces the inverse of the transformation. It should be the following instead:

```python
p_projected = np.hstack((K_c, np.zeros((3, 1)))) @ np.vstack((transform, [0, 0, 0, 1])) @ np.vstack((pt_3d.reshape(3, 1), 1))
```
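For completeness, a minimal sketch of the corrected `calc_projection` with the inverse removed (everything else unchanged from the snippet above):

```python
def calc_projection(K_c, transform, pt_3d):
    # `transform` already maps world coordinates into the camera frame,
    # so it is applied directly, without inversion.
    p_projected = (
        np.hstack((K_c, np.zeros((3, 1))))
        @ np.vstack((transform, [0, 0, 0, 1]))
        @ np.vstack((pt_3d.reshape(3, 1), 1))
    )
    p_projected = p_projected[:2, :] / p_projected[2, :]
    return p_projected.ravel()
```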
**Side Note**

`solvePnP` also solves the problem I've been trying to solve here (recovering a pose from 3D→2D point correspondences):

```python
_, r, t = cv2.solvePnP(points_3d, points2, K_c, None)
assert np.allclose(r, rot_vec.reshape(3, 1), rtol=0.15)
assert np.allclose(t, t_0_10.reshape(3, 1), rtol=0.15)
```

But I still don't know why `cv2.recoverPose` does not work. Looking at the documentation, it should also return the translation and rotation...
**TL;DR**

Most important findings:

- `cv2.recoverPose(points1, points2, ...)` returns the rotation from camera1 to camera2 if `points1` are found on camera1 and `points2` are found on camera2. The returned translation vector is also from camera1 to camera2, but expressed in the coordinate frame of camera1.
- The translation vector is found only up to a scale factor, which can't be recovered without further logic.
- Setting the `cameraMatrix` on `findEssentialMat` is important.

**Description**
I am now able to use `cv2.recoverPose` and recover the right rotation and translation. The updated code applies the two fixes above: the corrected projection (no inverse) and passing `cameraMatrix` to `findEssentialMat`.
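A minimal sketch of what the updated estimation step can look like, assuming the corrected `calc_projection` from the edit above; the `cameraMatrix` arguments are the relevant change:

```python
E, mask = cv2.findEssentialMat(points1, points2, cameraMatrix=K_c, method=cv2.RANSAC)
inliers1 = points1[mask.ravel() == 1]
inliers2 = points2[mask.ravel() == 1]
_, R, t, _ = cv2.recoverPose(E, inliers1, inliers2, cameraMatrix=K_c)
r, _ = cv2.Rodrigues(R)  # r should now be close to rot_vec
```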
But the documentation states that the translation can be recovered only up to a scale factor. So in this case the following assertion holds, and the behavior is as expected:
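A plausible form of that assertion, comparing only the direction of the translation (the exact line and tolerance are assumptions, not the original):

```python
# The scale of t is unrecoverable, so compare unit directions
# instead of the raw vectors.
t_dir = t.ravel() / np.linalg.norm(t)
t_true_dir = t_0_10 / np.linalg.norm(t_0_10)
assert np.allclose(t_dir, t_true_dir, atol=1e-6)
```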
Comment: What's funny is that if I use `K_c = np.eye(3)`, it sometimes works and sometimes doesn't. But if I use the intrinsics of a real camera, the assertion is always true...