I'm currently experimenting with the KITTI stereo odometry dataset (http://www.cvlibs.net/datasets/kitti/eval_odometry.php). The goal is to find the correct matrix that projects pixels to points in the world coordinate system. The problem is that I'm not sure which coordinate system the KITTI dataset uses. The readme file says:
> Each file xx.txt contains a N x 12 table, where N is the number of frames of this sequence. Row i represents the i'th pose of the left camera coordinate system (i.e., z pointing forwards) via a 3x4 transformation matrix. The matrices are stored in row aligned order (the first entries correspond to the first row), and take a point in the i'th coordinate system and project it into the first (=0th) coordinate system. Hence, the translational part (3x1 vector of column 4) corresponds to the pose of the left camera coordinate system in the i'th frame with respect to the first (=0th) frame.
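For reference, here is a minimal sketch of how one such row could be turned into a 4x4 homogeneous matrix. The function names and the example pose line are made up for illustration and are not part of the KITTI devkit.

```python
import numpy as np

def pose_row_to_matrix(row):
    """Turn 12 row-major values into a 4x4 homogeneous transform."""
    values = np.asarray(row, dtype=np.float64)
    if values.size != 12:
        raise ValueError("expected 12 values per pose")
    T = np.eye(4)
    T[:3, :4] = values.reshape(3, 4)  # row-major: first 4 values are row 0
    return T

def parse_pose_line(line):
    """Parse one whitespace-separated line of a pose file."""
    return pose_row_to_matrix([float(v) for v in line.split()])

# Hypothetical pose: identity rotation, translation of 2 units along z
example_line = "1 0 0 0 0 1 0 0 0 0 1 2"
T_example = parse_pose_line(example_line)

# The origin of this camera frame, expressed in the frame the matrix maps into
origin_mapped = T_example @ np.array([0.0, 0.0, 0.0, 1.0])
```

Padding the 3x4 matrix with the row `[0, 0, 0, 1]` makes it easy to chain and invert transforms with plain matrix products.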
As far as I understand, this matrix represents a world-to-camera mapping, so I should take its inverse to project from the camera coordinate system to the world coordinate system. Is that right?
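For context, the inverse of a rigid transform `[R | t]` is `[R^T | -R^T t]`. The sketch below shows that closed form on a made-up pose; the helper name `invert_rigid` is hypothetical. Whether the inversion is needed at all depends on how the readme is read: taken literally, the quoted text says the stored matrix already maps points from the i'th camera frame into frame 0.

```python
import numpy as np

def invert_rigid(T):
    """Invert a 4x4 rigid transform using R^T and -R^T t."""
    R = T[:3, :3]
    t = T[:3, 3]
    T_inv = np.eye(4)
    T_inv[:3, :3] = R.T
    T_inv[:3, 3] = -R.T @ t
    return T_inv

# Made-up pose: 90 degree rotation about the y axis plus a translation
T = np.eye(4)
T[:3, :3] = np.array([[0.0, 0.0, 1.0],
                      [0.0, 1.0, 0.0],
                      [-1.0, 0.0, 0.0]])
T[:3, 3] = [1.0, 2.0, 3.0]

T_inv = invert_rigid(T)
```

The closed form only holds when the upper-left 3x3 block is a proper rotation, which is assumed here.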
Another issue is that I need the world coordinate system to be oriented differently: with -z pointing forward and y pointing upwards.
The current version of my code looks like this:
```python
import numpy as np

# M holds the 12 values of one row of the pose file
M = np.reshape(M, (3, 4))

# convert local to global c.s. (?)
Rc = M[..., :-1]
tc = M[..., -1]
R, t = Rc.T, -Rc.T.dot(tc.reshape(-1))
M[..., :-1] = Rc
M[..., -1] = tc

# convert to CG c.s.
M[0, 1] *= -1.0
M[1, 0] *= -1.0
M[0, 2] *= -1.0
M[2, 0] *= -1.0
M[1, 3] *= -1
M[2, 3] *= -1
```
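The six sign flips can also be written as a single change of basis. This is a sketch under two assumptions: the source axes are x right, y down, z forward, and the target axes are x right, y up, -z forward. The name `to_cg_convention` is made up for illustration. With `S = diag(1, -1, -1, 1)`, the product `S @ T @ S` flips exactly the entries (0,1), (1,0), (0,2), (2,0), (1,3) and (2,3).

```python
import numpy as np

# Assumed source axes: x right, y down, z forward.
# Assumed target axes: x right, y up, -z forward.
S = np.diag([1.0, -1.0, -1.0, 1.0])

def to_cg_convention(T):
    """Re-express a 4x4 rigid transform in the flipped axes (S is its own inverse)."""
    return S @ T @ S

# Made-up pose: identity rotation with translation (1, 2, 3)
T = np.eye(4)
T[:3, 3] = [1.0, 2.0, 3.0]
T_cg = to_cg_convention(T)
```

Points have to be converted with the same `S` (as `S @ p` in homogeneous coordinates), otherwise the poses and the points end up in different conventions.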
However, when I reproject pixels between consecutive left frames, the resulting pixels end up far beyond image boundaries.
The functions I use for reprojection work correctly when the coordinates are given in the world coordinate system.
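To make the reprojection step concrete, here is a minimal sketch of the chain I have in mind. It assumes each pose is a 4x4 camera-to-frame-0 transform, and that the depth of the pixel along the camera z axis is known. The function name, the intrinsics `K`, and the poses are made-up illustration values, not the KITTI calibration.

```python
import numpy as np

def reproject_pixel(u, v, depth, K, T_src, T_dst):
    """Map a pixel with known z-depth from the source image to the destination image.

    T_src and T_dst are assumed to be 4x4 camera-to-frame-0 transforms.
    """
    # Back-project the pixel into the source camera frame
    p_src = depth * (np.linalg.inv(K) @ np.array([u, v, 1.0]))
    # Source camera -> frame 0 -> destination camera
    p_world = T_src @ np.append(p_src, 1.0)
    p_dst = np.linalg.inv(T_dst) @ p_world
    # Project into the destination image
    uvw = K @ p_dst[:3]
    return uvw[:2] / uvw[2]

# Hypothetical intrinsics, not the KITTI calibration values
K = np.array([[700.0, 0.0, 600.0],
              [0.0, 700.0, 180.0],
              [0.0, 0.0, 1.0]])
T0 = np.eye(4)
T1 = np.eye(4)
T1[2, 3] = 1.0  # second camera sits 1 unit further along +z

same = reproject_pixel(650.0, 200.0, 10.0, K, T0, T0)
moved = reproject_pixel(650.0, 200.0, 10.0, K, T0, T1)
```

If the poses were world-to-camera instead, the two transforms in the middle of the function would swap roles, which is one way pixels can land far outside the image.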