I wrote a simple script to project 3D points into an image based on the camera intrinsics and extrinsics. But when I have a camera at the origin pointing down the negative z-axis and a 3D point further down the negative z-axis, the point appears to be behind the camera instead of in front of it. Here's my script; I've checked it many times.

    import numpy as np


    def project(point, P):
        # P is the 3x4 projection matrix, point a 4x1 homogeneous world point
        Hp = P.dot(point)
        if Hp[-1] < 0:
            print('Point is behind camera')
        # Normalise by the last (depth) component to get pixel coordinates
        Hp = Hp / Hp[-1]
        print(Hp[0][0], 'x', Hp[1][0])
        return Hp[0][0], Hp[1][0]


    if __name__ == '__main__':
        # Rc and C are the camera orientation and location in world coordinates
        # Camera posed at origin pointed down the negative z-axis
        Rc = np.eye(3)
        C = np.array([0, 0, 0])

        # Camera extrinsics
        R = Rc.T
        t = -R.dot(C).reshape(3, 1)

        # The camera projection matrix is then:
        # P = K [ R | t] which projects 3D world points
        # to 2D homogeneous image coordinates.

        # Example intrinsics don't really matter ...
        K = np.array([
            [2000, 0, 2000],
            [0, 2000, 1500],
            [0, 0, 1],
        ])

        # Sample point in front of camera
        # i.e. further down the negative z-axis
        # should project into center of image
        point = np.array([[0, 0, -10, 1]]).T

        # Project point into the camera
        P = K.dot(np.hstack((R, t)))

        # But when projecting it appears to be behind the camera?
        project(point, P)

The only thing I can think of is that the identity rotation matrix doesn't correspond to the camera pointing down the negative z-axis with the up vector in the direction of the positive y-axis. But I can't see how this wouldn't be the case: if, for example, I had constructed Rc from a function like gluLookAt and given it a camera at the origin pointing down the negative z-axis, I would get the identity matrix.
I think the confusion is only in the sign check, `if Hp[-1] < 0:`, because that test assumes the positive Z-axis goes into the screen. Your camera looks down the negative Z-axis, so it is actually a point with a positive Z value that will be behind the camera.
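As a minimal sketch of that change, assuming the same `K`, identity rotation, and zero translation as in the question (the function name `project_neg_z` and the extra `behind` return value are made up for illustration), the test simply flips sign for a camera that looks down the negative Z-axis:

```python
import numpy as np


def project_neg_z(point, P):
    """Project a 4x1 homogeneous point, assuming the camera looks down -Z."""
    Hp = P.dot(point)
    w = Hp[-1, 0]
    # Assumption: camera looks down -Z, so a positive depth term means "behind".
    behind = w > 0
    Hp = Hp / w
    return Hp[0, 0], Hp[1, 0], behind


# Same example intrinsics as in the question
K = np.array([
    [2000.0, 0.0, 2000.0],
    [0.0, 2000.0, 1500.0],
    [0.0, 0.0, 1.0],
])

# Camera at the origin with identity rotation: P = K [I | 0]
P = K.dot(np.hstack((np.eye(3), np.zeros((3, 1)))))

point = np.array([[0.0, 0.0, -10.0, 1.0]]).T
u, v, behind = project_neg_z(point, P)
print(u, v, behind)
```

With this test the sample point at z = -10 lands at the image centre and is no longer reported as behind the camera.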
As far as I recall, this choice is an arbitrary convention, made so that the 3D representation plays well with our 2D preconceptions: if you assume your camera is looking in the -Z direction, then negative X will be to the left when positive Y points up. And in this case, only things with negative Z will be in front of the camera.
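If you would rather keep a "negative depth means behind" test, one alternative is to flip the Y and Z axes when building the extrinsics, so that a gluLookAt-style camera (looking down -Z, Y up) is mapped into a frame where +Z goes into the screen and Y points down the image. This is not part of the original script; the `FLIP_YZ` matrix and the helper names below are assumptions for illustration only:

```python
import numpy as np

# Assumed axis flip: -Z-forward, Y-up camera frame -> +Z-forward, Y-down frame
FLIP_YZ = np.diag([1.0, -1.0, -1.0])


def projection_matrix(K, Rc, C):
    """Build P = K [R | t] from a camera orientation Rc and location C."""
    R = FLIP_YZ.dot(Rc.T)
    t = -R.dot(np.asarray(C, dtype=float).reshape(3, 1))
    return K.dot(np.hstack((R, t)))


def project(point, P):
    """Return pixel coordinates and whether the point is in front."""
    Hp = P.dot(point)
    w = Hp[-1, 0]
    # After the flip, a positive depth term means "in front of the camera".
    return Hp[0, 0] / w, Hp[1, 0] / w, w > 0


K = np.array([
    [2000.0, 0.0, 2000.0],
    [0.0, 2000.0, 1500.0],
    [0.0, 0.0, 1.0],
])
P = projection_matrix(K, np.eye(3), [0, 0, 0])

center = project(np.array([[0.0, 0.0, -10.0, 1.0]]).T, P)
right = project(np.array([[1.0, 0.0, -10.0, 1.0]]).T, P)
up = project(np.array([[0.0, 1.0, -10.0, 1.0]]).T, P)
print(center, right, up)
```

Under these assumptions the point at z = -10 is in front of the camera, a point shifted towards +X lands to the right of the image centre, and a point shifted towards +Y lands above it (a smaller row index).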