Detecting/Correcting Photo Warping via Point Correspondences


I realize there are many cans of worms related to what I'm asking, but I have to start somewhere. Basically, what I'm asking is:

Given two photos of a scene, taken with unknown cameras, to what extent can I determine the (relative) warping between the photos?

Below are two images of the 1904 World's Fair. They were taken at different levels on the wireless telegraph tower, so the cameras are more or less vertically in line. My goal is to create a model of the area (in Blender, if it matters) from these and other photos. I'm not looking for a fully automated solution, e.g., I have no problem with manually picking points and features.

Over the past month, I've taught myself what I can about projective transformations and epipolar geometry. For some pairs of photos, I can do pretty well by finding the fundamental matrix F from point correspondences. But the two below are causing me problems. I suspect that there's some sort of warping - maybe just an aspect ratio change, maybe more than that.

My process is as follows:

  1. I find correspondences between the two photos (the red jagged lines seen below).
  2. I run the point pairs through Matlab (actually Octave) to find the epipoles. Currently, I'm using Peter Kovesi's toolbox, Peter's Functions for Computer Vision.
  3. In Blender, I set up two cameras with the images overlaid. I orient the first camera based on the vanishing points. I also determine the focal lengths from the vanishing points. I orient the second camera relative to the first using the epipoles and one of the point pairs (below, the point at the top of the bandstand).
  4. For each point pair, I project a ray from each camera through its sample point and mark the point of closest convergence of the pair (in light yellow below; a sketch of this step and step 2 follows the list). I realize that this leaves out information from the fundamental matrix - see below.
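For concreteness, here is a minimal sketch of steps 2 and 4, translated into Python/OpenCV rather than the Octave toolbox actually used; the match filenames are placeholders for however the hand-picked points were exported.

```python
import numpy as np
import cv2

# Hand-picked correspondences as N x 2 pixel arrays; the filenames are
# placeholders, not part of the original workflow.
pts1 = np.loadtxt("matches_img1.txt", dtype=np.float64)
pts2 = np.loadtxt("matches_img2.txt", dtype=np.float64)

# Step 2: robust fundamental matrix estimate (RANSAC, ~1 px threshold).
F, inlier_mask = cv2.findFundamentalMat(pts1, pts2, cv2.FM_RANSAC, 1.0, 0.99)

# The epipoles are the null vectors of F: F e1 = 0 and F^T e2 = 0.
_, _, Vt = np.linalg.svd(F)
e1 = Vt[-1] / Vt[-1][2]          # epipole in image 1, in pixels
_, _, Vt = np.linalg.svd(F.T)
e2 = Vt[-1] / Vt[-1][2]          # epipole in image 2, in pixels

# Step 4: point of closest convergence of two skew rays p1 + s*d1 and
# p2 + t*d2 (least-squares midpoint; assumes the rays are not parallel).
def ray_midpoint(p1, d1, p2, d2):
    A = np.array([[d1 @ d1, -(d1 @ d2)],
                  [d1 @ d2, -(d2 @ d2)]])
    b = np.array([(p2 - p1) @ d1, (p2 - p1) @ d2])
    s, t = np.linalg.solve(A, b)          # parameters of the closest points
    return 0.5 * ((p1 + s * d1) + (p2 + t * d2))
```

In the Blender setup, the ray origins and directions would come from the two camera placements.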

[Image: two views of the Plaza of Orleans]

As you can see, the points don't converge very well. The rays from the left image spread out farther the farther you move horizontally from the bandstand point. I'm guessing that this reflects differences in the camera intrinsics. Unfortunately, I can't find a way to recover the intrinsics from an F derived from point correspondences.
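In general F alone can't be decomposed into intrinsics, but if you're willing to guess them (for example, the vanishing-point focal lengths from step 3, with principal points at the image centres), then F upgrades to an essential matrix E = K2^T * F * K1, from which the relative pose can be recovered. A sketch reusing F, pts1, and pts2 from above; every numeric value is a placeholder:

```python
import numpy as np
import cv2

# Placeholder intrinsics: substitute the focal lengths estimated from
# the vanishing points and each image's centre as the principal point.
f1, f2 = 800.0, 950.0
K1 = np.array([[f1, 0, 320.0], [0, f1, 240.0], [0, 0, 1.0]])
K2 = np.array([[f2, 0, 320.0], [0, f2, 240.0], [0, 0, 1.0]])

E = K2.T @ F @ K1                 # upgrade F (from the earlier sketch) to E

# Normalise the matches with each camera's guessed intrinsics, then
# recover the relative rotation and translation (translation up to scale).
n1 = cv2.undistortPoints(pts1.reshape(-1, 1, 2), K1, None)
n2 = cv2.undistortPoints(pts2.reshape(-1, 1, 2), K2, None)
_, R, t, _ = cv2.recoverPose(E, n1, n2)
```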

In the end, I don't think I care about the individual intrinsics per se. What I really need is a way to apply the intrinsics to "correct" the images so that I can use them as overlays to manually refine the model.
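If such a correction can be estimated, applying it is the easy part. A sketch under the simplest suspected case here, a pure aspect-ratio change (the filename and stretch factor are made up; any estimated 3x3 correcting transform could be substituted):

```python
import numpy as np
import cv2

img = cv2.imread("fair_view2.jpg")     # placeholder filename

s = 1.08                               # placeholder vertical stretch factor
H_fix = np.array([[1.0, 0.0, 0.0],     # diagonal homography: aspect change
                  [0.0, s,   0.0],
                  [0.0, 0.0, 1.0]])

h, w = img.shape[:2]
corrected = cv2.warpPerspective(img, H_fix, (w, int(round(h * s))))
cv2.imwrite("fair_view2_corrected.png", corrected)
```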

Is this possible? Do I need other information? Obviously, I have little hope of finding anything about the camera intrinsics. There is some obvious structural info though, such as which features are orthogonal. I saw a hint somewhere that the vanishing points can be used to further refine or upgrade the transformations, but I couldn't find anything specific.
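For the vanishing-point hint, one standard, concrete use is recovering the focal length: if v1 and v2 are the vanishing points of two orthogonal scene directions, and you assume square pixels, zero skew, and the principal point c at the image centre, orthogonality of the back-projected directions gives f^2 = -(v1 - c) . (v2 - c). A sketch with made-up coordinates:

```python
import numpy as np

def focal_from_orthogonal_vps(v1, v2, c):
    """Focal length in pixels from vanishing points of two orthogonal
    scene directions, assuming square pixels, zero skew, and principal
    point c (typically the image centre)."""
    d = -np.dot(np.asarray(v1, float) - c, np.asarray(v2, float) - c)
    if d <= 0:
        raise ValueError("vanishing points violate these assumptions")
    return float(np.sqrt(d))

c = np.array([320.0, 240.0])                     # placeholder image centre
f = focal_from_orthogonal_vps([1500.0, 260.0],   # placeholder vanishing points
                              [-900.0, 230.0], c)
```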

Update 1

I may have found a solution, but I'd like someone with some knowledge of the subject to weigh in before I post it as an answer. It turns out that Peter's Functions for Computer Vision also has a function for a RANSAC estimate of the homography from the sample points. Using m2 = H*m1, I should be able to plot the mapping of m1 -> m2 on top of the actual m2 points in the second image.
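A minimal Python/OpenCV version of that check (cv2.findHomography standing in for the toolbox's RANSAC homography; pts1 and pts2 are the same matches as in the earlier sketch):

```python
import numpy as np
import cv2

# RANSAC homography from the same correspondences.
H, inliers = cv2.findHomography(pts1, pts2, cv2.RANSAC, 3.0)

# Map m1 through H and compare against the measured m2.
m1 = np.hstack([pts1, np.ones((len(pts1), 1))])   # homogeneous, N x 3
m2_pred = (H @ m1.T).T
m2_pred = m2_pred[:, :2] / m2_pred[:, 2:3]        # de-homogenise to pixels

residuals = np.linalg.norm(m2_pred - pts2, axis=1)
print("median transfer error (px):", np.median(residuals))
```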

The only problem is, I'm not sure I believe what I'm seeing. Even on an image pair that lines up pretty well using the epipoles from F, the mapping from the homography looks pretty bad.

I'll try to capture an understandable image, but is there anything wrong with my reasoning?

1 Answer

Answered by Jack Morrison:

A couple of answers and suggestions (in no particular order):

  1. A homography will only correctly map between point correspondences when either (a) the camera undergoes a pure rotation (no translation) between the two views or (b) the corresponding points are all co-planar (a quick numerical check for this is sketched after the list).
  2. The fundamental matrix only relates uncalibrated cameras. The process of recovering a camera's calibration parameters (intrinsics) from unknown scenes, known as "auto-calibration", is a rather difficult problem. You'd need these parameters (focal length, principal point) to correctly reconstruct the scene.
  3. If you have (many) more images of this scene, you could try using a system such as VisualSFM (http://ccwu.me/vsfm/). It will attempt to automatically solve the structure-from-motion problem, including point matching, auto-calibration, and sparse 3D reconstruction.
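Regarding point 1, one quick numerical sanity check, sketched below reusing F, H, and the matches from the question's snippets: compare the first-order epipolar (Sampson) error under F with the transfer error under H. If the homography error is much larger, the matched points are not coplanar and the camera translated, so a homography is the wrong model for this pair.

```python
import numpy as np

def sampson_error(F, pts1, pts2):
    """First-order geometric error of the matches w.r.t. F, in pixels."""
    x1 = np.hstack([pts1, np.ones((len(pts1), 1))])
    x2 = np.hstack([pts2, np.ones((len(pts2), 1))])
    Fx1 = x1 @ F.T            # epipolar lines in image 2
    Ftx2 = x2 @ F             # epipolar lines in image 1
    num = np.sum(x2 * Fx1, axis=1) ** 2
    den = Fx1[:, 0]**2 + Fx1[:, 1]**2 + Ftx2[:, 0]**2 + Ftx2[:, 1]**2
    return np.sqrt(num / den)

ep = np.median(sampson_error(F, pts1, pts2))
ho = np.median(residuals)     # homography transfer error from the earlier sketch
print(f"epipolar {ep:.2f} px vs homography {ho:.2f} px")
```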