Fast way to find corresponding objects across stereo views

318 views Asked by At

Thanks for taking your time to read this.

We have fixed stereo pairs of cameras looking into a closed volume. We know the dimensions of the volume and have the intrinsic and extrinsic calibration values for the camera pairs. The objective being to be able to identify the 3d positions of multiple duplicate objects accurately. Which naturally leads to what is described as the correspondence problem in litrature. We need a fast technique to match ball A from image 1 with Ball A from image 2 and so on. At the moment we use the properties of epipolar geomentry (Fundamental matrix) to match the balls from different views in a crude way and works ok when the objects are sparse, but gives a lot of false positives if the objects are densely scattered. Since ball A in image 1 can lie anywhere on the epipolar line going across image 2, it leads to mismatches when multiple objects lie on that line and look similar.

Is there a way to re-model this into a 3d line intersection problem or something? Since the ball A in image 1 can only take a bounded limit of 3d values, Is there a way to represent it as a line in 3d? and do a intersection test to find the closest matching ball in image 2?

Or is there a way to generate a sparse list of 3d values which correspond to each 2d grid of pixels in image 1 and 2, and do a intersection test of these values to find the matching objects across two cameras?

Because the objects can be identical, OpenCV feature matching algorithms like FLANN, ORB doesn't work.

Any ideas in the form of formulae or code is welcome.

Thanks! Sak

enter image description here

enter image description here

2

There are 2 answers

2
SKG On

For different types of objects it's easy- to find the match using sum-of-absolute-differences. For similar objects, the idea(s) could lead to publish a good paper. Anyway here's one quick algorithm:

  1. detect the two balls in first image (using object detection methods).
  2. divide the image into two segments cantaining two balls.
  3. repeat steps 1 & 2 for second image also.
  4. the direction of segments in two images should give correspondence of the two balls.

Try this, it should work for two balls.

0
Sneftel On

You've set yourself quite a difficult task. Because one point can occlude another in a view, it's not generally possible even to count the number of points. If each view has two points, but those points fall on the same epipolar line on the other view, then you can count anywhere between 2 and 4 points.

Assuming you want to minimize the points, this starts to look like Minimum Vertex Cover in a dense bipartite graph, with each edge representing the association of a point from each view, and the weight of each edge taken from the registration error of associating the corresponding points (vertices) from each view. MVC is, of course, NP-hard, and if you treat the problem as a general MVC problem then you'll never do better than O(n^2) because that's how many edges there are to examine.

Your particular MVC problem might have structure that can be exploited to perform a more efficient approximation. In particular, I might suggest calculating the epipolar lines in one view, ordering them by angle from the epipole, and similarly sorting the points in that view from the epipole. You can then iterate over the two sorted lists roughly in parallel, greedily associating each point with a nearby epipolar line. Then you can do the same in the other view, but only looking at points in that view which had not yet been associated during the previous pass. I think that a more regimented and provably optimal approach might be possible with dynamic programming (particularly if you strictly bound the registration error) which wouldn't require the second pass, but I can't sketch it out offhand.