Best algorithm for video stabilization


I am writing a program to stabilize a video stream. At the moment it is based on the phase correlation algorithm: I calculate the offset between two images, a base image and the current one, and then shift the current image by that offset. The program works, but the result is not satisfactory. As the linked videos show, the processed video looks poor, and the overall shake actually gets worse.
Original video
Stabilized video
Here is my current implementation.
Calculating the offset between images:

Point2d calculate_offset_phase_optimized(Mat one, Mat& two) {

  if(two.type() != CV_64F) {
    cvtColor(two, two, CV_BGR2GRAY);
    two.convertTo(two, CV_64F);
  }

  cvtColor(one, one, CV_BGR2GRAY);
  one.convertTo(one, CV_64F);

  return phaseCorrelate(one, two);

}

Shifting the image according to this offset:

// Clamp negative values to 0 (start of the source ROI).
int _0(const int x) {
  return x < 0 ? 0 : x;
}

// Return |x| for negative values and 0 otherwise (start of the destination ROI).
int _0ia(const int x) {
  return x < 0 ? abs(x) : 0;
}

// Copy the overlapping part of img into trans, moved by -offset (truncated to
// whole pixels); pixels that are not covered stay black.
void move_image_roi_alt(Mat& img, Mat& trans, const Point2d& offset) {
  const int dx = static_cast<int>(offset.x);
  const int dy = static_cast<int>(offset.y);
  const int w = img.cols - abs(dx);
  const int h = img.rows - abs(dy);

  trans = Mat::zeros(img.size(), img.type());
  img(Rect(_0(dx), _0(dy), w, h)).copyTo(trans(Rect(_0ia(dx), _0ia(dy), w, h)));
}

I looked through the document by the authors of the YouTube stabilizer, and the algorithm based on corner detection seemed attractive, but I do not fully understand how it works. So my question is how to solve this problem effectively. One constraint: the program will run on slow computers, so heavy algorithms may not be suitable.
Thanks!
P.S. I apologize for any mistakes in the text - it is an automatic translation.


There are 2 answers

8
the swine On BEST ANSWER

You can use image descriptors such as SIFT in each frame and calculate robust matches between the frames. Then you can calculate a homography between the frames and use it to align them. Using sparse features can lead to a faster implementation than using a dense correlation.

Alternatively, if you know the camera parameters, you can calculate the 3D positions of the points and of the cameras and reproject the images onto a stable projection plane. As a by-product you also get a sparse 3D reconstruction of the scene (somewhat imprecise; it usually needs to be optimized to be usable). This is what e.g. Autostitch would do, but it is quite difficult to implement.

Note that the camera parameters can also be calculated, but that is even more difficult.

0
Vit On

OpenCV can do it for you in 3 lines of code (it is definitely the shortest way, maybe even the best):

Mat t = estimateRigidTransform(newFrame, referenceFrame, false); // false: restricted transform (translation, rotation, uniform scale: 4 degrees of freedom instead of 6)
if(!t.empty()){    
    warpAffine(newFrame, stableFrame, t, Size(newFrame.cols, newFrame.rows)); // stableFrame should be stable now
}

You can turn off some kinds of transformation by modifying the matrix t, which can lead to a more stable result. This is just the core idea; you can then adapt it as you like: change referenceFrame, smooth the sequence of transformation parameters taken from t, and so on.
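
The two suggestions above (dropping part of the transformation and smoothing the parameters) could look roughly like the following sketch. It uses plain C++ so that it runs without OpenCV, and it assumes the 2x3 matrix has the rotation-plus-uniform-scale form `[s*cos(a) -s*sin(a) tx; s*sin(a) s*cos(a) ty]`. The names `Mat2x3`, `Motion`, `decompose`, `compose` and `smooth` are hypothetical helpers for this example, not OpenCV API.

```cpp
#include <algorithm>
#include <array>
#include <cmath>
#include <cstddef>
#include <vector>

// Row-major 2x3 matrix [a b tx; c d ty].
using Mat2x3 = std::array<double, 6>;

struct Motion { double dx, dy, angle; };  // translation in pixels, rotation in radians

// Extract translation and rotation; the uniform scale is deliberately discarded.
Motion decompose(const Mat2x3& t) {
    return Motion{t[2], t[5], std::atan2(t[3], t[0])};
}

// Rebuild a matrix with scale fixed to 1, i.e. with scaling "turned off".
Mat2x3 compose(const Motion& m) {
    const double c = std::cos(m.angle);
    const double s = std::sin(m.angle);
    return Mat2x3{c, -s, m.dx, s, c, m.dy};
}

// Centered moving average over the parameters; the window is clipped at both ends.
// Angles are averaged naively, which is only reasonable for small rotations.
std::vector<Motion> smooth(const std::vector<Motion>& traj, std::size_t radius) {
    std::vector<Motion> out(traj.size());
    for (std::size_t i = 0; i < traj.size(); ++i) {
        const std::size_t lo = (i > radius) ? i - radius : 0;
        const std::size_t hi = std::min(i + radius, traj.size() - 1);
        Motion sum{0.0, 0.0, 0.0};
        for (std::size_t j = lo; j <= hi; ++j) {
            sum.dx += traj[j].dx;
            sum.dy += traj[j].dy;
            sum.angle += traj[j].angle;
        }
        const double n = static_cast<double>(hi - lo + 1);
        out[i] = Motion{sum.dx / n, sum.dy / n, sum.angle / n};
    }
    return out;
}
```

One possible use: decompose the matrix estimated for each frame, smooth the resulting sequence, and pass `compose(...)` of the smoothed values to `warpAffine`. A larger `radius` gives a steadier picture but reacts more slowly to intentional camera motion.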