Quickly recognize different scanned forms with OpenCV and find homography transformation


I have image files of several forms that will be filled in, printed, and then scanned. There are currently 5 kinds of form (possibly more later), and I know the coordinates of the fields in the original images.

For each scanned document I have to find out which kind of form it is and then calculate the homography that maps the known field coordinates from the original image onto the scan.

My problem is to do this as fast as possible while keeping the rate of wrong recognitions low. I also don't know as much about image analysis as I would like.

I would like to build a "database" of keypoints from the original images and then, for each scanned image, extract its keypoints, calculate the transformation, and so on. What is the right approach with OpenCV? So far I have managed the second part, from homography calculation to field coordinate transformation; here is, more or less, my poor implementation of the first part:

void generateDataDb()
{
    cv::Ptr<cv::xfeatures2d::SURF> detector = cv::xfeatures2d::SURF::create();

    const std::string dstPath{ "./data_db.xml" };
    cv::FileStorage storage{ dstPath, cv::FileStorage::WRITE };

    for(size_t i=1; i<=5; ++i) {
        std::string srcPath{ std::string{"./templates/"}+
                             std::to_string(i)+
                             std::string{".png"} };
        cv::Mat modelImage { cv::imread(srcPath, cv::IMREAD_GRAYSCALE) };

        if(!modelImage.empty()){
            std::vector<cv::KeyPoint> modelKeypoints;
            cv::Mat modelDescriptors;
            detector->detectAndCompute(modelImage,
                                       cv::Mat(),
                                       modelKeypoints,
                                       modelDescriptors);

            std::string keypointsId{ std::to_string(i)+
                                     std::string{"-keypoints"} };
            storage<<keypointsId<<modelKeypoints;
            std::string descriptorsId{ std::to_string(i)+
                                       std::string{"-descriptors"} };
            storage<<descriptorsId<<modelDescriptors;
        }
    }
    storage.release();
}

cv::Mat analyzeDocument(std::string srcPath)
{
    cv::Mat scannedImage { cv::imread(srcPath, cv::IMREAD_GRAYSCALE) };
    if(scannedImage.empty()){
        return {};
    }

    cv::Ptr<cv::xfeatures2d::SURF> detector = cv::xfeatures2d::SURF::create();
    std::vector<cv::KeyPoint> scannedKeypoints;
    cv::Mat scannedDescriptors;
    detector->detectAndCompute(scannedImage,
                               cv::Mat(),
                               scannedKeypoints,
                               scannedDescriptors);

    const std::string dbPath{ "./data_db.xml" };
    cv::FileStorage storage{ dbPath, cv::FileStorage::READ };
    size_t formKind=0;  // Excuse me for the bad datatype,
                        // in my code there are enums!

    // Missing code that should set formKind to the right model index.

    if(formKind==0){
        return {};
    }

    std::string keypointsId{ std::to_string(formKind)+
                             std::string{"-keypoints"} };
    std::vector<cv::KeyPoint> modelKeypoints;
    storage[keypointsId]>>modelKeypoints;
    std::string descriptorsId{ std::to_string(formKind)+
                               std::string{"-descriptors"} };
    cv::Mat modelDescriptors;
    storage[descriptorsId]>>modelDescriptors;

    cv::FlannBasedMatcher matcher;
    std::vector< cv::DMatch > matches;
    matcher.match( modelDescriptors, scannedDescriptors, matches );
    std::vector<cv::Point2f> modelPoints, scanPoints;
    for( size_t i = 0; i < matches.size(); i++ ) {
        modelPoints.push_back( modelKeypoints[ matches[i].queryIdx ].pt );
        scanPoints.push_back( scannedKeypoints[ matches[i].trainIdx ].pt );
    }
    return cv::findHomography( modelPoints, scanPoints, cv::RANSAC );
}
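One possible way to fill in the missing `formKind` step, offered only as a sketch: match the scanned descriptors against each template's descriptors with `knnMatch` and `k = 2`, keep the matches that pass a ratio test (best distance clearly smaller than the second-best), and pick the template with the most survivors. The code below covers only the selection logic on plain distance pairs so that it stays independent of OpenCV. `DistancePair`, `countGoodMatches` and `pickFormKind` are hypothetical names, and the defaults `ratio = 0.7` and `minGood = 10` are assumptions that would need tuning on real scans.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// (best distance, second-best distance) for one scanned descriptor,
// i.e. the two DMatch::distance values of one knnMatch result with k = 2.
using DistancePair = std::pair<float, float>;

// Counts the matches that pass the ratio test: the best candidate must be
// clearly closer than the runner-up, otherwise the match is ambiguous.
std::size_t countGoodMatches(const std::vector<DistancePair>& pairs, float ratio)
{
    std::size_t good = 0;
    for (const auto& p : pairs) {
        if (p.first < ratio * p.second) {
            ++good;
        }
    }
    return good;
}

// Returns the 1-based index of the template with the most good matches,
// or 0 (the "unknown form" value used in analyzeDocument) when no
// template reaches minGood. Ties keep the earlier template.
std::size_t pickFormKind(const std::vector<std::vector<DistancePair>>& perTemplate,
                         float ratio = 0.7f,
                         std::size_t minGood = 10)
{
    std::size_t bestKind = 0;
    std::size_t bestCount = 0;
    for (std::size_t i = 0; i < perTemplate.size(); ++i) {
        const std::size_t good = countGoodMatches(perTemplate[i], ratio);
        if (good >= minGood && good > bestCount) {
            bestCount = good;
            bestKind = i + 1;
        }
    }
    return bestKind;
}
```

The `minGood` threshold is what keeps a scan of an unrelated document from being assigned to whichever template happens to score highest. A stricter check, at the cost of speed, would be to run `cv::findHomography` for the top candidate and count the RANSAC inliers.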

I know there are better ways to build the database, but I have too many doubts about which one is right for my case.

Also, I don't understand the strengths and weaknesses of SURF, FlannBasedMatcher, DMatch, and RANSAC, so these choices may be entirely wrong.

Any suggestion is appreciated, thanks!
