Implementing object detection like OpenCV

843 views Asked by Human At 13 September 2017 at 13:33

I'm trying to implement the Viola-Jones algorithm for object detection using Haar cascades (like OpenCV's implementation) in C, to detect faces. I am writing the C code in a Vivado HLS-compatible way, so I can port the implementation to an FPGA. My main goal is to learn as much as possible, rather than just getting it to work. I would also appreciate any help with improving my question.

I started by reading G. Bradski's Learning OpenCV, watched some online tutorials, and then started writing the code. Sure enough, it's not detecting faces and I don't know why. At this point I care more about understanding my mistakes than about being able to detect faces.

My Implementation Steps

I'm not sure how much detail is appropriate, but to keep it short:

Extracting the Haar cascade data from haarcascade_frontalface_default.xml into C-readable structures (huge arrays)
Writing a function to create the integral image of any given 8-bit greyscale image of size 24x24 (the same size as listed in the cascade)
Applying knowledge from this great post to make the necessary calculations
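
For reference, the integral image step in the list above can be sketched as follows. This is a minimal sketch, not the code from the question: it assumes a zero-padded layout with one extra row and column (25x25 entries instead of 24x24), so that rectangles touching the top or left edge need no special case. The names `II_WIDTH`, `integral_image` and `rect_sum` are hypothetical.

```c
#include <stdint.h>

#define WIN_WIDTH 24              /* window width = height, as in the cascade */
#define II_WIDTH (WIN_WIDTH + 1)  /* assumption: one extra zero row and column */

/* Build a zero-padded integral image from a row-major 24x24 image.
 * ii[r * II_WIDTH + c] holds the sum of all pixels with row < r and col < c. */
void integral_image(const uint8_t img[WIN_WIDTH * WIN_WIDTH],
                    int ii[II_WIDTH * II_WIDTH])
{
    for (int c = 0; c < II_WIDTH; c++) {
        ii[c] = 0;                       /* zero top border */
    }
    for (int r = 1; r < II_WIDTH; r++) {
        int row_sum = 0;                 /* running sum of the current image row */
        ii[r * II_WIDTH] = 0;            /* zero left border */
        for (int c = 1; c < II_WIDTH; c++) {
            row_sum += img[(r - 1) * WIN_WIDTH + (c - 1)];
            ii[r * II_WIDTH + c] = ii[(r - 1) * II_WIDTH + c] + row_sum;
        }
    }
}

/* Sum of the pixels inside the rectangle whose top-left pixel is (x, y). */
int rect_sum(const int ii[II_WIDTH * II_WIDTH], int x, int y, int w, int h)
{
    return ii[(y + h) * II_WIDTH + (x + w)]
         - ii[y * II_WIDTH + (x + w)]
         - ii[(y + h) * II_WIDTH + x]
         + ii[y * II_WIDTH + x];
}
```

With this layout, a rectangle at x = 0 or y = 0 reads the zero border rather than an index before the start of the array.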

My Testing Scheme

Implementing a Python script that detects faces using the OpenCV library with the same Haar cascade as mentioned above, to create golden data. Each detected face is cut out of the image (ensuring 24x24 size) and stored.
Stored images are converted to one-dimensional C arrays containing the pixel values row-wise: img = {row0col0, row0col1, row1col0, row1col1, ... }
The integral image is calculated and face detection is applied.
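
The row-wise layout described above means that pixel (row, col) of an image lives at index row * width + col in the flat array. A small sketch of that mapping, with hypothetical helper names `flat_index` and `flatten_window`:

```c
#include <stdint.h>

#define WIN_WIDTH 24 /* window width = height */

/* Index of pixel (row, col) in a flat, row-major array of the given width. */
int flat_index(int row, int col, int width)
{
    return row * width + col;
}

/* Copy a 24x24 two-dimensional image into a flat row-major array. */
void flatten_window(uint8_t src[WIN_WIDTH][WIN_WIDTH],
                    uint8_t dst[WIN_WIDTH * WIN_WIDTH])
{
    for (int row = 0; row < WIN_WIDTH; row++) {
        for (int col = 0; col < WIN_WIDTH; col++) {
            dst[flat_index(row, col, WIN_WIDTH)] = src[row][col];
        }
    }
}
```

For the two-column example in the list, row1col0 lands at index 1 * 2 + 0 = 2, directly after row0col1.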

Result

Faces pass only 6 of the 25 stages of the Haar cascade and are therefore not detected by my implementation, although I know they should be detected, since the Python script using OpenCV with the same Haar cascade did detect them.

My Code

 /*
 * This is detectFace.c
 */

#include <stdio.h>
#include "detectFace.h"

// define constants based on Haar cascade in use
// Each feature is made of max 3 rects
//#define FEAT_NO 1     // max no. of features (= 2912 for face_default.xml)
#define RECTS_IN_FEAT 3 // max no. of rect's per feature
//#define INTS_IN_RECT 5    // no. of int's needed to describe a rect
// each node has one feature (bijective relation) and three doubles
#define STAGE_NO 25 // no. of stages
#define NODE_NO 211 // no of nodes per stage, corresponds to FEAT_NO since each Node has always one feature in haarcascade_frontalface_default.xml
//#define ELMNT_IN_NODE 3   // no. of doubles needed to describe a node

// constants for frame size
#define WIN_WIDTH 24 // width = height =24

//int detectFace(int features[FEAT_NO][RECTS_IN_FEAT][INTS_IN_RECT], double stages[STAGE_NO][NODE_NO][ELMNT_IN_NODE], double stageThresh[STAGE_NO], int ii[24][24]){
int detectFace(
    int ii[576],
    int stageNum,
    int stageOrga[25],
    float stageThresholds[25],
    float nodes[8739],
    int featOrga[2913],
    int rectangles[6383][5])
{
    int passedStages = 0; // number of stages passed in this run
    int faceDetected = 0; // turns to 1 if a face is detected and to 0 if it's not detected
    // Debug:
    int nodesUsed = 0; // number of floats out of nodes[] processed, use to skip to the unprocessed floats
    int rectsUsed = 0; // number of rects processed
    int droppedInStage0 = 0;

    // loop through all stages
    int i;
detectFace_label1:
    for (i = 0; i < STAGE_NO; i++)
    {
        double tmp = 0.0;           //variable to accumulate node-values, to then compare to stage threshold
        int nodeNum = stageOrga[i]; // get number of nodes for this stage from stageOrga using stage index i
        // loop through nodes inside each stage
        // NOTE: it is assumed that each node maps to one corresponding feature. Ex: node[0] has feat[0] and node[1] has feat[1]
        // because this is how it is written in the haarcascade_frontalface_default.xml
        int j;
    detectFace_label0:
        for (j = 0; j < NODE_NO; j++)
        {
            // a node is defined by 3 values:
            double nodeThresh = nodes[nodesUsed]; // the first value is the node threshold
            double lValue = nodes[nodesUsed + 1]; // the second value is the left value
            double rValue = nodes[nodesUsed + 2]; // the third value is the right value
            int sum = 0;                          // contains the weighted value of rectangles in one Haar feature
            // loop through rect's in a feature, some have 2 and some have 3 rect's.
            // Each node always refers to one feature in a way that node0 maps to feature0 and node1 to feature1 (the XML file is built like that)
            //int rectNum = featOrga[j]; // get number of rects for current feature using current node index j
            int k;
        detectFace_label2:
            for (k = 0; k < RECTS_IN_FEAT; k++)
            {
                int x = 0, y = 0, width = 0, height = 0, weight = 0, coordUpL = 0, coordUpR = 0, coordDownL = 0, coordDownR = 0;

                // a rect is defined by 5 values:
                x = rectangles[rectsUsed][0];      // the first value is the x coordinate of the top left corner pixel
                y = rectangles[rectsUsed][1];      // the second value is the y coordinate of the top left corner pixel
                width = rectangles[rectsUsed][2];  // the third value is the width of the current rectangle
                height = rectangles[rectsUsed][3]; // the fourth value is the height of this rectangle
                weight = rectangles[rectsUsed][4]; // the fifth value is the weight of this rectangle

                // calculating 1-Dim index for points of interest. Formula: index = width * row + column, assuming values are stored in row order
                coordUpL = ((WIN_WIDTH * y) - WIN_WIDTH) + (x - 1);
                coordUpR = coordUpL + width;
                coordDownL = coordUpL + (height * WIN_WIDTH);
                coordDownR = coordDownL + width;

                // calculate the area sum according to Viola-Jones
                //sum += (ii[x][y] + ii[x+width][y+height] - ii[x][y+height] - ii[x+width][y]) * weight;
                sum += (ii[coordUpL] + ii[coordDownR] - ii[coordUpR] - ii[coordDownL]) * weight;
                // Debug: counting the number of actual rectangles used
                rectsUsed++; //
            }
            // decide whether the result of the feature calculation reaches the node threshold
            if (sum < nodeThresh)
            {
                tmp += lValue; // add left value to tmp if node threshold was not reached
            }
            else
            {
                tmp += rValue; // add right value to tmp if node threshold was reached
            }
            nodesUsed = nodesUsed + 3; // one node is processed, increase nodesUsed by number of floats needed to represent a node (3)
        }
        //########  at this point we went through each node in the current stage #######
        // check if threshold of current stage was reached
        if (tmp < stageThresholds[i])
        {
            faceDetected = 0; // if any stage threshold is not reached the operation is done and no face is present
            // Debug: show in which stage the frame was dropped
            printf("Face detection failed in stage %d \n", i);
            //i = stageNum;         // breaks out this loop, because i is supposed to stay smaller than STAGE_NO
        }
        else
        {
            passedStages++; // stage threshold is reached, therefore passedStages will count up
        }
    }
    //########  at this point we went through all stages ###############################
    //----------------------------------------------------------------------------------
    // if the number of passed stages reaches the total number of stages, a face is detected
    if (passedStages == stageNum)
    {
        faceDetected = 1; // one symbolizes that the input is a face
    }
    else
    {
        faceDetected = 0; // zero symbolizes that the input is not a face
    }
    return faceDetected;
}
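
For comparison, here is a minimal sketch of one way to walk a flat cascade layout like the one above. It is not a confirmed fix and not OpenCV's implementation. It assumes that `stageOrga[i]` holds the node count of stage i and `featOrga[n]` the rectangle count of feature n (as the comments in the code suggest), that the integral image is zero-padded with row width `ii_width`, and it leaves out the window variance normalization described in the original Viola-Jones paper. The function name `run_cascade` and its parameter names are hypothetical.

```c
/* Hypothetical flat cascade layout, mirroring the arrays in the question:
 *   stage_nodes[s]  : number of nodes in stage s             (stageOrga)
 *   stage_thresh[s] : threshold of stage s                   (stageThresholds)
 *   nodes[3 * n]    : node threshold, left value, right value (nodes)
 *   feat_rects[n]   : number of rectangles in feature n       (featOrga)
 *   rects[r]        : x, y, width, height, weight             (rectangles)
 * Returns the number of stages passed; the window counts as a detection
 * only if the return value equals stage_count. */
int run_cascade(const int *ii, int ii_width, int stage_count,
                const int *stage_nodes, const float *stage_thresh,
                const float *nodes, const int *feat_rects,
                const int (*rects)[5])
{
    int node = 0; /* running node index across all stages */
    int rect = 0; /* running rectangle index across all features */

    for (int s = 0; s < stage_count; s++) {
        float stage_sum = 0.0f;
        /* visit only the nodes that belong to this stage */
        for (int j = 0; j < stage_nodes[s]; j++, node++) {
            float feature = 0.0f;
            /* visit only the rectangles that belong to this feature */
            for (int k = 0; k < feat_rects[node]; k++, rect++) {
                int x = rects[rect][0], y = rects[rect][1];
                int w = rects[rect][2], h = rects[rect][3];
                int sum = ii[(y + h) * ii_width + (x + w)]
                        - ii[y * ii_width + (x + w)]
                        - ii[(y + h) * ii_width + x]
                        + ii[y * ii_width + x];
                feature += (float)(sum * rects[rect][4]);
            }
            stage_sum += (feature < nodes[3 * node]) ? nodes[3 * node + 1]
                                                     : nodes[3 * node + 2];
        }
        if (stage_sum < stage_thresh[s]) {
            return s; /* rejected: stop at the first failed stage */
        }
    }
    return stage_count; /* every stage passed */
}
```

Driving the inner loops from the per-stage and per-feature counts keeps the running `node` and `rect` indices aligned with the data, and returning at the first failed stage avoids evaluating later stages on a window that is already rejected.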

There are 0 answers
