I am currently working with ASUS Xtion on the extraction of the face ROI in a depth+color stream. At the moment, I am manually grabbing the frames with OpenNI2 instead of using the PCL wrappers.
After grabbing the depth map, I generate the point cloud both as a cv::Mat and as a pcl::PointCloud, so that I can run Fanelli's decision-forest method twice: once with the implementation Fanelli provides himself (http://www.vision.ee.ethz.ch/~gfanelli/head_pose/head_forest.html) and once with the PCL version of that method.
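For reference, the depth-to-point-cloud step is plain pinhole back-projection. This is a minimal sketch with stand-in types (no OpenCV/PCL needed to compile it); the intrinsics below are the commonly quoted PrimeSense 640x480 defaults, which are an assumption here — the real values should come from the OpenNI2 device.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical pinhole intrinsics; ~570.3 px focal length is the
// often-cited PrimeSense default for 640x480, used here only as an
// illustrative assumption.
struct Intrinsics { float fx, fy, cx, cy; };

struct Point3f { float x, y, z; };

// Back-project one depth pixel (u, v, depth z in meters) to a 3D point.
Point3f backProject(int u, int v, float z, const Intrinsics& in) {
    Point3f p;
    p.x = (u - in.cx) * z / in.fx;
    p.y = (v - in.cy) * z / in.fy;
    p.z = z;
    return p;
}
```

Running this per pixel fills either the cv::Mat or the pcl::PointCloud; the math is identical for both, so any discrepancy between the two pipelines should come from elsewhere.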
Fanelli's code performs normally, i.e. with respect to execution time and detection rate. However, when I feed my pcl::PointClouds to the PCL forest, I hardly get any detected heads, and the execution times are quite slow.
Has anyone else already compared these two implementations (of the same method!)? I am trying to figure out the differences between them, besides the data representation and the pre-trained forests that come with each implementation (the forests were trained on the same DB, though!).
Basically, I am doing it like this:
<grab frames with OpenNI2 library>
<call processNewFrame method of a frame processing class>
within that processNewFrame method:
<convert float[] depthmap into cv::Mat and pcl::PointCloud<pcl::PointXYZRGB> pointcloud>
CRForestEstimator->estimate(..., cv::Mat pointcloud, ...)
pcl::RFFaceDetectorTrainer->setInputCloud(PointXYZ version of pointcloud)
pcl::RFFaceDetectorTrainer->detectFaces()
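One thing I would double-check at the conversion step (a guess on my part, not a confirmed cause): whether the cloud handed to pcl::RFFaceDetectorTrainer stays *organized*, i.e. keeps the sensor's width x height layout with NaN entries for invalid depth pixels, rather than silently dropping invalid pixels and producing an unorganized cloud. A sketch of the organized construction, using a stand-in Cloud struct that only mimics pcl::PointCloud for illustration (with real PCL you would set cloud.width, cloud.height, and is_dense instead):

```cpp
#include <cmath>
#include <limits>
#include <vector>

// Stand-in for pcl::PointCloud<pcl::PointXYZ>: one entry per depth
// pixel in row-major width x height order, NaN where depth is invalid.
struct Cloud {
    int width = 0, height = 0;
    bool is_dense = true;        // false once NaNs are present
    std::vector<float> x, y, z;  // parallel arrays, width*height entries
};

Cloud makeOrganizedCloud(const std::vector<float>& depth, int w, int h,
                         float fx, float fy, float cx, float cy) {
    const float nan = std::numeric_limits<float>::quiet_NaN();
    Cloud c;
    c.width = w; c.height = h;
    c.x.resize(w * h); c.y.resize(w * h); c.z.resize(w * h);
    for (int v = 0; v < h; ++v) {
        for (int u = 0; u < w; ++u) {
            const int i = v * w + u;
            const float d = depth[i];
            if (d <= 0.0f) {
                // Invalid reading: keep the slot so the grid layout
                // survives, mark the point as NaN.
                c.x[i] = c.y[i] = c.z[i] = nan;
                c.is_dense = false;
            } else {
                c.x[i] = (u - cx) * d / fx;
                c.y[i] = (v - cy) * d / fy;
                c.z[i] = d;
            }
        }
    }
    return c;
}
```

If the PCL detector internally relies on the organized structure (e.g. for image-like patch features), feeding it an unorganized cloud could plausibly explain both the poor detection rate and the slow runtimes — but that is speculation I have not verified against the PCL source.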
That frame-processing class initializes the CRForestEstimator (held as a pointer member) as well as the pcl::RFFaceDetectorTrainer (held as a regular member variable) in its constructor, following the provided sample code.