I am currently working on a people detection and counting project. It detects any people in the scene via a USB webcam, then counts the people passing by. Currently, my setup is:
- OpenCV 2.4.6, detecting people's heads using the Haar method (floating-point processing)
- An ARM board with a quad-core Cortex-A9 CPU and a quad-core Mali GPU
Unfortunately, the processing time is not fast enough: 70-100 ms per frame (10-14 fps), so people walking at normal speed or faster are not counted. The bottleneck is the OpenCV HaarDetection method; roughly 90% of the per-frame processing time is spent there.
I tried another model besides Haar: the LBP model, which is based on integer processing. So far my LBP model is not satisfactory, and I am still working on creating new models. I also tried using TBB with OpenCV (the multithreading natively implemented in OpenCV), but it somehow causes crashes on the Odroid; the application runs stably if I do not use TBB.
The only optimization left that I can think of is to utilize the Mali GPU on the board, recompiling OpenCV with a modified HaarDetection that offloads some of the processing to the GPU. My question is: is this doable using the OpenGL library? Most OpenGL examples I have seen render graphics rather than process images.
Other optimizations you may consider:
1. Play with the parameters - even small changes to the scale factor and minimum window size can make the algorithm noticeably faster (see the detectMultiScale sketch after this list).
2. Try a different cascade.
3. Play with the OpenCV build parameters - WITH_TBB might help you (http://www.threadingbuildingblocks.org/) if your processor supports multithreading and the cascade can use more than one thread (I think that's possible - maybe not all the time, but at least some parts of it). Take a look at ENABLE_SSE and ENABLE_SSE2 as well; note that these are x86 options, so on your ARM board the equivalent to look at is ENABLE_NEON.
4. Search for other implementations of the Haar cascade detector, or try to write your own - it is possible to make it faster; see (article and comments): http://www.computer-vision-software.com/blog/2009/06/fastfurious-face-detection-with-opencv/
5. If you are analysing image sequences, check whether two consecutive frames are the same or very similar - if so, you can skip analysis of the current frame, because the results will be the same (or very similar). I used this solution in my BSc thesis (a simple eye tracker using a 720p webcam) and it worked fine. A frame-difference sketch follows this list.
6. As above, plus search only in the regions where the difference occurs.
7. Divide your image into, for example, 16 rectangles. Check the difference between the current and previous frame in each rectangle - if all rectangles in a row or column are almost the same as in the previous frame, don't analyse that row/column (pass only part of the image to the Haar cascade, using a ROI). It should give quite good results and increase speed, because people will walk/run/etc. from one side of the frame to the other - there is little chance that all rectangles will change between two consecutive frames. A grid-based sketch of this idea also follows below.
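For point 1, here is a minimal sketch of how the `detectMultiScale` parameters trade accuracy for speed, using the OpenCV 2.4 C++ API. The cascade file name and the concrete values are assumptions - tune them against your camera mounting height and frame size:

    #include <opencv2/objdetect/objdetect.hpp>
    #include <opencv2/imgproc/imgproc.hpp>
    #include <opencv2/highgui/highgui.hpp>
    #include <vector>

    int main()
    {
        cv::CascadeClassifier cascade;
        if (!cascade.load("head_cascade.xml"))   // your trained head model
            return -1;

        cv::VideoCapture cap(0);
        cv::Mat frame, gray;
        std::vector<cv::Rect> heads;

        while (cap.read(frame))
        {
            cv::cvtColor(frame, gray, CV_BGR2GRAY);

            // scaleFactor 1.2 instead of the default 1.1 builds fewer
            // pyramid levels; minSize/maxSize skip window sizes your camera
            // will never produce. Both cut the search space considerably.
            cascade.detectMultiScale(gray, heads,
                                     1.2,                 // scaleFactor
                                     3,                   // minNeighbors
                                     0,                   // flags
                                     cv::Size(40, 40),    // minSize
                                     cv::Size(120, 120)); // maxSize
            // ... count/track 'heads' here ...
        }
        return 0;
    }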
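For points 5 and 6, a minimal sketch of the frame-skipping check. The mean-difference threshold is an assumption you should calibrate on your own footage:

    #include <opencv2/core/core.hpp>
    #include <opencv2/imgproc/imgproc.hpp>

    // True when two consecutive grayscale frames are so similar that the
    // previous detection result can be reused instead of rerunning the
    // cascade. The threshold is an assumption - calibrate it yourself.
    bool framesAlmostEqual(const cv::Mat& prev, const cv::Mat& curr,
                           double meanDiffThreshold = 2.0)
    {
        cv::Mat diff;
        cv::absdiff(prev, curr, diff);   // per-pixel |prev - curr|
        return cv::mean(diff)[0] < meanDiffThreshold;
    }

In the capture loop, call this with the previous and current grayscale frames and reuse the last `heads` vector when it returns true.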
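And for point 7, a sketch of the grid idea: find which cells of a 4x4 grid changed between two frames, take their bounding box, and hand only that ROI to the cascade. The cell count and threshold are again assumptions to tune:

    #include <opencv2/core/core.hpp>
    #include <opencv2/imgproc/imgproc.hpp>

    // Bounding box of all grid cells whose mean absolute difference
    // between two frames exceeds the threshold. An empty rect means
    // nothing moved, so detection can be skipped entirely.
    cv::Rect changedRegion(const cv::Mat& prev, const cv::Mat& curr,
                           int cells = 4, double cellThreshold = 3.0)
    {
        cv::Mat diff;
        cv::absdiff(prev, curr, diff);

        const int cw = diff.cols / cells;
        const int ch = diff.rows / cells;
        cv::Rect region;   // stays empty until the first changed cell

        for (int r = 0; r < cells; ++r)
            for (int c = 0; c < cells; ++c)
            {
                cv::Rect cell(c * cw, r * ch, cw, ch);
                if (cv::mean(diff(cell))[0] > cellThreshold)
                    region = (region.area() == 0) ? cell : (region | cell);
            }
        return region;
    }

Run `detectMultiScale` on `gray(region)` and shift each returned rectangle by `region.tl()` to map it back to full-frame coordinates.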