Detecting different shapes dynamically (circle, square, and rectangle) from the camera?


I want to create an application that detects the shape of objects (circles, squares, and rectangles only, i.e. geometric shapes) without using a markerless or edge-based approach to detect the shape in augmentation.

So far I have worked through the existing tutorials, including those that ship with the metaio SDK:

1) Metaio : http://dev.metaio.com/sdk/tutorials/hello-world/

2) OpenCV : http://docs.opencv.org/doc/tutorials/imgproc/imgtrans/canny_detector/canny_detector.html#canny-detector

These are the things I have tried to implement.

Geometric shapes:

1) Circle: in real time, this could be any circular object (example image)

2) Square: in real time, this could be any square object (example image)

3) Rectangle: in real time, this could be any rectangular object (example image)

How can I achieve this augmentation scenario?

Thanks in advance


There is 1 answer

Answer by Stephan Branczyk (best answer)

Update: This StackOverflow post (with some nice sample pictures included) seems to have solved at least the circle-detection part of your problem. The excellent write-up it points to can be found on this wiki page (unfortunately only through the Wayback Machine).

In case that new link doesn't hold either, here is the relevant section:

Detecting Images:

There are a few fiddly bits that need to be taken care of to detect circles in an image. Before you process an image with cvHoughCircles, the function for circle detection, you may wish to first convert it into a gray image and smooth it. Following is the general procedure, with the functions you need and examples of their usage.

Create Image

Supposing you have an initial image for processing called 'img', first you want to create an image variable called 'gray' with the same dimensions as img using cvCreateImage.

IplImage* gray = cvCreateImage( cvGetSize(img), 8, 1 ); 
                 // allocate a 1 channel byte image

CvMemStorage* storage = cvCreateMemStorage(0);


IplImage* cvCreateImage(CvSize size, int depth, int channels);

  size:  cvSize(width,height);

  depth: pixel depth in bits: IPL_DEPTH_8U, IPL_DEPTH_8S, IPL_DEPTH_16U,
    IPL_DEPTH_16S, IPL_DEPTH_32S, IPL_DEPTH_32F, IPL_DEPTH_64F

  channels: Number of channels per pixel. Can be 1, 2, 3 or 4. The channels 
    are interleaved. The usual data layout of a color image is
    b0 g0 r0 b1 g1 r1 ...

Convert to Gray

Now you need to convert it to gray using cvCvtColor which converts between colour spaces.

cvCvtColor( img, gray, CV_BGR2GRAY );

cvCvtColor(src,dst,code); // src -> dst

  code    = CV_<X>2<Y>
  <X>/<Y> = RGB, BGR, GRAY, HSV, YCrCb, XYZ, Lab, Luv, HLS

e.g.: CV_BGR2GRAY, CV_BGR2HSV, CV_BGR2Lab
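To make the conversion less of a black box, here is a minimal plain-C sketch of what a BGR-to-gray step does on an interleaved byte buffer. It is not OpenCV code: the function name bgr_to_gray is hypothetical, and the integer rounding is my own choice. The weights 0.299 (R), 0.587 (G), and 0.114 (B) are the usual luminance weights.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of OpenCV): converts an interleaved
 * BGR byte buffer (b0 g0 r0 b1 g1 r1 ...) into one gray byte per pixel. */
void bgr_to_gray(const uint8_t *bgr, uint8_t *gray, size_t pixel_count)
{
    assert(bgr != NULL && gray != NULL);
    for (size_t i = 0; i < pixel_count; ++i) {
        unsigned b = bgr[3 * i + 0];
        unsigned g = bgr[3 * i + 1];
        unsigned r = bgr[3 * i + 2];
        /* Weighted sum in integer arithmetic, rounded to nearest. */
        gray[i] = (uint8_t)((299u * r + 587u * g + 114u * b + 500u) / 1000u);
    }
}
```

The result is the "grid of light intensities" view of the image: one value from 0 (black) to 255 (white) per pixel.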

Smooth Image

This is done to prevent a lot of false circles from being detected. You might need to play around with the last two parameters (the kernel width and height), noting that for a Gaussian blur each of them needs to be a positive odd number.

cvSmooth( gray, gray, CV_GAUSSIAN, 9, 9 ); 
// smooth it, otherwise a lot of false circles may be detected

void cvSmooth( const CvArr* src, CvArr* dst,
               int smoothtype=CV_GAUSSIAN,
               int param1=3, int param2=0 );

src

  • The source image.

dst

  • The destination image.

smoothtype

Type of the smoothing:

  • CV_BLUR_NO_SCALE (simple blur with no scaling) - summation over a pixel param1×param2 neighborhood. If the neighborhood size is not fixed, one may use cvIntegral function.
  • CV_BLUR (simple blur) - summation over a pixel param1×param2 neighborhood with subsequent scaling by 1/(param1•param2).
  • CV_GAUSSIAN (gaussian blur) - convolving image with param1×param2 Gaussian.
  • CV_MEDIAN (median blur) - finding median of param1×param1 neighborhood (i.e. the neighborhood is square).
  • CV_BILATERAL (bilateral filter) - applying bilateral 3x3 filtering with color sigma=param1 and space sigma=param2

param1

  • The first parameter of smoothing operation.

param2

  • The second parameter of smoothing operation.

In case of simple scaled/non-scaled and Gaussian blur if param2 is zero, it is set to param1
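As an illustration of what the simple scaled blur described above computes, here is a rough plain-C sketch of a square box blur on a gray image. It is only a sketch under assumptions: the name box_blur is hypothetical, the kernel is square with an odd size, and borders are handled by repeating the edge pixels, which may differ from what cvSmooth does internally.

```c
#include <assert.h>
#include <stdint.h>

static int clamp_index(int v, int lo, int hi)
{
    return v < lo ? lo : (v > hi ? hi : v);
}

/* Hypothetical helper: averages each pixel over a ksize x ksize
 * neighborhood. src and dst must be different buffers. */
void box_blur(const uint8_t *src, uint8_t *dst,
              int width, int height, int ksize)
{
    assert(src != dst);
    assert(ksize > 0 && ksize % 2 == 1);   /* odd size keeps the kernel centered */
    int half = ksize / 2;
    unsigned area = (unsigned)(ksize * ksize);

    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            unsigned sum = 0;
            for (int dy = -half; dy <= half; ++dy) {
                for (int dx = -half; dx <= half; ++dx) {
                    /* Assumption: replicate edge pixels outside the image. */
                    int sy = clamp_index(y + dy, 0, height - 1);
                    int sx = clamp_index(x + dx, 0, width - 1);
                    sum += src[sy * width + sx];
                }
            }
            /* Scale by 1/(ksize*ksize), rounding to nearest. */
            dst[y * width + x] = (uint8_t)((sum + area / 2) / area);
        }
    }
}
```

A single bright pixel gets spread over its neighborhood, which is why isolated noise stops producing strong edges (and false circles) after smoothing.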

Detect using Hough Circle

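The function cvHoughCircles is used to detect circles in the gray image. Again, the last two parameters might need to be fiddled with: param1 is the upper threshold passed to the internal Canny edge detector, and param2 is the accumulator threshold for circle centers (the smaller it is, the more circles, including false ones, are reported).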

CvSeq* circles = 
 cvHoughCircles( gray, storage, CV_HOUGH_GRADIENT, 2, gray->height/4, 200, 100 );


CvSeq* cvHoughCircles( CvArr* image, void* circle_storage,
                       int method, double dp, double min_dist,
                       double param1=100, double param2=100,
                       int min_radius=0, int max_radius=0 );
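For intuition only, the voting idea behind a Hough-style circle search can be sketched in plain C. This is not how cvHoughCircles is implemented (CV_HOUGH_GRADIENT uses edge gradients and searches over radii). It is a brute-force variant that assumes the radius is already known, that edge points have already been extracted, and that a point "votes" for a center when its squared distance is within tol of radius squared. All names here (Point, CircleCandidate, find_circle_center) are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct { int x, y; } Point;
typedef struct { int cx, cy, votes; } CircleCandidate;

/* Hypothetical helper: tries every pixel as a candidate center and counts
 * how many edge points lie (approximately) at distance `radius` from it.
 * Returns the candidate with the most votes (the first one on ties). */
CircleCandidate find_circle_center(const Point *edges, int n,
                                   int width, int height,
                                   int radius, int tol)
{
    assert(edges != NULL && n >= 0 && radius > 0 && tol >= 0);
    CircleCandidate best = { -1, -1, 0 };

    for (int cy = 0; cy < height; ++cy) {
        for (int cx = 0; cx < width; ++cx) {
            int votes = 0;
            for (int i = 0; i < n; ++i) {
                int dx = edges[i].x - cx;
                int dy = edges[i].y - cy;
                if (abs(dx * dx + dy * dy - radius * radius) <= tol)
                    ++votes;
            }
            if (votes > best.votes) {
                best.cx = cx;
                best.cy = cy;
                best.votes = votes;
            }
        }
    }
    return best;
}
```

The vote count plays a role similar to the accumulator threshold (param2): a candidate with too few supporting edge points should be rejected rather than reported as a circle.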

======= End of relevant section =========

The rest of that wiki page is actually very good (although I'm not going to copy it here, since the rest is off-topic to the original question and StackOverflow has a size limit for answers). Hopefully, the link to the cached copy on the Wayback Machine will keep working indefinitely.

Previous Answer Before my Update:

Great! Now that you have posted some examples, I can see that you're not only after rectangles, squares, and circles; you also want to find those shapes in a 3D environment. That means potentially hunting for special cases of parallelograms and ovals that, from video frame to video frame, can eventually reveal themselves to be rectangles, squares, and/or circles (depending on how you pan the camera).

Personally, I find it easier to work through a problem myself than to try to understand how to use an existing (oftentimes very mature) library. This is not to say that my own work will be better than a mature library; it certainly won't be. It's just that once I have worked through a problem myself, it becomes easier for me to understand and use a library (which will often run much faster and smarter than my own solution).

So the next step I would take is to change the color space of the bitmap into grayscale. A color bitmap, I have trouble understanding and I have trouble manipulating, especially since there are so many different ways it can be represented, but a grayscale bitmap, that's both much easier to understand and manipulate. For a grayscale bitmap, just imagine a grid of values, with each value representing a different light intensity.

And for now, let's limit the scope of the problem to finding parallelograms and ovals inside a static 2D environment (we'll worry about processing 3D environments and moving video frames later, or should I say, you'll worry about that part yourself since that problem is already becoming too complicated for me).

And for now also, let's not worry about what tool or language you use. Just use whatever is easiest and most expedient. For instance, just about anything can be scripted to automatically convert an image to grayscale, assuming time is no issue: ImageMagick, Gimp, Marvin, Processing, Python, Ruby, Java, etc.

And with any of those tools, it should be easy to group pixels with similar enough intensities (to make the calculations more manageable) and to sort each pixel's coordinates into a separate array for each light-intensity bucket. In other words, it shouldn't be too difficult to build some sort of crude histogram of arrays, sorted by intensity, that contain each pixel's x and y positions.
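Here is one rough way that bucketing could look in plain C, as a sketch rather than a recommendation. The bucket count (8 buckets of 32 gray levels each) and all the names (PixelPos, BucketIndex, build_bucket_index) are my own assumptions. It stores all positions in a single array grouped by bucket, with start[] marking where each bucket begins.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define NUM_BUCKETS 8   /* assumption: 8 buckets of 32 gray levels each */

typedef struct { int x, y; } PixelPos;

typedef struct {
    PixelPos *positions;             /* all pixel positions, grouped by bucket */
    size_t start[NUM_BUCKETS + 1];   /* bucket b is [start[b], start[b+1]) */
} BucketIndex;

/* Hypothetical helper: groups the (x, y) position of every pixel by its
 * intensity bucket. Returns 0 on success, -1 if allocation fails.
 * The caller frees out->positions. */
int build_bucket_index(const uint8_t *gray, int width, int height,
                       BucketIndex *out)
{
    assert(gray != NULL && out != NULL && width > 0 && height > 0);
    size_t total = (size_t)width * (size_t)height;
    size_t count[NUM_BUCKETS] = {0};
    size_t next[NUM_BUCKETS];

    /* Pass 1: count how many pixels fall into each bucket. */
    for (size_t i = 0; i < total; ++i)
        count[gray[i] * NUM_BUCKETS / 256]++;

    out->start[0] = 0;
    for (int b = 0; b < NUM_BUCKETS; ++b) {
        out->start[b + 1] = out->start[b] + count[b];
        next[b] = out->start[b];
    }

    out->positions = malloc(total * sizeof *out->positions);
    if (out->positions == NULL)
        return -1;

    /* Pass 2: write each pixel's position into its bucket's slice. */
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            int b = gray[y * width + x] * NUM_BUCKETS / 256;
            PixelPos p = { x, y };
            out->positions[next[b]++] = p;
        }
    }
    return 0;
}
```

With this layout, the pixels of bucket b are positions[start[b]] up to (but not including) positions[start[b + 1]], so each intensity band can be handed to a shape-fitting step as a plain list of coordinates.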

After that, the problem becomes a problem more like this one (which can be found on StackOverflow) and thus can be worked upon with its suggested solution.

And once you're able to work through the problem in that way, converting the solution you come up with to a language better suited to the task shouldn't be too difficult. It should also be much easier to understand and use the underlying functions of whatever existing library you end up choosing. At least, that's what I'm hoping for, since I'm not familiar enough with the OpenCV libraries themselves to really help you with them.