I am trying to develop an application that needs to know the location of tagged objects in an image. Knowing that there is a "piano" in an image is not enough; I need to know where that piano is in the image.
Both Microsoft's Computer Vision API and Google's Cloud Vision API provide some form of cropping suggestion or smart thumbnail generation, which leads me to think that the location of certain objects is being detected. Is there a way to get that information (such as a bounding box around each detected object) from either API?
EDIT: I understand that both APIs can return the location of faces detected in an image. However, I am looking for the location and size of every object in an image: cars, pianos, trees, people... anything.
Microsoft's Computer Vision API offers no pixel coordinates for detected objects (see the returned features: https://dev.projectoxford.ai/docs/services/56f91f2d778daf23d8ec6739/operations/56f91f2e778daf14a499e1fa).
However, if you want to detect people, the Microsoft API can return the coordinates of face rectangles.
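As a rough illustration of working with those face rectangles, here is a minimal Python sketch that converts them into corner-style bounding boxes. It does not call the API; it parses a hand-written sample response. The JSON shape (a `faces` list whose items carry a `faceRectangle` with `left`, `top`, `width` and `height`) is my assumption about what the service returns, and `SAMPLE_RESPONSE` and `face_boxes` are names made up for this example, so check the linked documentation before relying on it.

```python
import json

# Hand-written sample, NOT real API output. The field names below are an
# assumption about the response shape; verify them against the API docs.
SAMPLE_RESPONSE = """
{
  "faces": [
    {"faceRectangle": {"left": 120, "top": 45, "width": 80, "height": 80}},
    {"faceRectangle": {"left": 310, "top": 60, "width": 72, "height": 72}}
  ]
}
"""


def face_boxes(response_text):
    """Return one (x_min, y_min, x_max, y_max) pixel box per detected face."""
    data = json.loads(response_text)
    boxes = []
    # A response without a "faces" key is treated as "no faces found".
    for face in data.get("faces", []):
        rect = face["faceRectangle"]
        x_min, y_min = rect["left"], rect["top"]
        boxes.append((x_min, y_min, x_min + rect["width"], y_min + rect["height"]))
    return boxes


if __name__ == "__main__":
    for box in face_boxes(SAMPLE_RESPONSE):
        print(box)
```

This only covers faces. It does not give you boxes for pianos, cars or trees, which is the limitation described above.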