Machine Learning: Question regarding processing of RGBD streams and involved components

I would like to experiment with machine learning (especially CNNs) on the aligned RGB and depth streams of either an Intel RealSense or an Orbbec Astra camera. My goal is to do some object recognition and highlight/mark the detected objects in the output video stream (as a starting point).

But after having read many articles I am still confused about which frameworks are involved and how the data flows from the camera through the individual software components. I just can't get a high-level picture.

This is my assumption regarding the processing flow:

Sensor => Driver => libRealSense / Astra SDK => TensorFlow

Questions

  • Is my assumption correct regarding the processing?
  • Orbbec provides an additional Astra OpenNI SDK besides the Astra SDK, whereas Intel has wrappers (?) for OpenCV and OpenNI. When or why would I need these additional libraries/support?
  • What would be the quickest way to get started? I would prefer C# over C++.

There is 1 answer

nessuno (accepted answer)
  • Your assumptions are correct: the data acquisition flow is sensor -> driver -> camera library -> other libraries built on top of it (see the OpenCV support for Intel RealSense) -> captured image. Once you have the image, you can of course do whatever you want with it.
  • The various libraries allow you to work easily with the device. In particular, OpenCV compiled with Intel RealSense support lets you use OpenCV's standard data acquisition stream without bothering about the image format coming from the sensor and used by the Intel library. 10/10 use these libraries, they make your life easier.
  • You can start from the OpenCV wrapper documentation for Intel RealSense (https://github.com/IntelRealSense/librealsense/tree/master/wrappers/opencv). Once you are able to capture the RGBD images, you can create the input pipeline for your model using tf.data and develop in TensorFlow any application that uses CNNs on RGBD images (just Google it and look on arXiv to get ideas about possible applications); a minimal capture sketch follows below.
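As a concrete illustration of that capture-plus-tf.data pipeline, here is a minimal sketch in Python (the tf.data suggestion implies the Python TensorFlow API, even though you prefer C#), using the pyrealsense2 wrapper of librealsense. The resolutions, frame rate, normalization and the generator name `rgbd_frames` are illustrative choices of mine, not something fixed by the SDKs:

```python
# Minimal sketch, assuming pyrealsense2 (librealsense Python wrapper) and TensorFlow 2.x.
import numpy as np
import pyrealsense2 as rs
import tensorflow as tf

def rgbd_frames():
    """Yield RGBD frames with depth aligned to the color stream."""
    pipeline = rs.pipeline()
    config = rs.config()
    config.enable_stream(rs.stream.depth, 640, 480, rs.format.z16, 30)
    config.enable_stream(rs.stream.color, 640, 480, rs.format.rgb8, 30)
    pipeline.start(config)
    align = rs.align(rs.stream.color)   # align depth frames to the color frame
    try:
        while True:
            frames = align.process(pipeline.wait_for_frames())
            depth = frames.get_depth_frame()
            color = frames.get_color_frame()
            if not depth or not color:
                continue
            rgb = np.asanyarray(color.get_data())            # (480, 640, 3) uint8
            d = np.asanyarray(depth.get_data())[..., None]   # (480, 640, 1) uint16
            # Stack into a 4-channel RGBD image; depth is left in raw sensor
            # units here, you may want to scale it for your model.
            yield np.concatenate([rgb.astype(np.float32) / 255.0,
                                  d.astype(np.float32)], axis=-1)
    finally:
        pipeline.stop()

# tf.data input pipeline built on top of the camera generator.
dataset = tf.data.Dataset.from_generator(
    rgbd_frames,
    output_signature=tf.TensorSpec(shape=(480, 640, 4), dtype=tf.float32),
).batch(1)
```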

Once your model has been trained, just export the trained graph and use it for inference; your pipeline then becomes: sensor -> driver -> camera library -> libs -> RGBD image -> trained model -> model output
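A hedged sketch of that inference stage, continuing the Python example above (the SavedModel path "trained_model/" and the drawing step are placeholders, not part of the original answer):

```python
import tensorflow as tf

# Load the exported model (placeholder path for wherever you saved it).
model = tf.keras.models.load_model("trained_model/")

# `dataset` is the RGBD capture pipeline from the sketch above.
for batch in dataset:
    predictions = model(batch, training=False)
    # ... use `predictions` to draw boxes/labels on the color frame
    # (e.g. with OpenCV) and display the annotated video stream.
```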