Computer vision over the cloud


Is there any way of doing computer vision over the cloud? The idea: a user logs in to a website, the webcam is activated, and the video data is sent to a server over the internet. The server processes the data and sends the results back to the user in real time, or at 10 frames per second at least.

Is this doable? What kind of skills do we need on the network side? I know video streaming is one component. Also, how can we set up the server? Can a distributed system help, considering the very large computation required in limited time?


There are 3 answers

samfr

This will only be worth it if

1) you can compress your image data or features enough to be viable with whatever bandwidth the user has

2) the computations you are doing are big/complex enough that they are not doable in the browser

If you determine that both of these are true, then the easiest approach might be to send your features, or the image itself, via WebSockets to a server that is ready to classify them or do whatever processing you need. Maybe look at the Tornado WebSocket framework for Python; then you could integrate with the Python OpenCV bindings without too much trouble. Based on the info you have given, it is hard for me to say much more.

Whether or not a distributed system will help depends on what you intend to do (which CV algorithm), but it most likely will if you have the capability of implementing one.

I would encourage you to look at JavaScript solutions in the browser, because network latency will be a big issue.

BostonCVGuy

See http://vision.ai/. They are running a Kickstarter for a thin-client computer vision application where the computer vision happens on a remote server. They have object detectors, trackers, and other widgets, plus methods for training these capabilities. Fund them if you want to see it happen.

vbence

The different scale-space detection levels can run in parallel, and the database you compare your images against can be distributed over a number of servers.

As I understand it, you want to create a kind of augmented reality. I cannot give a clear yes or no on whether it can be done with current mobile CPU power and bandwidth.

I would start by implementing a very rudimentary feature detection on the client side, then sending still pictures to the server (high resolution is the key). The server can process the image with large computing power, check the objects against the database, and send back the result.

The client can then combine its very basic feature detection with the server's response and in this way create a real-time "labelled" video. The server only has to be called when the client detects that new image data is available (e.g., the user turns the phone in a different direction).