zmq ventilator/worker/sink paradigm not working w/ subprocess

354 views Asked by At

I am trying to replicate the ventilator/workers/sink paradigm described in the ZMQ guide. I have the same Python Ventilator, the same C++ worker as, and the same Python Sink as was described in the ZMQ examples. I want to launch the ventilator, workers, and sink from one main python script, so I created "class" wrappers around the ventilator & sink, and both of those classes subclass the Python module "multiprocessing.Process." Since the C++ is a binary, I launch it with Python's subprocess.Popen call.

The order of starting all of this up is as follows:

h = subprocess.Popen('test')  # test is the name of the binary
time.sleep(1)
s = sinkObj.start()
time.sleep(1)
v = ventObj.start()

What I am finding is that no data is getting through the system when I start up the components like this. However, if I start the C++ binary in its own shell, and only start the sinkObj and ventObj from the main python script, it works fine.

I apologize in advance if this is more of a Python question than a ZMQ question, but I haven't run into issues like this w/ Python's subprocess. I have also tried using os.system() instead of the subprocess... but same issue. I put all the code on this website: https://github.com/kkarrancsu/zmqtest if anybody is curious to test it out. There is a readme on that git which tells you what the files are.

Any ideas on why this could be happening?

------------------------- UPDATE --------------------

I found that if I create a shell script which simply launches the C binary, and call that shell script w/ os.system('run_the_shell_script') it works! So this means that there is something wrong with the way that I am using subprocess.Popen(...), but can't seem to pinpoint what the issue is. I tried w/ the shell=True flag, but it still hangs with that...

2

There are 2 answers

1
Tisho On

As I see, the ventilator and sink bind() to ports 6557 and 6558, and the C++ app connect() to these ports. In this case, if you start the cpp app first, it will try to connect() to the endpoints, but as nothing is bound there, it will drop silently. In ZeroMQ the basic principle is "First Bind, then Connect". So you should not connect() before you bind() something on the socket. Imagine bind() is the 'Server', and connect() is the client. You cannot connect client to non-existing server. Also, in ZeroMQ every socket can be 'Server', but you SHOULD HAVE only 1 bind()-ing socket per URL. And you can have multiple 'connect()'s.

0
flyer On

It's the name of the worker binary file that causes the problem.

There two solutions:

  • Chang the name of the binary file test to test_new and do the same in your All.py file, and then it will work as you desire.
  • Substitute subprocess.Popen('./test', shell=True) for subprocess.Popen('test', shell=True).

test is Linux command. If you type the following in your shell

$ echo $PATH

You may see that . is at the last position. It means that until shell couldn't find the binary file to be executed in the directories that $PATH indicates, it will try to search for it in the current directory .
When you execute subprocess.Popen('test', shell=True), it could find it before trying the . directory and so it won't execute the workers.