I have two scripts, parent.py and child.py. parent.py calls child.py as a subprocess. child.py has a function that collects certain results in a dictionary, and I wish to return that dictionary to the parent process. I have tried printing the dictionary from child.py to its stdout so that the parent process can read it, but that does not help: the parent reads the dictionary's contents as strings on separate lines.
Moreover, as suggested in the comments, I tried serializing the dictionary with JSON when printing it to stdout and reading it back in the parent with JSON. That works fine, but I am also printing a lot of other information from the child to its stdout, which the parent ends up reading as well, and that mixes things up.
Another suggestion was to write the result from the child to a file in the directory and have the parent read that file. That would work too, but I will be running hundreds of instances of this code in Celery, so other instances of the child would overwrite that same file.
My question is: since we have a PIPE connecting the two processes, how can I write my dictionary directly into the PIPE from child.py and read it from parent.py?
# parent.py
import subprocess

proc = subprocess.Popen(['python3', 'child.py'],
                        stdin=subprocess.PIPE,
                        stdout=subprocess.PIPE)
# communicate() waits for the child and returns (stdout, stderr) as bytes
result, _ = proc.communicate()
# child.py
def child_function():
    result = {}
    result[1] = "one"
    result[2] = "two"
    print(result)
    # return result

if __name__ == "__main__":
    child_function()
A subprocess running Python is in no way different from a subprocess running something else. Python doesn't know or care that the other program is also a Python program; the two have no access to each other's variables, memory, running state, or other internals. Simply imagine that the subprocess is a monolithic binary. The only ways you can communicate with it are to send and receive bytes (which can be strings, if you agree on a character encoding) and signals (so you can kill your subprocess, or raise some other signal which it can trap and handle, like a timer; you get exactly one bit of information when the timer expires, and what you do with that bit is up to the receiver of the signal).
To "serialize" information means to encode it in a way which lets the recipient deserialize it. JSON is a good example; you can transfer a structure consisting of a (possibly nested structure of) dictionary or list as text, and the recipient will know how to map that stream of bytes into the same structure.
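As a rough illustration of that round trip, here is a minimal sketch using the dictionary from the question. Nothing crosses a pipe here; it only shows what the encoding step does to the data. The variable names are mine.

```python
import json

result = {1: "one", 2: "two"}

# Serialize: the dictionary becomes a plain text string.
encoded = json.dumps(result)

# Deserialize: the receiver rebuilds a dictionary from that string.
decoded = json.loads(encoded)

# JSON object keys are always strings, so the integer keys
# come back as "1" and "2" rather than 1 and 2.
print(decoded)
```

If the parent needs the integer keys back, it has to convert them itself after decoding.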
When both sender and receiver are running the same Python version, you could also use pickles; pickle is a native Python format which allows you to transfer a richer structure. But if your needs are modest, I'd simply go with JSON.
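For comparison, a pickle round trip might look like the sketch below. It is only meant to show the difference from JSON, and the variable names are mine.

```python
import pickle

result = {1: "one", 2: "two"}

# pickle produces bytes, so a pipe carrying it has to be used in binary mode
# (for example sys.stdout.buffer in the child, and no text=True in the parent).
payload = pickle.dumps(result)

# Only unpickle data from a source you trust:
# loading a pickle can execute arbitrary code.
restored = pickle.loads(payload)

# Unlike JSON, the integer keys survive unchanged.
print(restored)
```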
parent.py:
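A minimal sketch of what the parent side could look like, assuming the child prints exactly one JSON document on standard output and sends everything else to standard error. The helper name run_child is hypothetical, and sys.executable is used instead of 'python3' so the child runs under the same interpreter as the parent.

```python
import json
import subprocess
import sys


def run_child(script="child.py"):
    """Run the child script; return (decoded JSON result, captured stderr text)."""
    proc = subprocess.run(
        [sys.executable, script],   # same interpreter as the parent
        stdout=subprocess.PIPE,     # the JSON result arrives here
        stderr=subprocess.PIPE,     # log messages arrive here
        text=True,                  # decode both streams to str
        check=True,                 # raise CalledProcessError on a non-zero exit
    )
    return json.loads(proc.stdout), proc.stderr
```

Calling result, log_text = run_child() then gives the dictionary in result and any log output in log_text.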
child.py:
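A matching sketch of the child, again under the assumption that standard output is reserved for the JSON result. The main function and the log message text are mine; child_function follows the question, except that it returns the dictionary instead of printing it.

```python
import json
import logging
import sys

# basicConfig logs to standard error by default; level WARNING
# means the INFO messages below are not printed at all.
logging.basicConfig(level=logging.WARNING)


def child_function():
    result = {}
    result[1] = "one"
    logging.info("Found %i", 1)   # suppressed at level WARNING
    result[2] = "two"
    logging.info("Found %i", 2)   # suppressed at level WARNING
    return result


def main():
    # Standard output carries only the JSON document.
    json.dump(child_function(), sys.stdout)


if __name__ == "__main__":
    main()
```

Changing the level to logging.INFO would make the messages appear on standard error without touching what the parent reads from standard output.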
To avoid mixing JSON with other output, print the other output to standard error instead of standard output (or figure out a way to embed it into the JSON after all). The logging module is a convenient way to do that, with the added bonus that you can turn it off easily, partially or entirely (the above example demonstrates logging which is turned off via logging.basicConfig because it only selects printing of messages of priority WARNING or higher, which excludes INFO). The parent will get these messages in proc.stderr.