How can a script running as a subprocess return a dictionary to its parent process?


I have two scripts, parent.py and child.py. parent.py calls child.py as a subprocess. child.py has a function that collects certain results in a dictionary, and I wish to return that dictionary to the parent process. I have tried printing the dictionary from child.py to its stdout so that the parent process can read it, but that is not helping me, because the dictionary's contents are read as strings on separate lines by the parent.

Moreover, as suggested in the comments, I tried serializing the dictionary with JSON when printing it to stdout and reading it back in the parent with JSON. That works, but I also print a lot of other information from the child to its stdout, which is eventually read by the parent too and mixes things up.

Another suggestion was to write the result from the child to a file and have the parent read from that file. That would work too, but I will be running hundreds of instances of this code in Celery, so other instances of the child would overwrite the same file.

My question is: since we have a pipe connecting the two processes, how can I write my dictionary directly into the pipe from child.py and read it back in parent.py?

# parent.py

import subprocess

proc = subprocess.Popen(['python3', 'child.py'],
                        stdin=subprocess.PIPE,
                        stdout=subprocess.PIPE)
out, err = proc.communicate()
result = out  # bytes of everything the child printed, not a dictionary
#child.py

def child_function():
    result = {}
    result[1] = "one"
    result[2] = "two"
    print(result)
    #return result
    
if __name__ == "__main__":
    child_function()

There are 3 answers

Answer by tripleee:

A subprocess running Python is in no way different from a subprocess running something else. Python doesn't know or care that the other program is also a Python program; the two have no access to each other's variables, memory, running state, or other internals. Simply imagine that the subprocess is a monolithic binary. The only ways you can communicate with it are sending and receiving bytes (which can be strings, if you agree on a character encoding) and signals (so you can kill your subprocess, or raise some other signal which it can trap and handle, like a timer; you get exactly one bit of information when the timer expires, and what you do with that bit is up to the receiver of the signal).

To "serialize" information means to encode it in a way which lets the recipient deserialize it. JSON is a good example; you can transfer a structure consisting of (possibly nested) dictionaries and lists as text, and the recipient will know how to map that stream of bytes back into the same structure.

When both sender and receiver are running the same Python version, you could also use pickles; pickle is a native Python format which allows you to transfer a richer structure. But if your needs are modest, I'd simply go with JSON.
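For completeness, here is a minimal sketch of the pickle route, with a hypothetical child inlined via `python -c` for illustration: the child must write the pickle to the binary `sys.stdout.buffer`, and the parent must capture stdout as bytes (no `text=True`):

```python
import pickle
import subprocess
import sys
import textwrap

# Hypothetical child script, inlined here so the example is self-contained.
# A real child.py would contain just this body.
child_code = textwrap.dedent("""
    import pickle, sys
    result = {1: "one", 2: "two"}
    sys.stdout.buffer.write(pickle.dumps(result))  # binary stdout
""")

proc = subprocess.run([sys.executable, '-c', child_code],
                      check=True, capture_output=True)  # bytes, not text
result = pickle.loads(proc.stdout)
print(result)  # integer keys survive, unlike with JSON
```

Note that pickle preserves the integer keys of the dictionary, whereas JSON object keys are always strings; on the other hand, unpickling data from an untrusted source is unsafe, so only use pickle when you control both ends.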

parent.py:

import subprocess
import json

# Prefer subprocess.run() over bare-bones Popen()
proc = subprocess.run(['python3', 'child.py'],
    check=True, capture_output=True, text=True)
result = json.loads(proc.stdout)

child.py:

import json
import logging

def child_function():
    result = {}
    result[1] = "one"
    result[2] = "two"
    logging.info('Some unrelated output which should not go into the JSON')
    print(json.dumps(result))
    #return result
    
if __name__ == "__main__":
    logging.basicConfig(level=logging.WARNING)
    child_function()

To avoid mixing JSON with other output, print the other output to standard error instead of standard output (or find a way to embed it in the JSON after all). The logging module is a convenient way to do that, with the added bonus that you can easily turn it off, partially or entirely (the example above turns it off via logging.basicConfig(level=logging.WARNING), which prints only messages of priority WARNING or higher and thus excludes INFO). The parent will receive these messages in proc.stderr. Note also that JSON object keys are always strings, so the parent will read the dictionary back as {"1": "one", "2": "two"}.
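A minimal sketch of this stdout/stderr split, again with a hypothetical child inlined via `python -c`: diagnostics go to stderr, so stdout carries nothing but the JSON result:

```python
import json
import subprocess
import sys
import textwrap

# Hypothetical child: logs to stderr, prints only JSON on stdout.
child_code = textwrap.dedent("""
    import json, sys
    print("progress: working...", file=sys.stderr)  # diagnostics
    print(json.dumps({"1": "one", "2": "two"}))     # the result, alone
""")

proc = subprocess.run([sys.executable, '-c', child_code],
                      check=True, capture_output=True, text=True)
result = json.loads(proc.stdout)  # clean JSON, no log lines mixed in
logs = proc.stderr                # the diagnostic output, kept separate
```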

Answer by John Zwinck:

Have the parent create a FIFO (named pipe) for the child:

import os
import subprocess

pipe_path = 'mypipe'
os.mkfifo(pipe_path)  # os.mkfifo() returns None, so it is not a context manager
proc = subprocess.Popen(['python3', 'child.py', pipe_path])
with open(pipe_path) as pipe:  # blocks until the child opens the FIFO for writing
    print(pipe.read())
proc.wait()
os.unlink(pipe_path)  # remove the named pipe when done

Now the child can do this:

import sys

pipe_path = sys.argv[1]  # FIFO path passed by the parent
with open(pipe_path, 'w') as pipe:
    pipe.write(str(result))

This keeps your communication separate from stdin/stdout/stderr.
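A single-process sketch of the same idea, runnable as-is (a thread stands in for the child, and JSON is sent instead of str(result) so the reader gets a real dictionary back; os.mkfifo is POSIX-only):

```python
import json
import os
import tempfile
import threading

# Create the FIFO in a fresh temporary directory to avoid name collisions.
fifo_path = os.path.join(tempfile.mkdtemp(), 'result_fifo')
os.mkfifo(fifo_path)

def child(path):
    # Stand-in for child.py: opening for write blocks until a reader opens.
    with open(path, 'w') as pipe:
        json.dump({"1": "one", "2": "two"}, pipe)

writer = threading.Thread(target=child, args=(fifo_path,))
writer.start()
with open(fifo_path) as pipe:  # blocks until the writer opens its end
    result = json.load(pipe)
writer.join()
os.unlink(fifo_path)
```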

Answer by Booboo:

You can get the results via a file.

parent.py:

import tempfile
import os
import subprocess
import json


fd, temp_file_name = tempfile.mkstemp() # create a uniquely named temporary file
os.close(fd) # close the descriptor; the child will reopen the file by name
proc = subprocess.Popen(['python3', 'child.py', temp_file_name]) # pass file_name
proc.communicate()
with open(temp_file_name) as fp:
    result = json.load(fp) # get dictionary from here
os.unlink(temp_file_name) # no longer need this file

child.py:

import sys
import json


def child_function(temp_file_name):
    result = {}
    result[1] = "one"
    result[2] = "two"
    with open(temp_file_name, 'w') as fp:
        json.dump(result, fp)

    
if __name__ == "__main__":
    child_function(sys.argv[1]) # pass the file name argument