Peeking stdout of subprocess.Popen objects does not behave correctly, am I missing something?

1.9k views Asked by At

To be exact, it does not update until everything it contains has been read(but only if the stream has been read at least once), which makes it effectively dysfunctional.

Pardon the weird example, but I'm presently trying to write a simple graphical ping monitor:

import tkinter as tk

from subprocess import Popen, PIPE, STDOUT
import shlex, re
from sys import stdout, platform

class Ping(object):
    def __init__(self):
        if platform == "win32":
            command = shlex.split("ping -w 999 -t 8.8.8.8")
        elif platform == "linux" or platform == "osx":
            command = shlex.split("ping -W 1 8.8.8.8")
        self.ping = Popen(command, stdout=PIPE, stderr=STDOUT, shell=True)
        self.ping.stdout.readline()
        self.ping.stdout.readline()
    def get_next_ping(self):
        has_line = str.find(self.ping.stdout.peek().decode("ascii", "ignore"), "\n") != -1
        if not has_line:
            print(self.ping.stdout.peek()) # Debug statement
            return None
        else:
            line = self.ping.stdout.readline().decode("ascii", "ignore")
            print(line) # Debug statement
            try: return int(float(re.findall("([0-9]+)[^m]?ms", line)[0]))
            except IndexError: return -1

class App(tk.Tk):
    def __init__(self):
        tk.Tk.__init__(self)
        self.pingmon = Ping()
        self.bind("<ButtonPress-1>", self.check_buffer)

    def check_buffer(self, event):
        print(self.pingmon.get_next_ping())
app=App()
app.mainloop()

In this example, when you click, the subprocess is polled to see if a new line(containing output ping, or timeout message) is available. If you run the project and begin clicking immediately, you will notice that output of peek() has stopped updating and is always b'Reply from 8.8.8.8: '.

I have also tried an alternative method of checking the length of peek's output, but it is apparently never equal to zero, so that is worthless as well.

Further, I attempted to invoke flush() method of the stream, but it does not appear to in any way help the situation either

The final result is that subprocess.Popen.stdout.peek() appears to be dysfunctional, and not usable for its intended purpose of peeking into the output bufer, but Python is a mature language, and I would not expect to find this kind of bug in it, is there anything I am missing? If not, how can I work around this issue?

2

There are 2 answers

1
DoDoSmarts On

@Llamageddon I think the filepointer needs to be moved to refresh the buffer in the if not has_line check using a readline(). Peek doesn't advance the pointer so you've essentially got a bug that will keep "peaking" at an empty filebuffer.

if not has_line:
    print(self.ping.stdout.peek()) # Debug statement

    self.ping.stdout.readline() # Should refresh the filebuffer.

    return None

re: peek() can be used to look at large file buffers and perhaps is not a good use for your work considering the response size; however, I think a good example when peek() is not "dysfunctional, and not usable" :) is when the line in the buffer is 100,000 characters long and looking at the first 100 characters will be sufficient to evaluate what to do with the line (i.e. skip it or apply additional logic). Peak would allow us to perform the look and evaluate all while minimizing the block time.

6
ShmulikA On

Answer

just use readline() method. if no line exists it returns empty bytes object - b''

example usage of readline():

from subprocess import Popen, PIPE, STDOUT

curdir = Popen(['pwd'], stdout=PIPE, stderr=STDOUT)
print(curdir.stdout.readline())
print(curdir.stdout.readline())
print(curdir.stdout.readline())

this will output (on python3):

b'/home/shmulik\n'
b''
b''

for your case this is updated get_next_ping() func (also changed the regex a bit)

def get_next_ping(self):
    line = self.ping.stdout.readline()
    if not line:
        return

    line = line.decode('utf-8', 'ignore')
    print(line)  # Debug statement
    try:
        return int(float(re.search(r'([0-9.]+)[^m]?ms', line).group(1)))
    except (IndexError, AttributeError):
        return -1

Non-Blocking

if you care from blocking operations, please take a look at this so answer

you can use the select module on unix to read from stdout in non-blocking fashion or run background thread to update a buffer for reading.

Example using thread for buffering

class Ping(object):
    def __init__(self):
        if platform == "win32":
            command = shlex.split("ping -w 999 -t 8.8.8.8")
        elif platform == "linux" or platform == "osx":
            command = shlex.split("ping -W 1 8.8.8.8")
        self.ping = Popen(command, stdout=PIPE, stderr=STDOUT, shell=True)
        self.ping.stdout.readline()
        self.ping.stdout.readline()

        self.lines = []  # lines will be added here by the background thread
        self.lines_lock = Lock()  # avoid race conditions on .pop()
        self.lines_reader_thread = Thread(target=self._readlines)  # start the background thread
        self.lines_reader_thread.daemon = True
        self.lines_reader_thread.start()

    def _readlines(self):
        line = None

        while line or line is None:
            line = self.ping.stdout.readline().decode()
            with self.lines_lock:
                self.lines.append(line)

    def get_next_ping(self):
        with self.lines_lock:
            if not self.lines:
                return
            line = self.lines.pop()

        print(line)  # Debug statement
        try:
            return int(float(re.search(r'([0-9.]+)[^m]?ms', line).group(1)))
        except (IndexError, AttributeError):
            return -1

Suggestions

  1. use an existing python lib for ping instead of parsing stdout. some libs requires to run as root under linux, this might be a limitation for you.
  2. send one ping at a time instead of long running background ping process. that way you can use subprocess.check_output().
  3. avoid using shell=True on popen(), passing unsanitized input to it may lead to command injection.