I was trying to understand FIFOs using Python under Linux, and I found some strange behavior I don't understand.
The following is fifoserver.py:

    import sys
    import time

    def readline(f):
        s = f.readline()
        while s == "":
            time.sleep(0.0001)
            s = f.readline()
        return s

    while True:
        f = open(sys.argv[1], "r")
        x = float(readline(f))
        g = open(sys.argv[2], "w")
        g.write(str(x**2) + "\n")
        g.close()
        f.close()
        sys.stdout.write("Processed " + repr(x) + "\n")

and this is fifoclient.py:

    import sys
    import time

    def readline(f):
        s = f.readline()
        while s == "":
            time.sleep(0.0001)
            s = f.readline()
        return s

    def req(x):
        f = open("input", "w")
        f.write(str(x) + "\n")
        f.flush()
        g = open("output", "r")
        result = float(readline(g))
        g.close()
        f.close()
        return result

    for i in range(100000):
        sys.stdout.write("%i, %s\n" % (i, i*i == req(i)))

I also created two FIFOs using mkfifo input and mkfifo output.
What I don't understand is why, when I run the server (with python fifoserver.py input output) and the client (with python fifoclient.py) from two consoles, the client crashes after some requests with a "broken pipe" error on f.flush(). Note that before the crash I've seen anywhere from a few hundred to several thousand requests processed correctly.
What is the problem in my code?
As other comments have alluded to, you have a race condition.
I suspect that in the failing case, the server gets suspended after it has written the result to g and closed it, but before it reaches f.close().
The client is then able to read the result, print it to the screen, and loop back. It then reopens f, which succeeds because it's still open on the server side, and writes the message. Meanwhile, the server has managed to close f. Next, the flush on the client side executes a write() syscall on the pipe, which triggers SIGPIPE (reported by Python as the "broken pipe" error) because the pipe is now closed on the other side.
If I'm correct, you should be able to fix it by moving the server's f.close() to be above the g.write(...) call.
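
The reordering suggested above could look roughly like the sketch below. It is only an illustration under the same assumptions as the answer, not a verified fix: the names serve_once and serve_forever are hypothetical helpers introduced here, and the readline function is the polling helper from the question. The idea is that the request FIFO is closed before the reply is written, so the client cannot see a result while the server still holds the old read end open.

```python
import time


def readline(f):
    # Polling helper from the question: retry until a line arrives.
    s = f.readline()
    while s == "":
        time.sleep(0.0001)
        s = f.readline()
    return s


def serve_once(in_path, out_path):
    # Hypothetical helper (not in the original post): handle one request.
    # Read the request and close the request FIFO straight away ...
    with open(in_path, "r") as f:
        x = float(readline(f))
    # ... so f is already closed before the client can receive the reply.
    with open(out_path, "w") as g:
        g.write(str(x ** 2) + "\n")
    return x


def serve_forever(in_path, out_path):
    # Hypothetical wrapper reproducing the server's original loop.
    while True:
        x = serve_once(in_path, out_path)
        print("Processed " + repr(x))
```

With the client from the question unchanged, the server script would end by calling serve_forever(sys.argv[1], sys.argv[2]) after importing sys. With this ordering, the client's next open("input", "w") should block until the server has reopened the FIFO for reading, rather than attaching to a read end that is about to be closed.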