Pipe Feeding Anomaly

I have a gzipped file that I've split into 3 separate files: xaa, xab, xac. I make a fifo

mkfifo p1

and reassemble the files by reading from it, also calculating a checksum and unzipping the file in a pipe:

cat p1 p1 p1 | tee >(sha1sum > sha1sum_new.txt) | gunzip > output_file.txt

This works just fine if I feed the pipe from another terminal with

cat xaa > p1
cat xab > p1
cat xac > p1

but if I feed the pipe with a single line,

cat xaa > p1; cat xab > p1; cat xac > p1

the receiving pipeline hangs and no checksum is produced. An output file is produced, but it is truncated, though by an amount smaller than the final file size.

Why is the behavior in the second case different from the first?

There are 2 answers

Filipe Gonçalves (best answer)

Interesting question. As the other answer mentions, you have a race condition - and I am pretty sure of that. In fact, you have a race condition in both cases, but in the former you're just lucky it doesn't happen because maybe your files are small and can be read before you enter the next command line. Allow me to explain.

So, a little bit of background first:

  1. cat opens each file you give it as an argument in turn, prints it to the output, closes it, and moves on to the next file. Whether cat opens each file sequentially or opens them all first and then writes them one by one may vary, but it is not relevant to the discussion. In both cases, you'll have a race condition.
  2. The open(2) syscall will block on a FIFO / pipe until the other end is opened. So for example, if process pid1 opens the FIFO for reading, open(2) will block until, say, pid2 opens the FIFO for writing. In other words, opening a FIFO that has no active readers or writers implicitly synchronizes both processes and guarantees that a process will not read from a pipe that has no writer yet, or that a writer will not write to a pipe that has no reader yet. But as we will see, this will be problematic.
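The blocking behavior of open(2) on a FIFO can be observed directly. The following is a minimal sketch, not part of the original setup: the temporary directory, the marker file writer_finished, and the one-second pause are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch: opening a FIFO for writing blocks until a reader opens it.
dir=$(mktemp -d)
mkfifo "$dir/p1"

# Background writer: the redirection has to open the FIFO first, and that
# open blocks because no reader exists yet. The marker file is only
# created after the write has completed.
( echo hello > "$dir/p1"; touch "$dir/writer_finished" ) &

sleep 1   # ample time for the writer to finish, if it were able to
if [ -e "$dir/writer_finished" ]; then
    state_before=finished
else
    state_before=blocked
fi

# Opening the read end releases the writer; cat then reads until EOF.
received=$(cat "$dir/p1")
wait

echo "writer before the reader opened the FIFO: $state_before"
echo "reader received: $received"
rm -rf "$dir"
```

The writer should still be reported as blocked after the pause, and it only completes once cat opens the FIFO for reading.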

What's really happening

When you do this:

cat xaa > p1
cat xab > p1
cat xac > p1

Things are really slow, because humans are slow. After you enter the first line, p1 is opened for writing (strictly speaking, the shell does this when it sets up the redirection for cat). The reader cat, the one running cat p1 p1 p1, is blocked on opening it for reading (or maybe not yet, but let's assume it is). Once both sides have opened p1, one for writing and the other for reading, data starts to flow.

And then, before you even have the chance to enter the next command line (cat xab > p1), the whole file flows through the pipe and everyone is happy: the cat writer finishes writing the file and closes p1, and the cat reader sees end of file on the pipe and calls close(2). The cat reader moves on to the next file (which is p1 again), opens it, and blocks because no active writer has opened the FIFO yet.

Then, you, slow human, enter the next command line, which causes another cat writer process to open the FIFO, which unblocks the other cat that is waiting to open for reading, and everything happens again. And then again for the third command line.
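The slow, one-command-at-a-time case can be imitated in a script by pausing between writers. This is only a sketch: the sample file contents, the temporary directory, and the sleep 1 calls (standing in for the time it takes to type the next command) are assumptions, and it still relies on timing rather than fixing anything.

```shell
#!/bin/sh
# Sketch: three writers separated by pauses, mimicking commands typed by hand.
dir=$(mktemp -d)
printf 'first\n'  > "$dir/xaa"
printf 'second\n' > "$dir/xab"
printf 'third\n'  > "$dir/xac"
mkfifo "$dir/p1"

# Reader: opens p1 three times in a row, as in the question.
cat "$dir/p1" "$dir/p1" "$dir/p1" > "$dir/output_file.txt" &

# Each pause gives the reader time to see EOF, close the FIFO, and block
# in the next open before the following writer appears.
cat "$dir/xaa" > "$dir/p1"
sleep 1
cat "$dir/xab" > "$dir/p1"
sleep 1
cat "$dir/xac" > "$dir/p1"
wait

result=$(cat "$dir/output_file.txt")
expected=$(cat "$dir/xaa" "$dir/xab" "$dir/xac")
if [ "$result" = "$expected" ]; then echo "reassembled correctly"; fi
rm -rf "$dir"
```

With the pauses in place, each writer meets a reader that is already waiting in open(2), which is what happens when the commands are typed one at a time.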

When you put everything in one line in the shell, things happen way too fast.

Let's differentiate the 3 cat invocations. Call them cat1, cat2 and cat3:

cat1 xaa > p1; cat2 xab > p1; cat3 xac > p1

The shell executes each command sequentially, waiting for the previous command to finish before moving to the next one.

However, it might just be the case that cat1 finishes writing everything to p1 and exits, and the shell moves on to cat2, which opens the FIFO and starts writing the contents of xab into it before the cat reader has had the chance to finish reading what cat1 wrote. The FIFO then never appears to be without a writer, so the cat reader never sees end of file. It "thinks" it is still reading from the first file (the first p1), and at some point it starts reading the data that cat2 is pushing into the pipe as if it belonged to the first p1. It has no way of knowing that the first "copy" of the data is over if cat2 is faster and opens the FIFO before the cat reader finishes reading what cat1 wrote.

Yes, subtle, but it's exactly what is happening.

Then, of course, input eventually comes to an end. The cat reader thinks that the first p1 is done and moves on to the next p1, opening it and waiting for the next writer to open it. But there will never be a next writer! It blocks forever, and the whole pipeline is stalled forever.

How to fix it

The solution in the other answer solves the problem. You mentioned in the comments that it might not be enough for you because you don't control when and how a new writer opens and uses the pipe.

So I suggest this instead:

  1. Create a persistent writer process that just keeps the FIFO open for writing, even though it will never actually write. Its only job is to make sure that there is no window of time in which no writers are active while the reader attempts to read. To do this, just cat standard input to p1 in the background: cat >p1 &. When you're done, kill the background job.
  2. Open the pipe only once in the reader process. This can be done either with cat p1 | tee >(sha1sum ...) or using the method proposed in the other answer (tee >(...) <p1). After all, opening a FIFO once should be enough no matter how complex your system is; FIFOs by nature always give you the data in a first in first out fashion.

Keep the background cat writer running as long as you know that there's a chance of new files arriving / new writers opening the FIFO and using it. Don't forget to terminate the background job when you know that input is over.
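A minimal sketch of the same keep-alive idea in a script follows. Note that it is a variation, not the exact command above: instead of a background cat >p1, it holds the write end open on a shell file descriptor (fd 3, chosen arbitrarily), which avoids depending on what standard input is attached to. The sample file contents and temporary directory are assumptions for illustration.

```shell
#!/bin/sh
# Sketch: a keep-alive writer so the reader never sees a premature EOF.
dir=$(mktemp -d)
printf 'part one\n'   > "$dir/xaa"
printf 'part two\n'   > "$dir/xab"
printf 'part three\n' > "$dir/xac"
mkfifo "$dir/p1"

# Reader opens the FIFO exactly once.
cat "$dir/p1" > "$dir/output_file.txt" &
reader=$!

# Keep-alive: hold the write end open on fd 3 without writing anything.
# This open blocks until the reader above has opened the FIFO.
exec 3> "$dir/p1"

# Writers can now come and go as fast as they like; the FIFO always has
# at least one writer, so the reader keeps waiting for more data.
cat "$dir/xaa" > "$dir/p1"; cat "$dir/xab" > "$dir/p1"; cat "$dir/xac" > "$dir/p1"

# Input is over: close the keep-alive descriptor so the reader gets EOF.
exec 3>&-
wait "$reader"

result=$(cat "$dir/output_file.txt")
expected=$(cat "$dir/xaa" "$dir/xab" "$dir/xac")
if [ "$result" = "$expected" ]; then echo "reassembled correctly"; fi
rm -rf "$dir"
```

Closing fd 3 plays the role of killing the background job: it is the signal that no more writers will arrive.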

chepner

I'm not positive, but I think there is a race condition involved. Consider using this as a simpler alternative:

tee >(sha1sum > sha1sum_new.txt) < p1 | gunzip > output_file.txt

and feed p1 with a single command

cat xaa xab xac > p1

This way, you open p1 for writing exactly once, and open it for reading exactly once.
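An end-to-end check of this open-once approach might look like the sketch below. Several things in it are assumptions: the test data is generated on the spot, the split size is computed so that exactly three pieces come out, and a second FIFO (hypothetically named sum_pipe) replaces the >(...) process substitution so that the script also runs under a plain POSIX sh. It assumes gzip, split and sha1sum are installed.

```shell
#!/bin/sh
# Sketch: one writer, one reader, checksum and gunzip in a single pass.
dir=$(mktemp -d)
cd "$dir" || exit 1

# Build a gzipped file and split it into three pieces: xaa, xab, xac.
awk 'BEGIN { for (i = 1; i <= 20000; i++) print "line", i }' > original.txt
gzip -c original.txt > whole.gz
size=$(wc -c < whole.gz)
split -b "$(( (size + 2) / 3 ))" whole.gz x

mkfifo p1 sum_pipe

# Checksum branch: reads the copy that tee writes into sum_pipe.
sha1sum < sum_pipe > sha1sum_new.txt &
# Main branch: p1 is opened for reading exactly once.
tee sum_pipe < p1 | gunzip > output_file.txt &

# p1 is opened for writing exactly once.
cat xaa xab xac > p1
wait

new_sum=$(cat sha1sum_new.txt)
expected_sum=$(sha1sum < whole.gz)
if cmp -s original.txt output_file.txt; then content=identical; else content=different; fi
pieces=$(ls x?? | wc -l)
echo "checksum matches: $( [ "$new_sum" = "$expected_sum" ] && echo yes || echo no )"
echo "decompressed output: $content"
cd / && rm -rf "$dir"
```

Because there is a single open on each end, there is no point at which the reader can mistake the gap between two writers for the end of the data.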