tee stdout and stderr independently, without adding jitter

173 views Asked by At

I'd like to create a command wrapper that records a log of stderr and stdout, separately, without changing the output of the command. One aspect of not changing the output is to ensure that the ordering of output on stderr vs stdout isn't changed.

I've created a synthetic worst-case scenario: (python -u puts the interpreter into fully-unbuffered mode)

$ cat outputter.py
from sys import stdout, stderr

for line in range(10):
    from itertools import izip, cycle
    for column, char, file in izip(
            range(79),
            cycle(('_', '|')),
            cycle((stdout, stderr)),
    ):
        file.write(char)
    stdout.write('\n')

$ python -u outputter.py
_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_
_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_
_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_
_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_
_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_
_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_
_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_
_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_
_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_
_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_

I can keep a record of the two streams and redirect them back to their original destinations fairly easily with a little bash, but the interleaving gets destroyed:

$ python -u outputter.py > >(tee stdout.txt) 2> >(tee stderr.txt >&2)
|||_____________||||||||||_||___||_|__|||___||_|_||__||___||_||___||_|_|__||__|
__||__||_|__||_||___||_|_|_|__||__||__||_|__||_||___||__||_|_|__||_||__|__||_|_
__||__||_|__||_||___||__||_|_|_|__||_||__||___||_|__||__||__||__||_|_|___|||_|_
__|_||__||_||___||__||__||_|__||_|__||__||_||___||_|_||___||_|__||_|__||_|__||_
__|__|||_||___||_|__||_|_|__||__||_|__||_|_|___||||___||_|__||__||_||___|||___|
_|___||||___|||___||_|__|||__|_||___|||__|_||___|||___||_|__|||__||__|__||__||_
_|_||__|__|||__|_||___|||___|||___|||__|_|||____|||___||||___|_|||____|||___||_
_|__|||__|__|||__|__|||__|__|||___|||___||_|__|||__|__||_|__||_|_||___||_|__||_
_||___|||___||||___|_||___||_|__|||__||__||__|_||__||__|_|||___||___||_||__||__
_|__||_||___|||___|||__|_||__|||___||___|||___|||___||_|__||__|||___|||__|_||__

The results are actually highly variable:

$ python -u outputter.py > >(tee stdout.txt) 2> >(tee stderr.txt >&2)
_______________|||||||||||||||||||||||||||||||||_______________________||||||__
__||||____|||____|||||_____||||____|||||____|||||______||||____||||_____|||||__
_||||||||__________|||||_____||||____|||||____|||___||||_____|||||_____|||||___
_|||_____||||____||||||______|||||_____||||||_____||||_____|||||_____|||||____
|_______|||||||||____||||_____||||____|||||______|||||____|||||_____|||||____||_
||_____||||____|||||_____|||||_____||||____||||____||||___||||____||||____|||__
___|||||____|||||_____||||____|||____|||||_____|||||_____|||||_____||||____|||_
__||||_____||||____|||||_____|||||____||||_____|||||_____||||____||||____||||__
__|||____||||____||||____|||||____|||||_____||||_____||||||_____||||_____||||__
_||____|||||_____||||____|||||____||||____||||_____|||||_____||||____|||||____
|

$ python -u outputter.py > >(tee stdout.txt) 2> >(tee stderr.txt >&2)
_|||||||_______|_|||____|||__|_|__|||__|_||___|||___||_|___||||___|||__||___||_
________________________________________
________________________________________
________________________________________
________________________________________
________________________________________
________________________________________
________________________________________
________________________________________
________________________________________
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

My question is: is there any conceivable way to do this and also preserve the interleaving of the original streams?

Also, why doesn't the terminal have this same problem? Any implementation I can imagine would have this same issue.

1

There are 1 answers

7
lcd047 On

No, you can't do that. And it isn't because of buffering, this will produce the same random result:

python -u outputter.py > >(cat -u >&2) 2> >(cat -u >&2)

The problem is your shell needs to write to two pipes, and it uses select(2) to choose which pipe to write to. Since there is no danger of filling any pipe buffer, both pipes are "equally available" for writing, and this creates a race condition between the two cats: the one that finishes first writing a character wins. The result of the race depends on the scheduler and on what's going on at that very moment in the system. Thus the randomness.

On a side note: if the problem were related to buffering you'd have gotten the result garbled the same way every time.