When piping a file in Windows to a Python script, my \r characters are deleting my characters


I have a file like this:

A\r
B\n
C\r\n.

(By \r I'm referring to CR, and \n is LF)
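To reproduce this, the file has to contain those exact bytes, which most editors will not preserve. Here is a minimal sketch that writes them in binary mode. It assumes the trailing "." above is just punctuation and not part of the file, and it writes to a temporary directory (the helper name write_sample is only illustrative):

```python
import os
import tempfile

# Bytes assumed from the description above: A<CR> B<LF> C<CR><LF>
SAMPLE = b"A\rB\nC\r\n"


def write_sample(path):
    # Binary mode, so Python does not translate any of the line endings.
    with open(path, "wb") as handle:
        handle.write(SAMPLE)


path = os.path.join(tempfile.mkdtemp(), "myfile.txt")
write_sample(path)

# Read the file back in binary mode to confirm the bytes on disk.
with open(path, "rb") as handle:
    print(repr(handle.read()))
```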

And this script:

import fileinput
for line in fileinput.input(mode='rU'):
    print(line)

When I call python script.py myfile.txt I get the correct output:

A
B
C

But when I call it like this: type myfile.txt|python script.py, I get this:

B
C

You see? No more "A".

What is happening? I thought the mode='rU' would take care of every newline problem...

EDIT: In Python 3 there is no such problem! Only in Python 2. But that does not solve the problem.

Thanks

EDIT:

Just for the sake of completeness: it also happens on Linux.

  • Python 3 handles every newline type (\n, \r or \r\n) transparently to the user. It doesn't matter which one your file uses; you don't have to worry about it.
  • Python 2 needs the parameter mode='rU' passed to fileinput.input to handle every newline type transparently. The thing is, in Python 2 this does not work correctly when content is piped to the script. I tried piping a file like this:

    CR: \r
    LF: \n
    CRLF: \r\n
    

Python 2 treats the first two lines as a single line. If you print every line with this code:

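from __future__ import print_function  # print(..., end='') needs this in Python 2
import fileinput
for i, line in enumerate(fileinput.input(mode='rU')):
    print("Line {}: {}".format(i, line), end='')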

It outputs this:

Line 0: CR:
LF:
Line 1: CRLF:

This doesn't happen in Python 3: there, these are two separate lines. In Python 2 it also works fine when the text is passed as a file name instead of being piped.
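For comparison, this is roughly what universal newline handling looks like in Python 3. It is only a sketch: an in-memory byte stream stands in for the pipe, and the sample bytes are my reading of the three-line file above. Passing newline=None to io.TextIOWrapper turns on universal newlines, so \r, \n and \r\n all end a line and are all translated to \n:

```python
import io

# Assumed bytes of the sample: "CR: " + CR, "LF: " + LF, "CRLF: " + CRLF
raw = b"CR: \rLF: \nCRLF: \r\n"

# newline=None enables universal newlines on the text layer.
stream = io.TextIOWrapper(io.BytesIO(raw), encoding="ascii", newline=None)
lines = list(stream)

for i, line in enumerate(lines):
    print("Line {}: {!r}".format(i, line))
```

This only illustrates the Python 3 side; it does not reproduce the Python 2 behaviour.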

Piping data like this:

LF: \n    
CR: \r
CRLF: \r\n

Gives me a similar result:

Line 0: LF: 
Line 1: CR:
CRLF:

My conclusion is the following:

For some reason, when data is piped in, Python 2 seems to look for the first newline symbol it encounters and from then on considers only that specific character to be a newline. In this example Python 2 encounters \r as the first newline character, and all the others (\n or \r\n) are treated as ordinary characters.
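One possible workaround, which I have not verified against the piping case itself, is to skip mode='rU' and normalize the line endings by hand on the raw bytes. The sketch below shows only the normalization step; normalize_newlines and iter_lines are names I made up for illustration, and the code should behave the same in Python 2 and 3:

```python
def normalize_newlines(raw):
    """Map CRLF and lone CR to LF in a bytes object."""
    # Replace CRLF first so its CR is not converted a second time.
    return raw.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


def iter_lines(raw):
    """Yield each line of raw, with every line ending rewritten as LF."""
    for line in normalize_newlines(raw).splitlines(True):
        yield line


sample = b"A\rB\nC\r\n"
for number, line in enumerate(iter_lines(sample)):
    print("Line {}: {!r}".format(number, line))
```

In a real script the bytes would come from standard input instead of sample: sys.stdin.buffer.read() in Python 3, or sys.stdin.read() in Python 2. On Windows, Python 2 opens stdin in text mode, so I assume it would first have to be switched to binary with msvcrt.setmode(sys.stdin.fileno(), os.O_BINARY). This reads the whole input into memory, so it is only suitable for small inputs.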
