I have a .txt file that contains a single line of text, for example: command1;\ncommand2, output;\ncommand3\ncommand4, output;\n (but much longer). Since it is hard to read, I want to convert the file to a more readable version: remove every ';' and replace each '\n' with a real line break.

I have a few working solutions for this problem. For example, I could remove all '\n' and use the print function, or replace \\n with \n:

def clean_file(file):
    # read file
    with open(file) as f:
        content = f.readline()
    # get rid of ';' and '\n'
    content = content.split(';')
    for ind, val in enumerate(content):
        content[ind] = val.replace('\\n', '\n')  # it can be also replace(r'\n', '\n')
    # write to file
    with open(file, 'w') as f:
        for line in content:
            f.write(line)

OUT:
command1
command2, output
command3
command4, output

In this scenario it works properly. But I have no idea why it does not work when I remove the replace part:

def clean_file(file):
    # read file
    with open(file) as f:
        content = f.readline()
    # get rid of ';'
    content = content.split(';')
    # write to file
    with open(file, 'w') as f:
        for line in content:
            f.write(line)

OUT:
command1\ncommand2, output\ncommand3\ncommand4, output\n

This puts everything on one line.

Can someone explain to me why I have to replace '\n' with the same value? The file was created on Windows, and I open it there, but I run the script on Linux.

2 Answers

0
Hoog On

You are not replacing the same value: you are replacing the two characters \ and n with a single newline character. In a string literal a backslash usually introduces a special character (such as newline \n, tab \t, etc.), BUT sometimes you want an actual backslash! To get one in Python, write \\, which stands for a single backslash.

So, in your first example, replace looks for a backslash followed by n (written '\\n') and substitutes a real newline (written '\n'). In your second example nothing is replaced, so the backslash and the n that were read from the file are written back unchanged, as two ordinary characters.
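
A minimal sketch of that difference, assuming the file really holds a backslash followed by the letter n. The variable names and the shortened sample content are illustrative only:

```python
# Two characters: a backslash followed by the letter n.
literal = '\\n'
# One character: an actual line feed.
newline = '\n'

print(len(literal))       # 2
print(len(newline))       # 1
print(literal == r'\n')   # True: a raw string spells the same two characters

# Text read from a file is not escape-processed, so the backslash + n
# stays as two characters until it is replaced explicitly.
content = 'command1;\\ncommand2, output;\\n'  # stand-in for the file content
cleaned = content.replace(';', '').replace('\\n', '\n')
print(cleaned)
```

Here `cleaned` contains two real line breaks, which is why the version with replace produces separate lines.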

1
Serge Ballesta On

Many editors in the Windows world (historically including Notepad) require \r\n to display an end of line correctly and ignore a lone \n. On Linux, on the other hand, a single \n is enough for an end of line. If you run a Python script on Windows and the file is opened in text mode, Python is smart enough to replace every '\n' with \r\n at write time and, symmetrically, to replace \r\n read from a file with a single \n. On Linux, '\n' is written unchanged (although Python 3 still converts \r\n to \n when reading in text mode with the default settings).

Long story short, text files have different line endings on Linux and Windows, and text files that use \r\n are known as DOS text files on Linux.
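
The translation can also be requested explicitly through the `newline` argument of `open()`, which makes it easy to see on any platform. This is only a sketch, and the scratch file path is hypothetical:

```python
import os
import tempfile

# Hypothetical scratch file, used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), 'demo.txt')

# newline='\r\n' turns every '\n' written into '\r\n', whatever the platform.
with open(path, 'w', newline='\r\n') as f:
    f.write('command1\ncommand2, output\n')

# Binary mode shows the bytes exactly as they are stored on disk.
with open(path, 'rb') as f:
    raw = f.read()

# Default text mode (newline=None) translates '\r\n' back to '\n' on reading.
with open(path) as f:
    text = f.read()

print(raw)
print(repr(text))
```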

You have probably been caught by that, and the only way to be sure is to open the file in binary mode and display the byte values (in hex, which is more readable for people used to ASCII codes).
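
One possible way to do that check is sketched below. The helper name `dump_bytes` and the sample file are assumptions made for the example, not part of the original script:

```python
import os
import tempfile


def dump_bytes(path, limit=64):
    """Return the first `limit` bytes of a file as space-separated hex pairs."""
    with open(path, 'rb') as f:
        data = f.read(limit)
    return ' '.join(f'{b:02x}' for b in data)


# Hypothetical sample file holding a literal backslash followed by n.
path = os.path.join(tempfile.mkdtemp(), 'sample.txt')
with open(path, 'wb') as f:
    f.write(b'a;\\nb')

hex_dump = dump_bytes(path)
print(hex_dump)  # 61 3b 5c 6e 62
```

In the dump, `5c 6e` is a literal backslash followed by n, `0a` is a line feed, and `0d 0a` is a Windows-style \r\n.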