Why does opening a file in two different encodings work as expected?

Quoting from here,

The default encoding is platform-dependent, so this code might work on your computer (if your default encoding is utf-8), but then it will fail when you distribute it to someone else (whose default encoding is different, like CP-1252).

Code mentioned in the above quote:

fp = open('text.txt')  # Assuming the file exists
a_string = fp.read()   # Read through the handle opened above

I have created a file named text.txt (with random contents) in the current directory. Its encoding is "ANSI 1252" (checked using Notepad++). I have checked the default encoding of my system (Windows) using

import locale
print(locale.getpreferredencoding())

which gives the output

cp1252

The code to read the file (which I've provided just below the quote) works as expected. It also works when I pass an explicit encoding:

fp = open('text.txt', encoding='utf-8') # or `fp = open('text.txt', encoding='cp1252')`

How does the above code work with two different encodings? Shouldn't it raise a UnicodeDecodeError or something like that?

There are 2 answers

mike3996 (best answer)

Decoding only fails if the file contains bytes that are not valid in the chosen encoding. If your file is purely ASCII (every byte in the range 0x00 to 0x7F), UTF-8 and CP-1252 both decode it to exactly the same string, so neither call raises an error.
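A minimal sketch of that behaviour, working on in-memory bytes rather than a real file. The sample strings and variable names are made up for illustration; only `str.encode` and `bytes.decode` from the standard library are used.

```python
# Pure ASCII bytes decode to the same string under both encodings.
ascii_bytes = "hello world".encode("ascii")
same_result = ascii_bytes.decode("utf-8") == ascii_bytes.decode("cp1252")

# A non-ASCII character makes the difference visible.
# In CP-1252, 'é' is the single byte 0xE9, which is not valid UTF-8 on its own.
cp1252_bytes = "café".encode("cp1252")
try:
    cp1252_bytes.decode("utf-8")
    utf8_failed = False
except UnicodeDecodeError:
    utf8_failed = True

# The other direction does not raise here: the UTF-8 bytes for 'é' (0xC3 0xA9)
# are both valid CP-1252 bytes, so the result is silently wrong text.
mojibake = "café".encode("utf-8").decode("cp1252")

print(same_result, utf8_failed, repr(mojibake))
```

So a missing error does not prove the encoding was right; it only means every byte happened to be valid in the encoding that was used.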

cheesysam

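Looking here, the mappings are the same for the ASCII range (the first 128 characters).

And from what I understand, the Unicode standard was designed to be backwards compatible with ASCII. UTF-8 in particular encodes every ASCII character as the same single byte that ASCII uses.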
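One way to check the claim about the mappings, as a rough sketch: decode all 128 ASCII byte values with each codec and compare the results. The variable names are illustrative only.

```python
# Every byte value in the ASCII range, 0x00 through 0x7F.
all_ascii = bytes(range(128))

# Decode the same bytes with three codecs.
decoded = {enc: all_ascii.decode(enc) for enc in ("ascii", "utf-8", "cp1252")}

# If the mappings agree on this range, all three strings are equal.
identical = len(set(decoded.values())) == 1

# Each byte should also map to the code point with the same number.
matches_code_points = decoded["utf-8"] == "".join(chr(i) for i in range(128))

print(identical, matches_code_points)
```

The encodings only diverge for byte values of 0x80 and above, which is where a file saved as CP-1252 and read as UTF-8 can start to fail.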