I get an error in a production system that I cannot reproduce in a development environment:
with io.open(file_name, 'wt') as fd:
fd.write(data)
Exception:
File "/home/.../foo.py", line 18, in foo
fd.write(data)
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe4' in position 6400: ordinal not in range(128)
I already tried to put a lot of strange characters into the variable data, but up to now I was not able to reproduce a UnicodeEncodeError.

What needs to be in data to get a UnicodeEncodeError?
Update
python -c 'import locale; print locale.getpreferredencoding()'
UTF-8
Update2
If I call locale.getpreferredencoding() via shell and via web request, the encoding is "UTF-8".

I updated my exception handling in my code and have been logging getpreferredencoding() for some days (see the sketch below). Now the error happened again (up to now I am not able to force or reproduce it), and the encoding is "ANSI_X3.4-1968"!

I have no clue where this encoding gets set ...

This puts my problem into a different direction and leaves this question useless. My problem is now: where does the preferred encoding get altered? But that is not part of this question.
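A simplified sketch of that exception handling (the function name foo and the variables file_name and data follow the traceback above; the logging call itself is only an illustration, not my exact production code):

import io
import locale
import logging

def foo(file_name, data):
    try:
        with io.open(file_name, 'wt') as fd:
            fd.write(data)
    except UnicodeEncodeError:
        # Log which encoding this process actually prefers before re-raising,
        # so the production log shows whether it was UTF-8 or ASCII at the time.
        logging.exception('write failed, preferred encoding: %s',
                          locale.getpreferredencoding())
        raise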
A big thank you to all who helped!
You are relying on the default encoding for the platform; when that default encoding can't support the Unicode characters you are writing to the file, you get an encoding exception.
From the io.open() documentation: the default encoding is platform dependent, namely whatever locale.getpreferredencoding() returns.

For your specific situation, the default returned by locale.getpreferredencoding() is ASCII, so any Unicode character outside the ASCII range (U+0080 and up) would cause this issue.

Note that the locale is taken from your environment; if it is ASCII, that typically means the locale is set to the POSIX default locale, C.
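That also gives you a way to reproduce the error on a development machine: force the POSIX locale for a single run. A rough sketch, assuming a GNU/Linux shell, the same Python 2 interpreter as in the question, and a hypothetical writable path /tmp/out.txt:

LC_ALL=C python -c "import io; io.open('/tmp/out.txt', 'wt').write(u'\xe4')"

With LC_ALL=C, locale.getpreferredencoding() reports ANSI_X3.4-1968 (plain ASCII), so writing u'\xe4' fails with the same UnicodeEncodeError as in your traceback.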
Specify the encoding explicitly:
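For example, reusing the names from the snippet in the question:

import io

# Pass encoding= so the file encoding no longer depends on the locale.
with io.open(file_name, 'wt', encoding='utf-8') as fd:
    fd.write(data)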
I used UTF-8 as an example; what you pick depends entirely on your use cases and the data you are trying to write out.