Italian detected as iso-8859-2

I am using chardet to detect the encoding of text files, some of which are in Italian. The problem is that it consistently detects their encoding as iso-8859-2, while the correct result would be iso-8859-1. Does anybody know a fix? My system language is set to Polish. Could that influence the detection?

There is 1 answer
Niklas9 (best answer)

chardet doesn't support iso-8859-1, which is why it isn't detecting it. For the supported character encodings, see chardet's homepage: http://pypi.python.org/pypi/chardet.
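If you already know the files are Western European, one possible workaround is to post-process chardet's answer yourself. The sketch below is only an illustration: `choose_encoding` and `detect_file` are hypothetical helper names, not part of chardet, and the assumption that an iso-8859-2 guess really means iso-8859-1 holds only for your particular set of files. The only chardet call used is `chardet.detect`, which takes bytes and returns a dict with `encoding` and `confidence` keys.

```python
# Hypothetical helpers, not part of chardet itself.
def choose_encoding(detection, expected="iso-8859-1", lookalikes=("iso-8859-2",)):
    """Pick an encoding from a chardet-style result dict.

    Assumption: the caller knows the text is Western European, so a guess
    listed in `lookalikes` (or no guess at all) is replaced by `expected`.
    """
    guess = (detection.get("encoding") or "").lower()
    # chardet reports None as the encoding when it cannot decide.
    if not guess or guess in lookalikes:
        return expected
    return guess


def detect_file(path, expected="iso-8859-1"):
    """Run chardet on a file and apply the override above."""
    import chardet  # third-party package, imported lazily

    with open(path, "rb") as handle:
        raw = handle.read()
    return choose_encoding(chardet.detect(raw), expected)
```

Calling `detect_file("some_italian_file.txt")` (a made-up file name) would then give "iso-8859-1" whenever chardet guesses iso-8859-2, and chardet's own guess otherwise.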

I use the Linux program 'file' to get the character encoding of different content. However, I'm not sure how reliable it is; see my question "Encoding detection in Python, use the chardet library or not?". It has given me good results so far.
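For reference, this is roughly how 'file' can be called from Python. It is a minimal sketch, assuming a Unix-like system where the `file` command supports the `--brief` and `--mime-encoding` options; `file_encoding` is just an illustrative name.

```python
import shutil
import subprocess


def file_encoding(path):
    """Ask file(1) for the encoding of `path`.

    Returns None when the `file` command is not installed.
    """
    if shutil.which("file") is None:
        return None
    result = subprocess.run(
        ["file", "--brief", "--mime-encoding", path],
        capture_output=True,
        text=True,
        check=True,
    )
    # Output is a single label; strip the trailing newline.
    return result.stdout.strip()
```

The returned label is whatever `file` prints, so it may need normalising before being passed to `bytes.decode`.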

Btw, your system language should not influence the detection.
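The detection works only from the bytes in the file, and for accented Italian letters those bytes are valid in both encodings, so a detector has to guess between them. A small standard-library illustration (the sample string is my own, not taken from the question):

```python
# Accented vowels that are common in Italian text.
ITALIAN = "àèéìòù"

# The same bytes decode without error under both encodings.
raw = ITALIAN.encode("iso-8859-1")
as_latin1 = raw.decode("iso-8859-1")
as_latin2 = raw.decode("iso-8859-2")

print(as_latin1)  # the original Italian letters
print(as_latin2)  # mostly Central European letters instead
```

Neither decode raises an exception, which is why telling iso-8859-1 from iso-8859-2 comes down to statistics on the text and not to anything in your system settings.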