I have a CSV file that is read by a Python script, and the script throws an error whenever the file contains Serbian Latin letters. The decoder cannot map these letters to anything. Is there a way to tell it how these letters should be decoded, or to change the decoding somehow, without going through every string in the file?
This is the error:
UnicodeDecodeError: 'charmap' codec can't decode byte 0x8d in position 1567: character maps to <undefined>
The only way I can see to do it is to replace all of the characters with their English Latin equivalents, but that is very time-consuming.
Your error message indicates that you are using a charmap codec. The Python docs have this to say about them:
(Emphasis added.)
It is not particularly surprising, then, that you discovered some characters your codec cannot handle.
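To illustrate how such a failure can arise, here is a minimal sketch. It assumes the charmap codec in question is cp1252 and that the file is UTF-8; neither is confirmed by the error message, so treat both as guesses.

```python
# Assumption: the file is UTF-8 and the charmap codec is cp1252.
# Neither is confirmed by the question; this only shows the mechanism.
raw = "č".encode("utf-8")  # two bytes: 0xC4 0x8D

try:
    raw.decode("cp1252")  # cp1252 has no character assigned to byte 0x8D
    message = None
except UnicodeDecodeError as exc:
    message = str(exc)

print(message)  # a 'charmap' error naming byte 0x8d, as in the question

# Decoding with the encoding the bytes were written in works fine.
decoded = raw.decode("utf-8")
print(decoded)
```

The bytes themselves are fine; only the codec used to interpret them is mismatched.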
It's not clear how your file is encoded. UTF-8 appears likely from the error message, because the offending byte 0x8d is the second byte of the UTF-8 encoding of "č" (0xC4 0x8D), but there are several other possibilities. It's also not clear how your script ends up choosing a charmap codec for decoding the file, or which one it chooses. Whatever part of your code chooses the codec needs to select one appropriate for the file's actual encoding instead.
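If the file does turn out to be UTF-8, the usual fix is to pass the encoding explicitly when opening it, rather than relying on the platform default. The sketch below assumes the script uses the standard `csv` module; `read_rows` and the sample data are hypothetical names made up for the illustration.

```python
import csv
import os
import tempfile


def read_rows(path, encoding="utf-8"):
    """Read all rows of a CSV file, decoding it with an explicit encoding."""
    # newline="" is what the csv module recommends for files passed to it.
    with open(path, newline="", encoding=encoding) as f:
        return list(csv.reader(f))


# Demo only: write a small UTF-8 file with Serbian Latin letters, then read it.
with tempfile.NamedTemporaryFile(
    "w", suffix=".csv", delete=False, encoding="utf-8", newline=""
) as tmp:
    csv.writer(tmp).writerows(
        [["ime", "grad"], ["Đorđe", "Čačak"], ["Željko", "Niš"]]
    )
    path = tmp.name

rows = read_rows(path)
os.remove(path)
print(rows)
```

If the file was exported with a byte order mark, `encoding="utf-8-sig"` may be the better choice. If the file is not UTF-8 at all, pass whatever encoding it was actually written in.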
Alternatively, it may be that the script is specific to a file format that does not support the characters you're asking about. If that's the case then the error is not in the script but in the data.