"I read files from different sources using pandas. Currently, I'm facing an issue with a Portuguese word 'Assembléia.' When I read this word with the 'utf-8' encoding and keep it in the DataFrame, everything works well. However, when I export it to a CSV, the word changes to 'assembléia.' What should I do? I tried changing the encoding to 'latin1,' and it worked fine. But now, when I try to encode another file with 'latin1' as well, the code throws a UnicodeEncodeError."
This is an example with latin1:
import pandas as pd

data = {'Palavra': ['assembléia']}
df = pd.DataFrame(data)
nome_arquivo_csv = r'C:\Users\user\OneDrive\Documents\cv - general\palavra_assembleia.csv'
# Write the DataFrame to CSV without the index column
df.to_csv(nome_arquivo_csv, index=False)
In this example, the CSV file displays the word 'assembléia' instead of the expected 'Assembléia.' The problem is likely
Is there a way to standardize the encoding across all files?
Try:
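One option is to always write the CSV as UTF-8 with a byte order mark (`encoding="utf-8-sig"`). This is only a sketch under an assumption: that the garbled word appears when the CSV is opened in a program such as Excel, which may guess a legacy encoding unless the file starts with a BOM. The temporary output path and the extra "€" row are hypothetical, added only to reproduce the `UnicodeEncodeError` that latin1 raises for characters outside its range.

```python
import os
import tempfile

import pandas as pd

# Sample data; "€" is included because it cannot be encoded as latin1.
df = pd.DataFrame({"Palavra": ["Assembléia", "€"]})

# Hypothetical output path, used only for this sketch.
csv_path = os.path.join(tempfile.mkdtemp(), "palavra_assembleia.csv")

# Writing with latin1 fails as soon as a character falls outside that charset.
try:
    df.to_csv(csv_path, index=False, encoding="latin1")
    latin1_failed = False
except UnicodeEncodeError:
    latin1_failed = True

# "utf-8-sig" is UTF-8 with a byte order mark (BOM) at the start of the file.
# UTF-8 can represent every character, so this write does not raise.
df.to_csv(csv_path, index=False, encoding="utf-8-sig")

# Reading back with the same encoding round-trips the text unchanged.
df_back = pd.read_csv(csv_path, encoding="utf-8-sig")
print(df_back["Palavra"].tolist())
```

Since UTF-8 covers all of Unicode, using it for every output file avoids having to pick a different encoding per file. The input files may still need their own `encoding=` argument in `pd.read_csv`, depending on how each source was written.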