How do I standardize encoding across different sources with pd.read_csv and df.to_csv?

Question

Asked by user202135 on 23 July 2023 at 13:02 · 34 views

I read files from different sources using pandas. Currently, I'm facing an issue with the Portuguese word 'Assembléia'. When I read this word with the 'utf-8' encoding and keep it in the DataFrame, everything works well. However, when I export it to a CSV, the word changes to 'assemblÃ©ia'. What should I do? I tried changing the encoding to 'latin1', and it worked fine. But now, when I try to encode another file with 'latin1' as well, the code throws a UnicodeEncodeError.

This is an example with latin1:


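import pandas as pd

# Single-row DataFrame containing the accented word
data = {'Palavra': ['assembléia']}
df = pd.DataFrame(data)

# Export the DataFrame to CSV without the index column
nome_arquivo_csv = r'C:\Users\user\OneDrive\Documents\cv - general\palavra_assembleia.csv'
df.to_csv(nome_arquivo_csv, index=False)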

In this example, the CSV file displays the word 'assemblÃ©ia' instead of the expected 'assembléia'. The problem is likely

Is there a way to standardize all encoding for files?
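
For reference, both symptoms described above can be reproduced without pandas. The sketch below assumes the garbled text comes from UTF-8 bytes being interpreted as latin1, which is a common cause but not confirmed for this particular file. The sample string containing "€" is made up for illustration.

```python
word = "assembléia"

# UTF-8 stores "é" as two bytes (0xC3 0xA9).
utf8_bytes = word.encode("utf-8")

# Decoding those bytes as latin1 maps each byte to its own character,
# which reproduces the garbled text from the question.
garbled = utf8_bytes.decode("latin1")
print(garbled)  # assemblÃ©ia

# latin1 covers only 256 characters, so text outside that range
# cannot be encoded and raises UnicodeEncodeError.
try:
    "preço de 10 €".encode("latin1")
    raised = False
except UnicodeEncodeError:
    raised = True
print(raised)  # True
```

This suggests the file on disk may already be valid UTF-8, with the program used to open it decoding it under a different encoding.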

There is 1 answer

**gtomer** · Answer 1 · 2023-07-23T13:06:37+00:00

Try:

df.to_csv(nome_arquivo_csv, index=False, encoding='utf-8')
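
One possible way to standardize is to decode each source with its own known encoding on read and always write the same encoding. The sketch below is an illustration, not a definitive recipe: `normalize_csv` is a hypothetical helper, the file names are placeholders in a temporary directory, and the source encoding is assumed to be known in advance. It writes `utf-8-sig`, which is UTF-8 with a byte-order mark; this may help if the garbled text only appears when the CSV is opened in a spreadsheet program such as Excel, which tends to assume a legacy code page when no mark is present.

```python
import os
import tempfile

import pandas as pd


def normalize_csv(src_path, src_encoding, dst_path):
    """Read a CSV in its source encoding and rewrite it as UTF-8 (hypothetical helper)."""
    df = pd.read_csv(src_path, encoding=src_encoding)
    # "utf-8-sig" prepends a byte-order mark so viewers can detect UTF-8.
    df.to_csv(dst_path, index=False, encoding="utf-8-sig")
    return df


tmp_dir = tempfile.mkdtemp()
latin1_path = os.path.join(tmp_dir, "source_latin1.csv")
utf8_path = os.path.join(tmp_dir, "normalized.csv")

# Simulate a source file that arrives encoded as latin1.
pd.DataFrame({"Palavra": ["assembléia"]}).to_csv(
    latin1_path, index=False, encoding="latin1"
)

normalize_csv(latin1_path, "latin1", utf8_path)

# Reading the normalized file back with the matching encoding
# returns the original text.
result = pd.read_csv(utf8_path, encoding="utf-8-sig")
print(result["Palavra"][0])
```

Writing UTF-8 everywhere also avoids the UnicodeEncodeError from the question, since UTF-8 can represent every Unicode character while latin1 cannot.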
