UnicodeDecodeError: with apply function in column for each row

Question

221 views Asked by Fatima At 02 December 2020 at 13:12

I have a dataframe and I want to encode each word in one of its columns using Soundex. I have to split each value into words first, because Soundex encodes only the first word.

I applied this line of code, but I got this error:

table['soundex'] = table['name'].apply(lambda x:' '.join([jellyfish.soundex(i) for i in x.split()]))

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xce in position 0: invalid continuation byte

When I apply the same code to other columns it works, and they all have the same data type.

My data source is a database, and I created the name column through cleansing steps, so it does not come directly from the data source.

Most solutions for UnicodeDecodeError deal with reading CSV files, and in my case I do not know what causes this error.

A random sample of the data and the expected output:

name                       soundex
hospital food              H213 F300
good after noon            G300 A136 N500
hi                         H000

Any help?

Original Q&A

There is 1 answer

**Fatima** · Answer 1 · 2020-12-02T21:40:31+00:00

I solved it by removing the non-English (non-ASCII) characters with this line of code:

table.name=table.name.str.encode('ascii', 'ignore').str.decode('ascii')
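
To make clear what that line does to each value, here is a minimal sketch using only the standard library. The helper name strip_non_ascii and the sample strings are made up for illustration; they are not from my real data. The 'ignore' error handler silently drops every character that has no ASCII encoding, so accented or non-Latin letters disappear instead of being transliterated.

```python
def strip_non_ascii(text: str) -> str:
    """Drop every character that cannot be encoded as ASCII.

    Hypothetical helper: it mirrors, for a single string, what
    .str.encode('ascii', 'ignore').str.decode('ascii') does per row.
    """
    return text.encode("ascii", "ignore").decode("ascii")


# Illustrative values only; \u00e9 is "e with acute", \u03a9 is Greek capital omega.
samples = ["hospital food", "caf\u00e9 noon", "\u03a9mega hi"]
cleaned = [strip_non_ascii(s) for s in samples]

print(cleaned)  # ['hospital food', 'caf noon', 'mega hi']
```

Note that the cleanup is lossy: a word that loses letters this way can end up with a different Soundex code than the original spelling would suggest.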

reference:

https://stackoverflow.com/a/56744855/10718214
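
Before dropping characters, it may help to check which values actually contain non-ASCII characters, since that seems to be what separated the failing column from the ones that worked. This is only a sketch under that assumption; the function name non_ascii_chars and the sample names are hypothetical.

```python
from typing import Dict, List


def non_ascii_chars(text: str) -> List[str]:
    """Return the characters in text whose code point is outside ASCII."""
    return [ch for ch in text if ord(ch) > 127]


# Illustrative values only; \u039d is Greek capital nu, which looks like a Latin "N".
names = ["hospital food", "good after noon", "\u039dice hi"]

# Keep only the values that contain at least one non-ASCII character.
suspects: Dict[str, List[str]] = {}
for name in names:
    found = non_ascii_chars(name)
    if found:
        suspects[name] = found

# ascii() prints escapes, so look-alike characters are easy to spot.
for name, chars in suspects.items():
    print(ascii(name), [ascii(ch) for ch in chars])
```

With pandas, the same function could presumably be passed to table['name'].apply(...) to flag the affected rows before cleaning them.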
