I was trying to convert a data frame to a parquet file, but I got the following error:
result = pa.array(col, type=type_, from_pandas=True, safe=safe)
File "pyarrow\array.pxi", line 265, in pyarrow.lib.array
File "pyarrow\array.pxi", line 80, in pyarrow.lib._ndarray_to_array
File "pyarrow\error.pxi", line 107, in pyarrow.lib.check_status
pyarrow.lib.ArrowTypeError: ('Expected a string or bytes dtype, got float64', 'Conversion failed for column NOTES with type float64')
The column type is varchar, so it should be converted to str. But a few records in that column hold numeric values, and I suspect the data frame parses them as float. So when converting to parquet, it encounters a float value, which produces the error.
Is there a way to convert the values of these records to str?
I tried using astype(str), but it didn't work.
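Yes, Parquet expects a single type per column. To fix a case like the one above (i.e. mixed value types), convert the column to the pandas 'string' dtype, for example with df['NOTES'].astype('string'), and assign the result back to the column before writing.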
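A minimal sketch of that conversion, assuming a reasonably recent pandas (the 'string' dtype needs 1.0+, and converting non-string values this way should work from about 1.1). The sample data and the output file name are made up for illustration; only the NOTES column name comes from the error message.

```python
import numpy as np
import pandas as pd

# Hypothetical sample: an object column mixing text, floats, and a missing value,
# similar to what the NOTES column in the question appears to contain.
df = pd.DataFrame({"NOTES": ["call back", 12.5, np.nan, "ok", 3.0]})

# Convert every non-missing value to text using the pandas 'string' dtype.
# The result must be assigned back; astype() does not modify the column in place.
df["NOTES"] = df["NOTES"].astype("string")

print(df["NOTES"].dtype)
print(df["NOTES"].tolist())

# With a single type per column, writing should no longer hit the ArrowTypeError.
# Requires pyarrow (or fastparquet); the file name is a placeholder.
# df.to_parquet("notes.parquet")
```

Unlike astype(str), which turns missing values into the literal text 'nan', the 'string' dtype keeps them as missing. If astype(str) seemed to have no effect, one possible cause (a guess, since the original call isn't shown) is that its result was not assigned back to the column.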