I have a DataFrame with columns of different dtypes: bool, int, float, datetime, and category. Currently I convert them as follows.

# Before pandas 2.0

1. object -> string
2. object -> datetime64[ns] # if date

With pandas 2.0 or later, I am trying to use pyarrow dtypes for all columns and save the result in Parquet format.

The following conversions work:

int8 -> int8[pyarrow], and likewise for the other int types
float16 -> float16[pyarrow], and likewise for the other float types
string or object -> string[pyarrow]

For example:

df['col_int'] = df['col_int'].astype('int8[pyarrow]')

I could not find much on how to convert datetime and category columns using astype() for the cases below:

1. datetime -> timestamp # if date
2. category -> dictionary

For example:

df['col_date'] = df['col_date'].astype(???)
df['col_dictionary'] = df['col_dictionary'].astype(???)

Please help.
