Trying to extract the year from a dataset in Python
df["YYYY"] = pd.DatetimeIndex(df["Date"]).year
The year appears with a decimal point in the new column:
YYYY
2001.0
2002.0
2015.0
2022.0
How can I make the year appear without the decimal point?
You likely have null values in your input. They become NaN, and since NaN is a float, the whole column gets a float dtype.
No missing values:
pd.DatetimeIndex(['2022-01-01']).year
Int64Index([2022], dtype='int64')
Missing values:
pd.DatetimeIndex(['2022-01-01', '']).year
Float64Index([2022.0, nan], dtype='float64')
I suggest using pandas.to_datetime combined with convert_dtypes:
pd.to_datetime(pd.Series(['2022-01-01', ''])).dt.year.convert_dtypes()
0    2022
1    <NA>
dtype: Int64
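Applied to the column from the question, the same idea might look like the sketch below. The sample Date values are made up for illustration, and errors="coerce" is an added assumption that turns unparseable entries into missing values instead of raising an error.

```python
import pandas as pd

# Hypothetical sample data; the empty string stands in for a missing date.
df = pd.DataFrame({"Date": ["2001-05-01", "2002-07-15", "", "2022-01-01"]})

# Parse the strings; entries that cannot be parsed become NaT.
dates = pd.to_datetime(df["Date"], errors="coerce")

# "Int64" (capital I) is the nullable integer dtype, so missing years
# show up as <NA> and the rest stay whole numbers.
df["YYYY"] = dates.dt.year.astype("Int64")
print(df["YYYY"])
```

If the column is guaranteed to have no missing dates, a plain astype(int) would also work, but it raises as soon as a NaN is present.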
Alternatively, you could extract the year directly from the original strings, but for that we would need a sample of the input.
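Without the real data this can only be a rough sketch. It assumes, purely for illustration, that every non-empty string starts with a four-digit year (as in "2022-01-01"); a different layout would need a different pattern.

```python
import pandas as pd

# Hypothetical input strings; the real format is not known.
s = pd.Series(["2001-05-01", "", "2015-08-15"])

# Capture four leading digits; rows that do not match become NaN.
year_text = s.str.extract(r"^(\d{4})", expand=False)

# Convert to numbers, then to the nullable integer dtype.
years = pd.to_numeric(year_text, errors="coerce").astype("Int64")
print(years)
```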
Here is a sample program for your problem.
pandas usually takes care of parsing the dates by itself.
If it does not, we can specify it directly.
I hope this makes things clear to you.
If not, can you post a sample of the df?
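The code from this answer is not shown above. As one possible reading of "specify directly", the sketch below passes an explicit format to pandas.to_datetime. The day/month/year layout and the sample values are assumptions, not taken from the original post.

```python
import pandas as pd

# Hypothetical dates written as day/month/year.
df = pd.DataFrame({"Date": ["01/05/2001", "15/08/2015", "03/02/2022"]})

# Tell pandas the exact layout instead of letting it guess.
parsed = pd.to_datetime(df["Date"], format="%d/%m/%Y")

# With no missing values, the year comes back as an integer column.
df["YYYY"] = parsed.dt.year
print(df["YYYY"].tolist())
```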