I calculate the difference between two columns of pd.Period values. Before pandas 0.24 this returned integers, but since 0.24 it returns a Series of offset objects such as 1 * MonthEnds, 2 * MonthEnds, ... I want to convert these to int.

I can use apply to make this happen, for example,

df.apply(lambda x: x['z'].n, axis=1)

or

((df['x'] - df['y']) / np.timedelta64(1, 'M')).round()

But I want to know whether there is another workaround.

df = pd.DataFrame({'x':pd.date_range(start='2001-01-01', periods=10), 'y':pd.date_range(start='2002-01-01', periods=10)})

Before pandas 0.24, the following code returned an int column:

df['z'] = df['x'].dt.to_period('M') - df['y'].dt.to_period('M')

but 0.24 changed the return type. As mentioned above, there are two ways to still get an int column, but I would like to know if there are other ways to do this.
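For reference, the behaviour can be reproduced with the sketch below. It assumes pandas 0.24 or later, where subtracting two Period columns yields an object column of offsets; `z_int` is a name introduced here only for illustration. It reads `.n` from each offset with `Series.apply`, a column-level variant of the first workaround.

```python
import pandas as pd

df = pd.DataFrame({
    'x': pd.date_range(start='2001-01-01', periods=10),
    'y': pd.date_range(start='2002-01-01', periods=10),
})

# In pandas >= 0.24 this is an object column of offsets, not ints
df['z'] = df['x'].dt.to_period('M') - df['y'].dt.to_period('M')

# Each offset stores its integer multiple in the .n attribute
z_int = df['z'].apply(lambda offset: offset.n)
print(z_int.tolist())
```

This still calls a Python function once per row, so it is not expected to be faster than the original `apply`.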

2 Answers

Quang Hoang

One way is the good old list comprehension:

df['z'] = [a.year*12 + a.month - b.year*12 - b.month for a,b in zip(df.x, df.y)]
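The same year and month arithmetic can probably be done on whole columns through the `.dt` accessor instead of a Python loop. This is a sketch, assuming `x` and `y` are datetime columns as in the question:

```python
import pandas as pd

df = pd.DataFrame({
    'x': pd.date_range(start='2001-01-01', periods=10),
    'y': pd.date_range(start='2002-01-01', periods=10),
})

# Month difference = 12 * (year difference) + (month difference),
# computed column-wise rather than element by element
df['z'] = (
    (df['x'].dt.year - df['y'].dt.year) * 12
    + (df['x'].dt.month - df['y'].dt.month)
)
print(df['z'].tolist())
```

The result is an integer column, though its exact width (int32 or int64) may depend on the pandas version.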
Chris

Using astype(int) on each Period column before subtracting returns ints rather than DateOffset objects:

df['x'].dt.to_period('M').astype(int) - df['y'].dt.to_period('M').astype(int)

0   -12
1   -12
2   -12
3   -12
4   -12
5   -12
6   -12
7   -12
8   -12
9   -12
dtype: int64
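The reason this works, as far as I understand it, is that casting a Period column to an integer gives each period's ordinal, which for monthly frequency is a running month count, so subtracting ordinals gives the month difference. The sketch below checks that against `Period.ordinal`. The names `x_months`, `x_ord`, `same` and `diff` are illustrative only, and writing `'int64'` instead of `int` is an assumption made here to avoid a platform-dependent integer width.

```python
import pandas as pd

df = pd.DataFrame({
    'x': pd.date_range(start='2001-01-01', periods=10),
    'y': pd.date_range(start='2002-01-01', periods=10),
})

x_months = df['x'].dt.to_period('M')
y_months = df['y'].dt.to_period('M')

# Casting a Period column to int64 yields one integer per period
x_ord = x_months.astype('int64')
y_ord = y_months.astype('int64')

# Compare with the per-element Period.ordinal attribute
same = x_ord.tolist() == [p.ordinal for p in x_months]

# Subtracting the integers gives the difference in months
diff = x_ord - y_ord
print(same, diff.tolist())
```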