Python Calculate New Date Based on Date Range


I have a Python Pandas DataFrame containing birth dates of hockey players that looks like this:

Player         Birth Year    Birth Date
Player A         1990        1990-05-12
Player B         1991        1991-10-30
Player C         1992        1992-09-10
Player D         1990        1990-11-15

I want to create a new column labeled 'Draft Year' that is calculated based on this rule:

If MM-DD is before 09-15, Draft Year = Birth Year + 18
Else if MM-DD is after 09-15, Draft Year = Birth Year + 19

This would make the output from the example:

Player         Birth Year    Birth Date     Draft Year
Player A         1990        1990-05-12      2008
Player B         1991        1991-10-30      2010
Player C         1992        1992-09-10      2010
Player D         1990        1990-11-15      2009

I've tried separating the MM-DD from the date format by using

Data['Birth Date'] = Data['Birth Date'].str.split('-').str[1:]

But that returns a list of [mm, dd] for each row, which is tricky to work with. Any suggestions on how to do this concisely would be greatly appreciated!

There are 3 answers

Code Different (best answer)

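Dates in the form yyyy-mm-dd sort correctly as strings, and the same holds for the zero-padded mm-dd part on its own. This solution takes advantage of that fact. It assumes 'Birth Date' is already a datetime column (the .dt accessor fails on plain strings) and that numpy is imported as np: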

df['Draft Year'] = df['Birth Year'] + np.where(df['Birth Date'].dt.strftime('%m-%d') < '09-15', 18, 19)
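
For anyone who wants to run this end to end, here is a minimal sketch built on the sample data from the question. It assumes 'Birth Date' starts out as strings (as the .str.split attempt in the question suggests), so it converts the column with pd.to_datetime first. The frame construction and the before_cutoff name are only illustrative.

```python
import numpy as np
import pandas as pd

# Sample data from the question; 'Birth Date' is assumed to arrive as strings.
df = pd.DataFrame({
    "Player": ["Player A", "Player B", "Player C", "Player D"],
    "Birth Year": [1990, 1991, 1992, 1990],
    "Birth Date": ["1990-05-12", "1991-10-30", "1992-09-10", "1990-11-15"],
})

# The .dt accessor only works on datetime columns, so convert first.
df["Birth Date"] = pd.to_datetime(df["Birth Date"])

# Zero-padded 'mm-dd' strings compare in calendar order.
before_cutoff = df["Birth Date"].dt.strftime("%m-%d") < "09-15"
df["Draft Year"] = df["Birth Year"] + np.where(before_cutoff, 18, 19)

print(df)
```

Because the comparison is a strict <, a birthday falling exactly on 09-15 gets Birth Year + 19 here.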

ansev

Use numpy.where:

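import numpy as np
import pandas as pd

data['Birth Date'] = pd.to_datetime(data['Birth Date'])  # convert strings to datetime
# On or after September 15: September with day >= 15, or any later month
cond = (data['Birth Date'].dt.month == 9) & (data['Birth Date'].dt.day >= 15)
cond2 = data['Birth Date'].dt.month >= 10
data['Draft Year'] = np.where(cond | cond2, data['Birth Year'] + 19, data['Birth Year'] + 18)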

print(data)

Output

    Player  Birth Year Birth Date  Draft Year
0  PlayerA        1990 1990-05-12        2008
1  PlayerB        1991 1991-10-30        2010
2  PlayerC        1992 1992-09-10        2010
3  PlayerD        1990 1990-11-15        2009

piRSquared

Quick and Dirty

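Encode each date as an integer, 100 * month + day, so that September 15 becomes 915, then compare against that cutoff. This assumes 'Birth Date' is a datetime column.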

cutoff = df['Birth Date'].pipe(lambda d: d.dt.month * 100 + d.dt.day)
df['Draft Year'] = df['Birth Year'] + 18 + (cutoff > 915)

df

     Player  Birth Year Birth Date  Draft Year
0  Player A        1990 1990-05-12        2008
1  Player B        1991 1991-10-30        2010
2  Player C        1992 1992-09-10        2010
3  Player D        1990 1990-11-15        2009
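
One thing the question leaves open is a birthday exactly on 09-15. The sketch below uses a hypothetical one-row frame (the names edge, by_string and by_integer are made up for illustration) to show that the string comparison with a strict < and this integer encoding with a strict > put that date on opposite sides of the cutoff.

```python
import numpy as np
import pandas as pd

# Hypothetical one-row frame: a player born exactly on the cutoff date.
edge = pd.DataFrame({
    "Birth Year": [1990],
    "Birth Date": pd.to_datetime(["1990-09-15"]),
})

# String comparison with '<': '09-15' is not before '09-15', so + 19.
mm_dd = edge["Birth Date"].dt.strftime("%m-%d")
by_string = edge["Birth Year"] + np.where(mm_dd < "09-15", 18, 19)

# Integer encoding with '>': 915 is not greater than 915, so + 18.
encoded = edge["Birth Date"].dt.month * 100 + edge["Birth Date"].dt.day
by_integer = edge["Birth Year"] + 18 + (encoded > 915)

print(by_string.iloc[0], by_integer.iloc[0])
```

If September 15 birthdays should get + 19, change the last comparison to cutoff >= 915; if they should get + 18, change the string comparison to <= '09-15'.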