Pandas: How to extract and calculate the number of hours per row in a DataFrame


I have a data frame representing the schedule of some restaurants in a week.

  • What I want to do is add a column week_hours to my initial DataFrame df that represents the total number of hours each restaurant is open per week.

Note: the value 0 simply means that the restaurant is closed that day.

import pandas as pd

data = {
    'restaurant_id': ['1', '2', '3'],
    'Monday': ['11:0-20:0', '11:30-22:0', '11:30-21:0'],
    'Tuesday': ['11:0-20:0', '11:30-22:0', '11:30-22:0'],
    'Wednesday': ['11:0-20:0', '11:30-22:0', '11:30-21:0'],
    'Thursday': ['11:0-20:0', '11:30-22:0', '11:30-21:0'],
    'Friday': ['11:0-22:0', '11:30-22:0', '11:30-21:0'],
    'Saturday': ['11:0-22:0', '12:0-22:0', '0'],
    'Sunday': ['11:0-17:0', '16:30-21:30', '11:30-21:0'],
}

df = pd.DataFrame(data, columns=['restaurant_id', 'Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday'])
df.head()

What could be a simple syntax to achieve this?


There are 2 answers

2
Henry Yik On BEST ANSWER

One way is to first transpose and then create a custom function to extract the values before the calculation (note that this works only if the hours don't go past 24; otherwise you need Timedelta):

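def find_diff(col):
    # col holds one restaurant's seven daily ranges (one column of the transposed frame)
    # Capture groups 0-3: start hour, start minute, end hour, end minute; '0' (closed) gives NaN
    data = col.str.extract(r"(\d*):(\d*)-(\d*):(\d*)").astype(float)
    hours = (data[2] - data[0]).sum()         # sum() skips the NaN rows
    minutes = (data[3] - data[1]).sum() / 60  # minute differences converted to hours
    return hours + minutes

print(df.set_index("restaurant_id").T.apply(find_diff))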

restaurant_id
1    64.0
2    67.5
3    58.0
dtype: float64
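
For the case mentioned above where the hours go past 24, here is a minimal sketch using Timedelta. It assumes that a closing time earlier than the opening time (for example 18:0-2:30) means the restaurant closes after midnight; that convention, the helper name weekly_hours, and the sample frame are illustrative assumptions and not part of the original question.

```python
import pandas as pd

def weekly_hours(df, day_cols):
    """Total open hours per row, computed with Timedelta arithmetic (hypothetical helper)."""
    total = pd.Series(pd.Timedelta(0), index=df.index)
    for day in day_cols:
        # Groups 0-3: start hour, start minute, end hour, end minute.
        # A closed day ('0') does not match the pattern and becomes NaN.
        parts = df[day].str.extract(r"(\d+):(\d+)-(\d+):(\d+)").astype(float)
        start = pd.to_timedelta(parts[0], unit="h") + pd.to_timedelta(parts[1], unit="m")
        end = pd.to_timedelta(parts[2], unit="h") + pd.to_timedelta(parts[3], unit="m")
        span = end - start
        # Assumption: a negative span means the range crosses midnight, so add one day.
        span = span.mask(span < pd.Timedelta(0), span + pd.Timedelta(days=1))
        # Closed days (NaT) contribute zero hours.
        total = total + span.fillna(pd.Timedelta(0))
    return total.dt.total_seconds() / 3600

# Made-up sample: the second restaurant closes at 2:30 the next morning.
sample = pd.DataFrame({
    "restaurant_id": ["1", "2"],
    "Friday": ["11:0-22:0", "18:0-2:30"],
    "Saturday": ["0", "18:0-2:30"],
})
hours = weekly_hours(sample, ["Friday", "Saturday"])
print(hours)
```

With the original data, calling weekly_hours(df, [...]) with the seven day columns should give the same totals as find_diff, since none of those ranges cross midnight.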
0
Uriya Harpeness On

You can use apply; see this answer for more info: https://stackoverflow.com/a/40045819/14273548.

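def calc_hours(ranges):
    # One possible implementation: hours in a single "H:M-H:M" range (assumes it does not cross midnight)
    if ranges == '0':  # '0' means the restaurant is closed that day
        return 0.0
    start, end = (int(h) + int(m) / 60 for h, m in (t.split(':') for t in ranges.split('-')))
    return end - start

df['Weekly Hours'] = df.apply(lambda x: sum(calc_hours(x[day]) for day in ['Monday','Tuesday','Wednesday','Thursday','Friday','Saturday','Sunday']), axis=1)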