Pandas: How to extract and calculate the number of “hour” per row in a Dataframe

Question

199 views Asked by Lynn At 30 September 2020 at 04:31

I have a data frame representing the schedule of some restaurants in a week.

What I want to do is add a column week_hours to my initial DataFrame df that holds the total number of hours each restaurant is open per week.

Note: the value 0 simply means that the restaurant is closed that day.

import pandas as pd

data = {
    'restaurant_id': ['1', '2', '3'],
    'Monday': ['11:0-20:0', '11:30-22:0', '11:30-21:0'],
    'Tuesday': ['11:0-20:0', '11:30-22:0', '11:30-22:0'],
    'Wednesday': ['11:0-20:0', '11:30-22:0', '11:30-21:0'],
    'Thursday': ['11:0-20:0', '11:30-22:0', '11:30-21:0'],
    'Friday': ['11:0-22:0', '11:30-22:0', '11:30-21:0'],
    'Saturday': ['11:0-22:0', '12:0-22:0', '0'],
    'Sunday': ['11:0-17:0', '16:30-21:30', '11:30-21:0'],
}

df = pd.DataFrame(data, columns=['restaurant_id', 'Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday'])
df.head()

What could be a simple syntax to achieve this?

There are 2 answers

Uriya Harpeness On 30 September 2020 at 04:47

You can use apply with axis=1 to compute a value per row; see this answer for more info: https://stackoverflow.com/a/40045819/14273548.

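def calc_hours(day_range):
    # Hours open for one 'H:M-H:M' string (same-day closing); '0' means closed.
    if day_range == '0':
        return 0.0
    (h1, m1), (h2, m2) = (map(int, t.split(':')) for t in day_range.split('-'))
    return (h2 - h1) + (m2 - m1) / 60

df['Weekly Hours'] = df.apply(lambda x: sum(calc_hours(x[day]) for day in ['Monday','Tuesday','Wednesday','Thursday','Friday','Saturday','Sunday']), axis=1)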

Henry Yik (Accepted Answer) On 30 September 2020 at 04:48

One way is to transpose first and then apply a custom function that extracts the values before doing the calculation (note that this works only if the closing time does not run past 24:00; otherwise you need Timedelta):

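def find_diff(col):
    # Capture start hour, start minute, end hour, end minute as columns 0-3.
    # A closed day ('0') does not match, yields NaN, and is skipped by sum().
    data = col.str.extract(r"(\d*):(\d*)-(\d*):(\d*)").astype(float)
    hours = (data[2] - data[0]).sum()
    minutes = (data[3] - data[1]).sum()/60
    return hours+minutes

# After transposing, each column holds one restaurant's seven day strings.
print(df.set_index("restaurant_id").T.apply(find_diff))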

restaurant_id
1    64.0
2    67.5
3    58.0
dtype: float64
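
The accepted answer prints a Series rather than adding the week_hours column the question asks for, and it mentions Timedelta for hours that run past 24. A minimal sketch of both follows. It is not from either answer: the helper name open_hours, the DAYS list, the made-up restaurant '4', and the rule that a closing time at or before the opening time means "the next day" are all assumptions for illustration.

```python
import pandas as pd

DAYS = ['Monday', 'Tuesday', 'Wednesday', 'Thursday',
        'Friday', 'Saturday', 'Sunday']

def open_hours(day_range):
    """Hours open for one 'H:M-H:M' string; '0' (closed) gives 0.0."""
    if day_range == '0':
        return 0.0
    # Turn each 'H:M' side into a Timedelta measured from midnight.
    start, end = (pd.Timedelta(hours=int(h), minutes=int(m))
                  for h, m in (t.split(':') for t in day_range.split('-')))
    if end <= start:
        # Assumption: closing at or before opening means the next day.
        end += pd.Timedelta(days=1)
    # Dividing two Timedeltas gives a plain float number of hours.
    return (end - start) / pd.Timedelta(hours=1)

# Small stand-in frame; restaurant '4' is a hypothetical overnight schedule.
df = pd.DataFrame({
    'restaurant_id': ['1', '4'],
    **{day: ['11:0-20:0', '18:0-2:0'] for day in DAYS},
})

# Sum the seven daily values row by row and store them as a new column.
df['week_hours'] = df[DAYS].apply(
    lambda row: sum(open_hours(v) for v in row), axis=1)
print(df[['restaurant_id', 'week_hours']])
```

With this stand-in data, restaurant '1' is open 9 hours a day (63.0 per week) and the overnight restaurant '4' is open 8 hours a day (56.0 per week). For schedules that never cross midnight, this should give the same totals as the subtraction in the accepted answer.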
