Calculate the average of multiple columns, based on value of the first column


Given the following DataFrame:

Name score1 score2 score3
Bob 100 120 900
Lisa 40 120 90
Bob 590 490 80
Tim 100 120 900
Tim 40 120 90
Bob 590 490 80

I would like to calculate the average of all columns, for each person in the Name column. So for Bob, I would like to get one average of all 9 values.

I know that the code below will calculate the average of multiple columns. How can I make it calculate the average for all rows with the same name?

df['averages'] = df[['score1', 'score2', 'score3']].mean(axis=1)

There are 2 answers

mozway On

You could use groupby.apply with numpy.mean (assuming numpy is imported as np):

df.set_index('Name').groupby(level=0).apply(lambda g: np.mean(g.to_numpy()))

Or stack:

df.set_index('Name').stack().groupby(level=0).mean()

Or compute the sum and divide by the number of cells:

g = df.groupby('Name')
out = g.sum().sum(axis=1).div(g.size() * (df.shape[1] - 1))

Output:

Name
Bob     382.222222
Lisa     83.333333
Tim     228.333333
dtype: float64
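
For anyone who wants to reproduce this, here is a minimal, self-contained sketch that rebuilds the sample data from the question and checks that the stack and sum-and-divide approaches agree. The variable names `stacked` and `manual` are mine, and the lowercase column names are assumed from the table in the question.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question (lowercase column names assumed)
df = pd.DataFrame({
    "Name": ["Bob", "Lisa", "Bob", "Tim", "Tim", "Bob"],
    "score1": [100, 40, 590, 100, 40, 590],
    "score2": [120, 120, 490, 120, 120, 490],
    "score3": [900, 90, 80, 900, 90, 80],
})

# Approach 1: stack all score columns into one long Series, then average per name
stacked = df.set_index("Name").stack().groupby(level=0).mean()

# Approach 2: total of all cells per name divided by the number of cells per name
g = df.groupby("Name")
manual = g.sum().sum(axis=1).div(g.size() * (df.shape[1] - 1))

# Both approaches should give the same per-name averages
print(stacked)
print(np.allclose(stacked.to_numpy(), manual.to_numpy()))
```

For Bob this works out to 3440 / 9, roughly 382.22, which matches the output above.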
PaulS On

Another possible solution:

df.pivot_table(index='Name', aggfunc='mean').mean(axis=1)

Output:

Name
Bob     382.222222
Lisa     83.333333
Tim     228.333333
dtype: float64
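
One thing to keep in mind, as far as I can tell: this takes the mean of the per-column means, which equals the mean of all values only when every score column has the same number of non-missing entries per name, as in the sample data. The sketch below uses made-up data (`df_nan` is hypothetical, not from the question) with one missing score to show where the two can diverge.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing score (not from the question)
df_nan = pd.DataFrame({
    "Name": ["Bob", "Bob"],
    "score1": [100.0, 200.0],
    "score2": [300.0, np.nan],
})

# Mean of per-column means: ((100 + 200) / 2 + 300 / 1) / 2
mean_of_means = df_nan.pivot_table(index="Name", aggfunc="mean").mean(axis=1)

# Pooled mean over all non-missing values: (100 + 200 + 300) / 3
pooled = df_nan.set_index("Name").stack().groupby(level=0).mean()

print(mean_of_means.loc["Bob"], pooled.loc["Bob"])
```

If the data can contain missing values and you want every available value weighted equally, a pooled approach such as stack is probably the safer choice.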