Pandas div with multiple index

I am facing the following problem. I have a DataFrame with a MultiIndex (three levels here):

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randint(2, 8, size = (8, 1)))
df.index = pd.MultiIndex.from_tuples([(1990, 'Women','type_A'), (1990, 'Women','type_B'),(1990, 'Men','type_A'), (1990, 'Men','type_B'),
(1991, 'Women','type_A'), (1991, 'Women','type_B'),(1991, 'Men','type_A'), (1991, 'Men','type_B')])
df.index.names = ['Year', 'Gender','Type']
df.columns = ['Total']

which looks like:

                     Total
Year Gender Type         
1990 Women  type_A      5
            type_B      7
     Men    type_A      6
            type_B      2
1991 Women  type_A      2
            type_B      6
     Men    type_A      3
            type_B      5

I have been trying to compute the share of each Type within each Year and Gender group, but I have not found a clear answer on Stack Overflow. In the end I need the following DataFrame:

                     Share
Year Gender Type          
1990 Women  type_A  0.4166
            type_B  0.5833
     Men    type_A  0.7500
            type_B  0.2500
1991 Women  type_A  0.2500
            type_B  0.7500
     Men    type_A  0.3750
            type_B  0.6250

Normally I would do this with the div function, but it does not seem to work here with more than one index level. Has anyone faced a similar situation? Thanks in advance!

1 answer

Answer by Psidom (best answer):

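One option is to compute the sum for each Year and Gender group and then divide the original DataFrame by it. Using transform('sum') returns the group sums on the original index, so the division aligns row by row. (The result is slightly different from yours because you didn't set a seed for the random generator.)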

df/df.groupby(level=[0, 1]).transform('sum')
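
For a reproducible check, here is a minimal sketch that replaces the random values with the totals shown in the question's table and refers to the index levels by name instead of position. The variable names group_totals and share, and the rename of the column to 'Share', are my own choices for illustration and not part of the original answer.

```python
import pandas as pd

# Same index as in the question, built with level names in one step.
index = pd.MultiIndex.from_tuples(
    [(1990, 'Women', 'type_A'), (1990, 'Women', 'type_B'),
     (1990, 'Men', 'type_A'), (1990, 'Men', 'type_B'),
     (1991, 'Women', 'type_A'), (1991, 'Women', 'type_B'),
     (1991, 'Men', 'type_A'), (1991, 'Men', 'type_B')],
    names=['Year', 'Gender', 'Type'],
)

# Fixed totals copied from the question's printed table (not random).
df = pd.DataFrame({'Total': [5, 7, 6, 2, 2, 6, 3, 5]}, index=index)

# transform('sum') returns one value per original row: the sum of its
# (Year, Gender) group, carried on the original three-level index.
group_totals = df.groupby(level=['Year', 'Gender']).transform('sum')

# Element-wise division; both frames share the same index and column.
share = (df / group_totals).rename(columns={'Total': 'Share'})
print(share)
```

With these totals, the 1990 Women group sums to 12, so the shares are 5/12 and 7/12, which matches the 0.4166 and 0.5833 in the desired output; the other groups each sum to 8.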