Calculate correction factor in Python


I have three dataframes:

true false category 
36   25    3
40   25    3
46   23    3
40   22    5
42   20    4
56   39    3
50   40    3
44   27    4
51   39    5
54   31    5
50   38    4

I am trying to calculate, for each category, a correction factor to correct the "false" values. For example, for category 5:

correction1 = 40/22 = 1.82
correction2 = 51/39 = 1.31
correction3 = 54/31 = 1.74

The arithmetic mean of these correction factors is 1.62. So the result should be one averaged correction factor per category.

Question: Is there a built-in function in Python/NumPy to calculate this?

There are 2 answers

Answer by jottbe (best answer):

You can do this as follows:

(df['true'].div(df['false'])).groupby(df['category']).mean()

This computes the row-wise ratios true/false, groups them by category, and takes the mean of each group.
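The two steps can be spelled out separately, which may make the one-liner easier to follow. The following is a minimal sketch on a subset of the question's rows (categories 4 and 5 only). The variable names `ratio`, `factors` and `corrected` are illustrative, and the last step assumes the factor is meant to be applied multiplicatively to the "false" column, which the question implies but does not state.

```python
import pandas as pd

# Subset of the question's data: the rows for categories 5 and 4.
df = pd.DataFrame({
    "true":     [40, 51, 54, 42, 44, 50],
    "false":    [22, 39, 31, 20, 27, 38],
    "category": [5, 5, 5, 4, 4, 4],
})

# Step 1: one ratio true/false per row.
ratio = df["true"].div(df["false"])

# Step 2: group the ratios by category and average within each group.
factors = ratio.groupby(df["category"]).mean()

# Assumption: the correction is applied by multiplying each "false"
# value with the factor of its own category.
corrected = df["false"] * df["category"].map(factors)

print(factors)
```

Grouping a Series by another Series of the same length, as in step 2, avoids having to add a temporary ratio column to `df`.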

To build the test data:

from io import StringIO

import pandas as pd

infile = StringIO(
"""true false category 
36   25    3
40   25    3
46   23    3
40   22    5
42   20    4
56   39    3
50   40    3
44   27    4
51   39    5
54   31    5
50   38    4""")
df = pd.read_csv(infile, sep=r'\s+', dtype='int16')

The result is:

category
3    1.545179
4    1.681806
5    1.622603
dtype: float64
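As a cross-check of these numbers without pandas, here is a minimal sketch using only the standard library. It is not part of the answer above; the names `rows`, `ratios` and `factors` are illustrative.

```python
from collections import defaultdict
from statistics import mean

# (true, false, category) rows from the question.
rows = [
    (36, 25, 3), (40, 25, 3), (46, 23, 3), (40, 22, 5),
    (42, 20, 4), (56, 39, 3), (50, 40, 3), (44, 27, 4),
    (51, 39, 5), (54, 31, 5), (50, 38, 4),
]

# Collect the ratio true/false of every row under its category.
ratios = defaultdict(list)
for t, f, cat in rows:
    ratios[cat].append(t / f)

# Average the ratios of each category.
factors = {cat: mean(values) for cat, values in sorted(ratios.items())}

for cat, factor in factors.items():
    print(cat, round(factor, 6))
```

Rounded to six decimals, this should reproduce the three values shown above.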

Answer by tatarana:

In case you want to stick with NumPy:

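import numpy as np

# Assumes df1, df2 and df3 hold the "true", "false" and "category" values (1-D).
ratio = np.asarray(df1) / np.asarray(df2)
df3 = np.asarray(df3)
# Mean ratio per distinct category value.
mean = {c: np.mean(ratio[df3 == c]) for c in np.unique(df3)}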

Output for your example data:

{3: 1.5451794871794873, 4: 1.68180636777128, 5: 1.6226032032483646}

That said, I do like jottbe's answer, and if you are already using pandas, that is probably the way to go.
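The dictionary comprehension scans the ratio array once per category. For many categories, a variant based on `np.bincount` may be worth considering. This is only a sketch, and it assumes the category labels are small non-negative integers, as they are in the question; the array names are illustrative.

```python
import numpy as np

# Columns from the question as plain arrays.
true_vals = np.array([36, 40, 46, 40, 42, 56, 50, 44, 51, 54, 50])
false_vals = np.array([25, 25, 23, 22, 20, 39, 40, 27, 39, 31, 38])
cats = np.array([3, 3, 3, 5, 4, 3, 3, 4, 5, 5, 4])

ratio = true_vals / false_vals

# Sum of ratios and number of rows per category label (index = label).
sums = np.bincount(cats, weights=ratio)
counts = np.bincount(cats)

# Keep only labels that actually occur, then divide to get the means.
labels = np.nonzero(counts)[0]
means = sums[labels] / counts[labels]

for label, m in zip(labels, means):
    print(int(label), float(m))
```

`np.bincount` requires non-negative integer labels, so other label types would first need to be mapped to integer codes, for example with `np.unique(..., return_inverse=True)`.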