Python Pandas Calculating Percentile per row

Question

5.1k views Asked by Travis At 29 December 2024 at 04:46

I have the following code and would like to create a new column per Transaction Number and Description that represents the 99th percentile of each row.

I am really struggling to achieve this - it seems that most posts cover calculating the percentile on the column.

Is there a way to achieve this? I would expect a new column to be created, with one value for each of the two rows.

import pandas as pd

df_baseScenario = pd.DataFrame({'Transaction Number' : [1,10],
                            'Description'      :['asf','def'],
                            'Calc_PV_CF_2479.0':[4418494.085,-3706270.679],
                            'Calc_PV_CF_2480.0':[4415476.321,-3688327.494],
                            'Calc_PV_CF_2481.0':[4421698.198,-3712887.034],
                            'Calc_PV_CF_2482.0':[4420541.944,-3706402.147],
                            'Calc_PV_CF_2483.0':[4396063.863,-3717554.946],
                            'Calc_PV_CF_2484.0':[4397897.082,-3695272.043],
                            'Calc_PV_CF_2485.0':[4394773.762,-3724893.702],
                            'Calc_PV_CF_2486.0':[4384868.476,-3741759.048],
                            'Calc_PV_CF_2487.0':[4379614.337,-3717010.873],
                            'Calc_PV_CF_2488.0':[4389307.584,-3754514.639],
                            'Calc_PV_CF_2489.0':[4400699.929,-3741759.048],
                            'Calc_PV_CF_2490.0':[4379651.262,-3714723.435]})

There are 2 answers

ashishsingal On 03 January 2017 at 15:12

Slightly modified from @mxbi.

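import numpy as np

# Keep only the numeric cash-flow columns.
df = df_baseScenario.drop(['Transaction Number', 'Description'], axis=1)
# Compute the 99th percentile across each row (axis=1).
df_baseScenario['99th_percentile'] = df.apply(lambda x: np.percentile(x, 99), axis=1)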

**mxbi** · Accepted Answer · 2017-01-03T15:05:23+00:00

The following should work:

df['99th_percentile'] = df[cols].apply(lambda x: numpy.percentile(x, 99), axis=1)

I'm assuming here that the variable 'cols' contains a list of the columns you want to include in the percentile (you obviously can't use the Description in your calculation, for example).
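
One way to build such a list for the question's data is sketched below. It assumes every column to include shares the 'Calc_PV_CF_' prefix (an assumption drawn from the sample frame) and uses a shortened copy of that frame for illustration:

```python
import numpy as np
import pandas as pd

# Shortened copy of the question's frame (three value columns only).
df = pd.DataFrame({
    'Transaction Number': [1, 10],
    'Description': ['asf', 'def'],
    'Calc_PV_CF_2479.0': [4418494.085, -3706270.679],
    'Calc_PV_CF_2480.0': [4415476.321, -3688327.494],
    'Calc_PV_CF_2481.0': [4421698.198, -3712887.034],
})

# Assumption: every column to include starts with this prefix.
cols = [c for c in df.columns if c.startswith('Calc_PV_CF_')]

# Row-wise 99th percentile over the selected columns only.
df['99th_percentile'] = df[cols].apply(lambda x: np.percentile(x, 99), axis=1)
```

Selecting by prefix keeps the identifier columns out of the calculation without having to list every value column by hand.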

This code loops over the rows of the dataframe and, for each row, calls numpy.percentile to get the 99th percentile. You'll need to import numpy.
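
To make the per-row computation concrete, here is a rough sketch of what numpy.percentile does with its default linear interpolation: sort the values, find the fractional position (n - 1) * q / 100, and interpolate between the two neighbouring values. The helper name percentile_linear is hypothetical and only for illustration; the four values are taken from the first row of the question's data.

```python
import numpy as np

def percentile_linear(values, q):
    """Percentile via linear interpolation between the two nearest sorted values."""
    ordered = np.sort(np.asarray(values, dtype=float))
    pos = (len(ordered) - 1) * q / 100.0   # fractional index into the sorted data
    lo = int(np.floor(pos))                # index of the value just below pos
    hi = min(lo + 1, len(ordered) - 1)     # index of the value just above pos
    frac = pos - lo
    return ordered[lo] + frac * (ordered[hi] - ordered[lo])

row = [4418494.085, 4415476.321, 4421698.198, 4420541.944]
manual = percentile_linear(row, 99)
library = np.percentile(row, 99)
```

With only a handful of values per row, the 99th percentile therefore lands between the two largest values, very close to the row maximum.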

If you need maximum speed, you can skip the per-row apply and let numpy.percentile work on the whole array at once by passing axis=1:

df['99th_percentile'] = numpy.percentile(df[cols].values, 99, axis=1)
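
For a loop-free variant, a minimal sketch follows. It relies on the axis argument that both numpy.percentile and DataFrame.quantile accept, and uses small hypothetical columns 'a', 'b', 'c' in place of the real cash-flow columns:

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame standing in for the question's data.
df = pd.DataFrame({
    'Description': ['asf', 'def'],
    'a': [1.0, 10.0],
    'b': [2.0, 20.0],
    'c': [3.0, 30.0],
})
cols = ['a', 'b', 'c']  # numeric columns to include

# NumPy: one call over the 2-D array, reducing along each row.
via_numpy = np.percentile(df[cols].to_numpy(), 99, axis=1)

# pandas: same result as a Series aligned with the frame's index.
via_pandas = df[cols].quantile(0.99, axis=1)

df['99th_percentile'] = via_numpy
```

Note that numpy.percentile takes the percentile on a 0 to 100 scale, while DataFrame.quantile takes a fraction between 0 and 1.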
