I am trying to run a query with multiple filters on a data frame.
It works like a charm on my small sample (below), but it takes a lot of time as the data grows.
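import pandas as pd

df = pd.DataFrame({'ID': ['FACL01', 'FACL02', 'FACL03', 'FACL01', 'FACL04', 'FACL06', 'FACL07',
                          'FACL08', 'FACL09', 'FACL01', 'FACL11', 'FACL12'],
                   'AMOUNT': [10, 20, 30, 40, 50, 60, 70, 80, 20, 10, 30, 10],
                   'DATE': [20201503, 20201503, 20201503, 20201502, 20201503, 20201502,
                            20201501, 20201503, 20201503, 20201501, 20201503, 20201502]})
# M1 is not defined in the post; assumed here to list the dates in the 3-month period
M1 = [20201501, 20201502, 20201503]
df['AVG_AMOUNT'] = 0.0

# Jupyter cell magic: must be the first line of its own cell
%%time
for idx, x in df['ID'].items():  # Series.iteritems() was removed in pandas 2.0
    df.loc[idx, 'AVG_AMOUNT'] = df[df['DATE'].isin(M1) & (df['ID'] == x)]['AMOUNT'].mean()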
I am trying to get the average of all AMOUNT values within a 3-month period (M1) for a particular ID, to fill in AVG_AMOUNT.
I modified your data a bit, because you provided nearly as many IDs as rows, which would make rolling means futile. I reduced it to 2 IDs, but the rest is the same:
Output:
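A minimal sketch of the rolling-mean idea, not necessarily the exact code meant above. It assumes the dates are encoded as YYYYDDMM and uses a trailing 90-day window as a stand-in for the 3-month period; the two-ID data is made up for illustration.

```python
import pandas as pd

# Made-up data with only two IDs, so each ID has several rows to average over.
df = pd.DataFrame({
    'ID': ['FACL01', 'FACL02'] * 6,
    'AMOUNT': [10, 20, 30, 40, 50, 60, 70, 80, 20, 10, 30, 10],
    'DATE': [20201503, 20201503, 20201503, 20201502, 20201503, 20201502,
             20201501, 20201503, 20201503, 20201501, 20201503, 20201502],
})

# Assumption: DATE is encoded as YYYYDDMM (e.g. 20201503 -> 15 March 2020).
df['DATE'] = pd.to_datetime(df['DATE'].astype(str), format='%Y%d%m')

# Time-based rolling windows need the dates in ascending order.
df = df.sort_values('DATE')

# Mean of AMOUNT over the trailing 90 days, computed separately for each ID.
# The result is indexed by (ID, DATE).
rolling_avg = (
    df.set_index('DATE')
      .groupby('ID')['AMOUNT']
      .rolling('90D')
      .mean()
)
print(rolling_avg)
```

Because the window is defined in days rather than rows, it does not matter how many rows each ID has per date.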
If you want to add this to your dataframe you could do something like:
Output 2:
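One way to write a per-ID average back into the dataframe without a Python loop, sketched on the data from the question. Note that this swaps the rolling window for a plain per-ID mean over the period, which is what the loop in the question computes. The values in `M1` are an assumption, since the question does not show them.

```python
import pandas as pd

df = pd.DataFrame({
    'ID': ['FACL01', 'FACL02', 'FACL03', 'FACL01', 'FACL04', 'FACL06',
           'FACL07', 'FACL08', 'FACL09', 'FACL01', 'FACL11', 'FACL12'],
    'AMOUNT': [10, 20, 30, 40, 50, 60, 70, 80, 20, 10, 30, 10],
    'DATE': [20201503, 20201503, 20201503, 20201502, 20201503, 20201502,
             20201501, 20201503, 20201503, 20201501, 20201503, 20201502],
})

# Assumption: M1 holds the DATE values that make up the 3-month period.
M1 = [20201501, 20201502, 20201503]

# Blank out amounts that fall outside the period (they become NaN) ...
in_period = df['AMOUNT'].where(df['DATE'].isin(M1))

# ... then broadcast each ID's mean back onto every row of that ID.
# NaN values are skipped by the mean, so only in-period amounts count.
df['AVG_AMOUNT'] = in_period.groupby(df['ID']).transform('mean')
print(df)
```

`transform('mean')` returns a result aligned with the original index, so it can be assigned directly as a new column; the filtering and grouping each run once over the whole frame instead of once per row.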