I am new to Python and need it mostly for Stats. Here is a glimpse of the dataset:
Code City Date Sales
K1 W 1/1/2017 46506.92
K1 X 1/1/2017 187195.2
K1 Y 1/1/2017 12858.15
K1 Z 1/1/2017 25300.88
K2 W 1/1/2017 87731.47
K2 X 1/1/2017 14952.8
K3 Y 1/1/2017 167.8204
K4 A 1/1/2017 9602.108
K4 B 1/1/2017 16034.13
K4 C 1/1/2017 106.5196
K4 D 1/1/2017 1057.269
K5 W 1/1/2017 12346.57
K5 X 1/1/2017 528776.5
K5 Y 1/1/2017 7598.979
K5 Z 1/1/2017 147969.6
K6 W 1/1/2017 11770.68
K6 X 1/1/2017 180867.6
K6 Y 1/1/2017 11778.6
K6 Z 1/1/2017 48835.3
City is a column of strings. The same Code may appear in multiple cities, but each Code-City combination is unique and has 32 data points.
Data is available for a period of 32 months and is collected on the 1st of each month. I need to create an array of RMSE values from individual forecasts. Each forecast is at the Code-City level.
I wrote a function for ARIMA (can't use Prophet as a contingency).
I tried to filter the DataFrame hierarchically by Code and then City for that Code by using:
df.loc[lambda x: x['Code'] in Codelist].loc[lambda x: x['City'] in Citylist]
But I am getting this error:
The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
What I want is, for example: if a Code exists in the list of codes, move on to a second for loop and check whether the City is present for that Code; if yes, call the ARIMA function I defined. The reason is that the same Code exists in multiple cities.
I want to store the results, which are the RMSE values of forecasts vs. actuals, in an array and keep appending to it after every iteration. I am expecting an array of 5 float values of forecasted output from the ARIMA forecast.
To do individual forecasts, you can take only the Code-City combinations that are present in the data beforehand, rather than trying all combinations.
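A minimal sketch of that idea follows. It makes several assumptions: the small DataFrame is made-up data with the same columns as in the question, `code_list` and `city_list` are hypothetical names for the two lists, and `forecast_rmse` is only a placeholder (a naive last-value forecast) standing in for the ARIMA function, which is not shown in the question. The error itself comes from `x['Code'] in Codelist`, which asks for a single True/False for a whole Series; `Series.isin` does the membership test element by element instead.

```python
import numpy as np
import pandas as pd

# Made-up data with the same columns as the question (values are illustrative only).
dates = pd.date_range("2017-01-01", periods=6, freq="MS")
rows = []
for code, city, base in [("K1", "W", 100.0), ("K1", "X", 200.0), ("K2", "W", 50.0)]:
    for i, date in enumerate(dates):
        rows.append({"Code": code, "City": city, "Date": date, "Sales": base + 10.0 * i})
df = pd.DataFrame(rows)

# Hypothetical lists of codes and cities to keep.
code_list = ["K1", "K2"]
city_list = ["W", "X"]

# Element-wise membership test; this replaces `x['Code'] in Codelist`.
mask = df["Code"].isin(code_list) & df["City"].isin(city_list)
subset = df[mask]


def forecast_rmse(series, n_test=2):
    """Placeholder for the ARIMA function: hold out the last n_test points,
    forecast them with the last training value, and return the RMSE."""
    train, test = series.iloc[:-n_test], series.iloc[-n_test:]
    forecast = np.repeat(train.iloc[-1], n_test)
    return float(np.sqrt(np.mean((forecast - test.to_numpy()) ** 2)))


# groupby only yields the (Code, City) pairs that actually occur in the data,
# so there is no need for nested loops over every possible combination.
rmse_by_group = {}
for (code, city), group in subset.groupby(["Code", "City"]):
    sales = group.sort_values("Date")["Sales"]
    rmse_by_group[(code, city)] = forecast_rmse(sales)

# One RMSE per Code-City combination, collected into an array.
rmse_values = np.array(list(rmse_by_group.values()))
```

Keeping the results in a dict keyed by `(Code, City)` makes it easy to see which RMSE belongs to which series; swapping the body of `forecast_rmse` for the real ARIMA fit should leave the loop unchanged.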