I have a dataset, part of which is shown below:
sales_date net_sales_my_firm net_sales_others pro_unit_my_firm pro_unit_others
1.02.2021 101089 710337 9869 67885
1.03.2021 104747 598684 9084 79405
1.04.2021 92027 623285 8025 122489
1.05.2021 85796 463898 7541 63562
1.06.2021 112804 621633 10553 83586
1.07.2021 89326 484894 7832 61799
1.08.2021 85406 524195 7551 75599
1.09.2021 131388 686136 12144 87755
net_sales_my_firm: net sales of my company
net_sales_others: net sales of competitors
pro_unit_my_firm: promotion sales of my company
pro_unit_others: promotion sales of competitors
What I want to do is find the effect of promotional sales on net sales. For this, I used the multiple regression code below (in Python):
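import statsmodels.api as sm

# df is the DataFrame holding the dataset above
Y = df.net_sales_my_firm
X = df[['pro_unit_my_firm','pro_unit_others']]
X = sm.add_constant(X)  # add the intercept column
model = sm.OLS(Y, X)
results = model.fit()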
The fitted parameters are:
print(results.params)
const -14896.842089
pro_unit_my_firm 4.163607
pro_unit_others 0.806564
I interpreted this result as follows: if you promote 1 unit, you increase your sales by 5 units. But what does a negative constant value mean? Is it normal? Did I set up the model incorrectly?
I'm also sharing the scatterplot of the yield to help:
I would have interpreted it as follows:
"If you promote by 0 units, and competitors promote by 0 units you will have negative net sales".
Maybe this isn't completely unrealistic. If you don't promote your product at all, maybe you won't sell much, and so your net sales might be negative (depending on exactly what net sales means).
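To make that reading concrete, here is a minimal sketch that evaluates the fitted equation by hand. The coefficients are copied from the results.params output in the question; the helper name predict_net_sales is hypothetical and only for illustration.

```python
# Coefficients copied from the results.params output in the question.
params = {
    "const": -14896.842089,
    "pro_unit_my_firm": 4.163607,
    "pro_unit_others": 0.806564,
}


def predict_net_sales(pro_unit_my_firm, pro_unit_others):
    """Evaluate the fitted linear equation by hand (hypothetical helper)."""
    return (
        params["const"]
        + params["pro_unit_my_firm"] * pro_unit_my_firm
        + params["pro_unit_others"] * pro_unit_others
    )


# With both promotion variables at zero, only the constant is left.
at_zero = predict_net_sales(0, 0)

# Prediction for the first row of the excerpt (1.02.2021).
first_row = predict_net_sales(9869, 67885)

print(at_zero)
print(first_row)
```

So the constant is simply the model's prediction when both promotion variables are zero, which is why it can come out negative even though no observed net sales value is.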
This isn't what is implied by your plot, where you have several points at 0 pro_unit_my_firm, but of course it's not clear what the values of pro_unit_others at those points are. To reassure myself that the fit 'makes sense', I'd want to inspect the values of pro_unit_others for the points where pro_unit_my_firm is zero.
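One way to do that check might look like the sketch below. It assumes the data sits in a pandas DataFrame with the column names from the question; the function name competitor_promo_when_own_is_zero is hypothetical. Only the eight rows shown in the question are used here, and none of them has pro_unit_my_firm equal to zero, so the result is empty for this excerpt. On the full dataset it should return the points visible at zero in the plot.

```python
import pandas as pd

# The eight rows shown in the question; the full dataset is assumed to be larger.
df = pd.DataFrame({
    "net_sales_my_firm": [101089, 104747, 92027, 85796, 112804, 89326, 85406, 131388],
    "pro_unit_my_firm": [9869, 9084, 8025, 7541, 10553, 7832, 7551, 12144],
    "pro_unit_others": [67885, 79405, 122489, 63562, 83586, 61799, 75599, 87755],
})


def competitor_promo_when_own_is_zero(frame):
    """Return competitor promotion and own net sales for rows with no own promotion."""
    zero_rows = frame[frame["pro_unit_my_firm"] == 0]
    return zero_rows[["pro_unit_others", "net_sales_my_firm"]]


zero_rows = competitor_promo_when_own_is_zero(df)
print(len(zero_rows))
print(zero_rows)
```

If pro_unit_others is large at those points, the model can still predict positive net sales there despite the negative constant.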