I am working on replicating the scoring rule found in a paper Forecasting the intermittent demand for slow-moving inventories: A modelling approach
The paper describes the scoring rule as follows:
This is my attempt
y <- rpois(n = 100, lambda = 10) # forecasted distribution
x <- 10 # actual value
drps_score <- function(x = value, y = q){
# x = actual value (single observation); y = quantile forecasted value (vector)
Fy = ecdf(y) # cdf function
indicator <- ifelse(y - x > 0, 1, 0) # Heaviside
score <- sum((indicator - Fy(y))^2)
return(score)
}
> drps_score(x = x, y = y)
[1] 53.028
This seems to work well until I provide a vector of 0s as follows:
y <- rep(x = 0, 100)
> drps_score(x = x, y = y)
[1] 0
I know that one of their methods used in this paper was a 0s forecast and their results did not show 0 for DRPS. This makes me think that the calculation is off.
I think there are a few issues at play here.
First off, I don't think you are computing the correct sum inside the scoring function. The score asks you to sum across all possible values of y (i.e. across all positive integers) not across all forecasted samples of y.
Second, I don't think the above definition gives the desired result, with \hat F (y) defined to be 0 when y=x then you don't get a zero score for a forecast with a point mass at the true value. (Yes, I'm saying that source is "wrong", or at least has a definition that doesn't give the desired result.) Here is a re-formulated function that I think fixes both issues:
The above shows that the distribution that is centered on the true value (lambda=10) has the lowest score for distributions that aren't a point mass.