Is there a way to plot densities using data that has observation weights?
I have a vector of observations x and a vector of integer weights y, such that y1 indicates how many observations we have of x1. That is, the density of
x y
1 2
2 2
2 3
is equal to the density of 1, 1, 2, 2, 2, 2, 2 (two 1s and five 2s).
As far as I understand it, matplotlib.pyplot.hist(weights=y) allows for observation weights when plotting the histogram. Is there any equivalent for computing and plotting the density?
The reason I want the package to handle the weights itself is that my data is very large, and I'm looking for something more efficient than expanding it into repeated observations.
Alternatively, I'm open to other packages.
Statsmodels' KDEUnivariate accepts weights in its fit method. See the output of the following code.
Output:
Note: This will probably not resolve your time concern regarding array creation, because, as noted in the source code: