How to get the standard deviation from the fitted distribution in scipy?


I'm trying to fit multiple distributions and get the standard deviation of each. However, many distributions return either inf or NaN for the standard deviation. Is my way of getting the variance of the fitted distribution correct? Is there a better way? Why the NaNs? This is what I have done:

param = distribution.fit(data)
arg = param[:-2]
loc = param[-2]
scale = param[-1]

if len(arg)>0:
     std = np.sqrt(distribution.stats(arg, loc, scale, moments='v')[0])
else:
     std = np.sqrt(distribution.stats(loc, scale, moments='v')[0])

I also skip distributions that generate a warning while fitting the data.

Update 1: For instance, when distribution = scipy.stats.beta, I get [ nan nan] and the parameters are as follows:

arg: (32.198726690922953, 15883184.284202889)
loc: -33527.5754686
scale: 35484135514.4 

There are 2 answers

0
user2179347 (BEST ANSWER)

I asked a different question on Stack Overflow and got a solution that answers this one too. It turned out that scipy interpreted the parameters I passed differently from what I intended. Here is the link to the answers:

isinfmu-error-in-scipy-stats-when-calling-std-for-exponweib
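For readers who cannot follow the link, here is a minimal sketch of what "interpreted differently" most likely means for the code in the question. Passing the tuple arg as a single positional argument makes scipy treat the whole tuple as the first shape parameter, so loc and scale each shift one slot to the left. For beta, the negative loc then lands in the b slot, which would explain the NaNs. The helper name std_from_fit and the sample values are illustrative assumptions, not part of the linked answer.

```python
import numpy as np
from scipy import stats

def std_from_fit(distribution, params):
    """Standard deviation of a scipy distribution given the tuple returned by fit()."""
    # fit() returns (shape_1, ..., shape_k, loc, scale)
    shapes, loc, scale = params[:-2], params[-2], params[-1]
    # Unpack the shapes; pass loc and scale by keyword so nothing shifts position.
    return distribution.std(*shapes, loc=loc, scale=scale)

rng = np.random.default_rng(0)
data = rng.normal(loc=5.0, scale=2.0, size=2000)

# No shape parameters: params is just (loc, scale).
norm_std = std_from_fit(stats.norm, stats.norm.fit(data))

# One shape parameter, using hand-picked illustrative values instead of a fit.
gamma_std = std_from_fit(stats.gamma, (3.0, 1.0, 2.0))
```

Because the shapes are unpacked with *shapes, the same call works whether the distribution has zero, one, or several shape parameters, so the if/else in the question is no longer needed. Note that some distributions (Cauchy, for example) have no finite variance at all, so an inf or NaN result is not always a sign of a bug.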

1
ilanman

The variance of a beta distribution is:

a * b / [ (a + b)^2 * (a + b + 1) ]

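So the standard deviation is the square root of that. This formula is for the standard beta on [0, 1]; for a fitted distribution with a scale parameter, multiply the result by scale. To get a and b:

a, b, loc, scale = scipy.stats.beta.fit(data)  # fit once; returns (a, b, loc, scale)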

Note that you can always calculate the standard deviation of your data (absent any fitted distribution) using np.std(data).
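As a sanity check, the closed-form expression above can be compared with scipy's own result. This is only a sketch: the helper name beta_std and the parameter values are made up for illustration, and it assumes the usual location-scale convention, under which loc does not affect the standard deviation and scale multiplies it.

```python
import numpy as np
from scipy import stats

def beta_std(a, b, scale=1.0):
    """Closed-form standard deviation of a beta distribution stretched by `scale`."""
    var_standard = a * b / ((a + b) ** 2 * (a + b + 1))  # variance on [0, 1]
    return scale * np.sqrt(var_standard)

# Illustrative parameters (not fitted to any data).
a, b, loc, scale = 2.0, 5.0, 10.0, 3.0

closed_form = beta_std(a, b, scale)
from_scipy = stats.beta.std(a, b, loc=loc, scale=scale)
```

If the two values agree for your fitted parameters, the parameters are being passed to scipy in the order it expects.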