What is the difference between statistics.stdev() & numpy.std() and which is more precise?

Question

Asked by whocares on 27 May 2022 at 12:07

I used this dataset:

lst = [81922.00557103065, 82887.70053475935, 80413.01627033792,
       81708.86075949368, 82997.38219895288, 84641.50943396226,
       81929.82456140351, 82632.24181360201, 77667.98418972333,
       73726.47427854454, 86113.2075471698, 83232.98429319372,
       79866.66666666667, 83833.74689826302, 81943.06930693069,
       77898.64029666255, 77401.47783251232, 80607.59493670886,
       78384.5126835781, 82608.69565217392, 80824.8730964467,
       84163.70106761566, 74887.38738738738
       ]

Then statistics.stdev(lst) is 3096.28 and numpy.std(lst) is 3028.23. The difference is about 2.2%.

Original Q&A

There is 1 answer

**Matt Hall** · Accepted Answer · 2022-05-27T12:36:12+00:00

Neither is more precise: they are calculating two slightly different things.

The standard deviation is the square root of the variance. By default, NumPy computes the population variance, dividing by N (ddof=0), whereas statistics.stdev() computes the sample variance with Bessel's correction, dividing by N - 1 instead of N:

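import numpy as np
arr = np.array(lst)
# Population variance: divide by N (the default for np.var and np.std)
var_ordinary = np.sum(np.abs(arr - arr.mean())**2) / arr.size
# Sample variance with Bessel's correction: divide by N - 1 (as statistics.variance does)
var_bessel = np.sum(np.abs(arr - arr.mean())**2) / (arr.size - 1)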
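
The two conventions can be reconciled directly. A minimal sketch, assuming NumPy is installed and reusing the dataset from the question: passing ddof=1 makes numpy.std divide by N - 1, and statistics.pstdev is the standard-library function that divides by N. The variable names are illustrative only.

```python
import math
import statistics

import numpy as np

# Dataset from the question (N = 23 values).
lst = [81922.00557103065, 82887.70053475935, 80413.01627033792,
       81708.86075949368, 82997.38219895288, 84641.50943396226,
       81929.82456140351, 82632.24181360201, 77667.98418972333,
       73726.47427854454, 86113.2075471698, 83232.98429319372,
       79866.66666666667, 83833.74689826302, 81943.06930693069,
       77898.64029666255, 77401.47783251232, 80607.59493670886,
       78384.5126835781, 82608.69565217392, 80824.8730964467,
       84163.70106761566, 74887.38738738738]

arr = np.array(lst)

# NumPy default: ddof=0, i.e. divide by N (population standard deviation).
std_population = np.std(arr)
# ddof=1 switches the divisor to N - 1 (Bessel's correction).
std_sample = np.std(arr, ddof=1)

# The standard library exposes both conventions under separate names.
print(math.isclose(std_population, statistics.pstdev(lst)))
print(math.isclose(std_sample, statistics.stdev(lst)))

# The two results differ only by the factor sqrt(N / (N - 1)).
n = arr.size
print(math.isclose(std_sample / std_population, math.sqrt(n / (n - 1))))
```

With N = 23 the factor sqrt(23 / 22) is roughly 1.0225, which accounts for the gap of about 2.2% reported in the question.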

From the statistics docs:

This is the sample variance s² with Bessel’s correction, also known as variance with N-1 degrees of freedom. Provided that the data points are representative (e.g. independent and identically distributed), the result should be an unbiased estimate of the true population variance.
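
The unbiasedness claim in the quoted documentation can be checked empirically. The following is a rough simulation sketch, not part of the original answer: the distribution (normal with standard deviation 2), the sample size of 5, the trial count, and the seed are all arbitrary choices made for illustration. It uses only the standard library.

```python
import random
import statistics

random.seed(0)  # arbitrary seed, fixed so the run is repeatable

SIGMA = 2.0                 # assumed true standard deviation
TRUE_VARIANCE = SIGMA ** 2  # 4.0
SAMPLE_SIZE = 5             # small N makes the bias easy to see
TRIALS = 10_000

divide_by_n = []          # population formula applied to each sample
divide_by_n_minus_1 = []  # Bessel-corrected formula applied to each sample

for _ in range(TRIALS):
    sample = [random.gauss(0.0, SIGMA) for _ in range(SAMPLE_SIZE)]
    divide_by_n.append(statistics.pvariance(sample))
    divide_by_n_minus_1.append(statistics.variance(sample))

# Average each estimator over all trials.
mean_biased = statistics.fmean(divide_by_n)
mean_unbiased = statistics.fmean(divide_by_n_minus_1)

print(f"true variance:        {TRUE_VARIANCE}")
print(f"mean, divide by N:    {mean_biased:.3f}")
print(f"mean, divide by N-1:  {mean_unbiased:.3f}")
```

Averaged over many samples, the N - 1 version should land near the true variance of 4.0, while the N version should land near 4.0 × (5 - 1) / 5 = 3.2. The gap shrinks as the sample size grows, which is why the two functions nearly agree on large datasets.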
