I used this dataset:
```python
lst = [81922.00557103065, 82887.70053475935, 80413.01627033792,
       81708.86075949368, 82997.38219895288, 84641.50943396226,
       81929.82456140351, 82632.24181360201, 77667.98418972333,
       73726.47427854454, 86113.2075471698, 83232.98429319372,
       79866.66666666667, 83833.74689826302, 81943.06930693069,
       77898.64029666255, 77401.47783251232, 80607.59493670886,
       78384.5126835781, 82608.69565217392, 80824.8730964467,
       84163.70106761566, 74887.38738738738]
```
Then `statistics.stdev(lst)` is 3096.28 and `numpy.std(lst)` is 3028.23. The difference is about 2.2%.
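A quick way to reproduce the comparison, reusing the `lst` defined above:

```python
import statistics

import numpy as np

print(statistics.stdev(lst))  # 3096.28...
print(np.std(lst))            # 3028.23...

# Ratio between the two estimates; it depends only on the sample size
print(statistics.stdev(lst) / np.std(lst))  # ~1.022, i.e. about 2.2% apart
```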
They are calculating two slightly different things. The standard deviation is the square root of the variance, and the two libraries use different divisors when computing that variance: `numpy.std` divides by N by default (the uncorrected, population form), whereas `statistics.stdev` applies Bessel's correction, using N - 1 instead of N.
From the `statistics` docs: