SciPy Kolmogorov-Smirnov test yields small p-values even with random data generated from the given distribution

Question

Asked by Imp on 15 September 2022 at 21:51 (317 views)

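import numpy as np
from scipy.stats import ks_1samp, multivariate_normal

# Draw 1000 samples from a 2-D standard normal, then test them against its CDF.
data = np.random.multivariate_normal(mean=[0, 0], cov=[[1, 0], [0, 1]], size=1000)
cdfx = multivariate_normal(mean=[0, 0], cov=[[1, 0], [0, 1]]).cdf
ks_1samp(x=data, cdf=cdfx)

KstestResult(statistic=0.9930935227267083, pvalue=0.0)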

Shouldn't the p-value be high, given that the data were drawn from the same distribution they are tested against?

There is 1 answer

**Warren Weckesser** · Answer 1 · 2022-09-15T22:09:25+00:00

The Kolmogorov-Smirnov test is for univariate distributions. See the section "The Kolmogorov–Smirnov statistic in more than one dimension" of the Wikipedia article on the Kolmogorov–Smirnov test for a discussion of a multivariate generalization.

ks_1samp expects the input x to be one-dimensional, and it expects the cdf function to be the CDF of a univariate distribution. It does not validate these properties, so the behavior is undefined (and, clearly, nonsense) if the expectations are not met.
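Because ks_1samp does not check these expectations itself, one option is to check the shape of the sample before calling it. The following is only a minimal sketch: the wrapper name ks_1samp_checked is hypothetical (it is not part of SciPy), and it checks only the dimensionality of x, not whether the supplied cdf is really univariate.

```python
import numpy as np
from scipy.stats import ks_1samp, norm


def ks_1samp_checked(x, cdf):
    """Hypothetical helper: refuse anything that is not a 1-D sample."""
    x = np.asarray(x)
    if x.ndim != 1:
        raise ValueError(f"ks_1samp expects 1-D data, got shape {x.shape}")
    return ks_1samp(x, cdf)


rng = np.random.default_rng(0)
one_dim = rng.standard_normal(1000)        # shape (1000,)
two_dim = rng.standard_normal((1000, 2))   # shape (1000, 2), like the question's data

# A 1-D sample is passed straight through to ks_1samp.
result = ks_1samp_checked(one_dim, norm.cdf)

# A 2-D sample is rejected instead of silently producing a meaningless result.
try:
    ks_1samp_checked(two_dim, norm.cdf)
    rejected = False
except ValueError:
    rejected = True
```

With a guard like this, the call from the question would fail loudly rather than return a statistic close to 1 and a p-value of 0.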

With the univariate normal distribution, it works as you expect:

In [20]: from scipy.stats import ks_1samp, norm

In [21]: x = norm.rvs(size=1000)

In [22]: ks_1samp(x, norm.cdf)
Out[22]: KstestResult(statistic=0.025983100250768443, pvalue=0.5011047711453744)
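For the two-dimensional data in the question, one way to stay within the univariate test is to check each coordinate separately. With an identity covariance matrix, each column is marginally standard normal, so each can be compared against norm.cdf. This is a sketch under that assumption, not a multivariate goodness-of-fit test: it looks only at the marginals and says nothing about the dependence between coordinates. The seed and variable names are illustrative.

```python
import numpy as np
from scipy.stats import ks_1samp, norm

rng = np.random.default_rng(0)  # fixed seed, chosen only for reproducibility
data = rng.multivariate_normal(mean=[0, 0], cov=[[1, 0], [0, 1]], size=1000)

# Run one univariate KS test per column (i.e. per marginal distribution).
marginal_results = [ks_1samp(data[:, j], norm.cdf) for j in range(data.shape[1])]

for j, res in enumerate(marginal_results):
    print(f"column {j}: statistic={res.statistic:.4f}, pvalue={res.pvalue:.4f}")
```

Since this runs two tests, the p-values should be read with multiple comparisons in mind if they are used for a formal decision.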
