I guess this is a simple question, but I can't sort it out. I have a vector, the first elements of which look like:
V = [31 52 38 29 29 34 29 24 25 25 32 28 24 28 29 ...];
and I want to perform a chi2gof test in Matlab to test if V is exponentially distributed. I did:
[h,p] = chi2gof(V,'cdf',@expcdf);
but I get a warning message saying:
Warning: After pooling, some bins still have low expected counts.
The chi-square approximation may not be accurate
Have I defined the chi2gof call incorrectly?
At 36 values, you have a very small sample set. The second sentence of Wikipedia's article on the chi-squared test concerns the sample size being large. Large in this case usually means at least around 100. The test also rests on further assumptions that are worth reading about.
Alternatives
You might try kstest in Matlab, which is based on the Kolmogorov-Smirnov test. Or try lillietest, which is based on the Lilliefors test and has an option specifically for exponentially distributed data.
In case you can increase your sample size, you are doing one thing wrong with chi2gof: check the help for the 'cdf' option. You're not supplying any additional parameters, so expcdf is using the default mean parameter of mu = 1. Your data values are very large and don't correspond at all to an exponential distribution with this mean. You need to estimate the parameters as well. You can use the expfit function, which is based on maximum likelihood estimation, to estimate the mean and pass it on to expcdf.
However, with only 36 samples you may not get a very good estimate for a distribution like this, and you still may not get the expected results even for data sampled from a known distribution.
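The suggestions in this answer could be sketched roughly as follows, assuming the Statistics Toolbox is installed. The simulated vector V, the seed, the true mean of 30, and the variable names (muHat, hChi, hKs, hLil and so on) are illustrative assumptions and not taken from the question; substitute your own data for V. The 'Distribution' option name for lillietest follows the current documentation and may differ in older releases.

```matlab
% Stand-in data: 36 draws from an exponential distribution with a known
% mean of 30 (an assumption for illustration; replace V with your data).
rng(0);
V = exprnd(30, 36, 1);

% Maximum likelihood estimate of the exponential mean (the sample mean).
muHat = expfit(V);

% chi2gof against an exponential CDF that uses the estimated mean instead
% of the default mu = 1. 'NParams' states that one parameter was estimated
% from the data, which lowers the degrees of freedom of the test.
[hChi, pChi] = chi2gof(V, 'CDF', @(x) expcdf(x, muHat), 'NParams', 1);

% Kolmogorov-Smirnov test: the hypothesized CDF is given as a two-column
% matrix [x, F(x)]. kstest assumes a fully specified CDF, so plugging in an
% estimated mean makes the result approximate.
[hKs, pKs] = kstest(V, 'CDF', [V, expcdf(V, muHat)]);

% Lilliefors test: designed for the case where the parameters are
% estimated from the same sample.
[hLil, pLil] = lillietest(V, 'Distribution', 'exponential');
```

In each case h = 0 means the test did not reject the exponential hypothesis at the default 5% level. With so few samples, chi2gof may still emit the low expected counts warning, which is the sample-size limitation discussed above and not a sign that the call is wrong.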