Hi, I have a set of observations:
obs = https://drive.google.com/file/d/0B3vXKJ_zYaCJVlhqd3FJT0xtWFk/view?usp=sharing
I would like to prove that they come from a Gamma distribution.
To do that I:
% estimate the parameters of the gamma distribution (shape a, scale b)
paramEsts_gamma = gamfit(obs);
% build a gamma distribution object from the estimated parameters
gamma_cdf = makedist('Gamma','a',paramEsts_gamma(1),'b',paramEsts_gamma(2));
% test with kstest whether the data come from a gamma distribution
[h_gamma_ks,p_gamma_ks,kstat_gamma_ks,cv_gamma_ks] = kstest(obs,'CDF',gamma_cdf)
% test with chi2gof whether the data come from a gamma distribution
pd_gamma = fitdist(obs,'Gamma');
[h_gamma_chi,p_gamma_chi,st_gamma_chi] = chi2gof(obs,'CDF',pd_gamma)
My problem is that I get NaN for the p-value p_gamma_chi.
....
Where do I make a mistake?
Thanks
Here is some code to check the distributions visually:
%% Plot cdf
% empirical cdf
[f_emp,x_values] = ecdf(obs);
f_gamma = gamcdf(x_values,paramEsts_gamma(1),paramEsts_gamma(2));
figure
hold on;
F = plot(x_values,f_emp);
set(F,'LineWidth',2);
G = plot(x_values,f_gamma,'r-');
set(G,'LineWidth',2);
legend([F G],...
'Empirical CDF','Gamma CDF',...
'Location','SE');
As the output of your code shows, st_gamma_chi.df = 0, which means 0 degrees of freedom (dof). The dof are computed as
dof = N - n - 1
where:
N is the number of frequencies (bins), in your case N = length(st_gamma_chi.edges)-1 = 3;
n is the number of fitted parameters, in your case n = 2.
Thus you get 3 - 2 - 1 = 0 dof with the default options. You can ameliorate this issue, for example, by increasing the number of bins where the frequencies are calculated:
[h_gamma_chi,p_gamma_chi,st_gamma_chi] = chi2gof(obs,'CDF',pd_gamma, 'NBins', 20)
But this does not exempt you from understanding the chi-squared test.
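As a quick check, here is a minimal sketch that assumes obs and pd_gamma from the question are still in the workspace. It inspects the structure returned by chi2gof to see how many bins were actually used and whether the degrees of freedom are now positive. Note that by default chi2gof pools neighbouring bins whose expected count is below 5 (the 'EMin' option), so the number of bins used may be smaller than the 20 requested. The names nBinsUsed, nParams and dofCheck are only illustrative:

```matlab
% Sketch: assumes obs and pd_gamma from the question exist in the workspace.
[h_gamma_chi,p_gamma_chi,st_gamma_chi] = chi2gof(obs,'CDF',pd_gamma,'NBins',20);

nBinsUsed = numel(st_gamma_chi.O);    % bins remaining after pooling
nParams   = 2;                        % shape and scale estimated by fitdist
dofCheck  = nBinsUsed - nParams - 1;  % should match st_gamma_chi.df

fprintf('bins used = %d, df = %d (check: %d), p = %g\n', ...
    nBinsUsed, st_gamma_chi.df, dofCheck, p_gamma_chi);
```

If df is still 0 after pooling, the p-value will remain NaN, and the binning (for example 'NBins' or 'EMin') would need further adjustment.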