p = NaN for goodness-of-fit test in MATLAB

Hi, I have a set of observations, obs: https://drive.google.com/file/d/0B3vXKJ_zYaCJVlhqd3FJT0xtWFk/view?usp=sharing

I would like to check whether they come from a Gamma distribution.

To do that I:

% estimate the parameters of the gamma distribution
paramEsts_gamma = gamfit(obs);
% build a gamma distribution object from the estimates
gamma_cdf = makedist('Gamma','a',paramEsts_gamma(1),'b',paramEsts_gamma(2));

% test with kstest whether the data come from a gamma distribution
[h_gamma_ks,p_gamma_ks,kstat_gamma_ks,cv_gamma_ks] = kstest(obs,'CDF',gamma_cdf)

% test with chi2gof whether the data come from a gamma distribution
pd_gamma = fitdist(obs,'Gamma');
[h_gamma_chi,p_gamma_chi,st_gamma_chi] = chi2gof(obs,'CDF',pd_gamma)

My problem is that I get NaN for the p-value p_gamma_chi. Where is my mistake? Thanks.

Here is some code to check the distributions visually:

%% Plot cdf
% empirical cdf
[f_emp,x_values] = ecdf(obs);
% fitted gamma cdf evaluated at the same points
f_gamma = gamcdf(x_values,paramEsts_gamma(1),paramEsts_gamma(2));

figure
hold on;
F = plot(x_values,f_emp);
set(F,'LineWidth',2);

G = plot(x_values,f_gamma,'r-');
set(G,'LineWidth',2);

legend([F G],...
    'Empirical CDF','Gamma CDF',...
    'Location','SE');
There is 1 answer

Answer by rozsasarpi

The output of your code shows st_gamma_chi.df = 0, i.e. zero degrees of freedom (dof). chi2gof returns NaN for the p-value when there are not enough degrees of freedom to conduct the test.

dof = N - n - 1

where:
N is the number of bins (frequency classes) left after chi2gof pools sparse bins; in your case N = length(st_gamma_chi.edges)-1 = 3;
n is the number of fitted parameters; in your case n = 2.
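Plugging the numbers in makes the problem explicit. The symbols are the ones defined above; the last line is just the same formula rearranged to give the smallest bin count for which the test can run at all.

```latex
\begin{align*}
\text{dof} &= N - n - 1 \\
           &= 3 - 2 - 1 = 0 \\[4pt]
\text{dof} \ge 1 \;&\Longleftrightarrow\; N \ge n + 2 = 4
\end{align*}
```

So for a two-parameter Gamma fit, at least 4 bins must remain after pooling for the p-value to be defined.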

Thus you get 0 dof with the default options. You can fix this, for example, by increasing the number of bins in which the frequencies are counted:

[h_gamma_chi,p_gamma_chi,st_gamma_chi] = chi2gof(obs,'CDF',pd_gamma,'NBins',20)
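To see why the default call fails and whether more bins actually help, it can be useful to inspect the stats struct that chi2gof returns. The following is a minimal sketch, not a definitive recipe. It assumes the Statistics and Machine Learning Toolbox and the obs vector from the question; the gamrnd fallback data, and the names st_default, st_20 and p_20, are placeholders introduced here for illustration only.

```matlab
% Placeholder data, used only if obs is not already in the workspace.
% (Hypothetical sample; the real obs comes from the linked file.)
if ~exist('obs','var')
    rng(0);
    obs = gamrnd(2, 3, 500, 1);
end

pd_gamma = fitdist(obs, 'Gamma');

% Default call: chi2gof starts from 10 bins and pools bins whose
% expected count is below 5, so fewer bins may remain.
[~, p_default, st_default] = chi2gof(obs, 'CDF', pd_gamma);
fprintf('default: %d bins after pooling, df = %d, p = %g\n', ...
    numel(st_default.O), st_default.df, p_default);

% Request 20 bins: df is at most 20 - 2 - 1 = 17, and lower if
% sparse bins are pooled.
[~, p_20, st_20] = chi2gof(obs, 'CDF', pd_gamma, 'NBins', 20);
fprintf('NBins=20: %d bins after pooling, df = %d, p = %g\n', ...
    numel(st_20.O), st_20.df, p_20);

% Observed vs. expected counts per bin, side by side.
disp([st_20.O(:), st_20.E(:)]);
```

If st_20.df is still 0 or very small, the data are concentrated in a few bins, and passing explicit bin edges through the 'Edges' option may be a better choice than raising 'NBins' further.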

But this does not exempt you from understanding the chi-squared test.