Fitting a PDF to a histogram in MATLAB

I'm having trouble fitting a PDF to a histogram in MATLAB. I'm using gmdistribution.fit because my data is multimodal. This is what I have done:

data=[0.35*randn(1,100000), 0.5*randn(1,100000)+5, 1*randn(1,100000)+3]'; %multimodal data
x=min(data):(max(data)-min(data))/10000:max(data);

%Normalized Histogram
[counts,edges]=histcounts(data,500, 'Normalization', 'pdf');
bw=edges(2)-edges(1);
centers=edges(1:end-1)+bw/2; %bin centers are half a bin width past the left edges
H = bar(centers,counts,'hist');
hold on

%Fitting with gmdistribution
rng default
obj=gmdistribution.fit(data,3,'Replicates',5);

%the PDF
PDF=zeros(1,length(x));
for i=1:obj.NumComponents
    k=obj.ComponentProportion(i);
    u=obj.mu(i);
    sigma=obj.Sigma(i);
    PDF=PDF+k*normpdf(x,u,sigma);    
end
PDF=PDF/trapz(x,PDF);  %normalization (just in case)
plot(x,PDF)

%Fitting with ksdensity (for comparison)
[PDF2,xi]=ksdensity(data,x);
plot(x,PDF2)

legend('Normalized Histogram','gmdistribution','ksdensity')

Histogram and PDFs

As you can see, the Gaussian mixture doesn't fit the histogram properly. The PDF from the ksdensity function is much better. I have also tried fitting just one Gaussian. If you run the same code with data=[0.35*randn(1,100000)]'; and obj=gmdistribution.fit(data,1,'Replicates',5); you get the following:

Histogram and PDFs for one gaussian

Again, the PDF from gmdistribution doesn't fit the histogram. The problem seems to be related to the scaling factor in the data generation (the 0.35). What am I doing wrong?

Best answer, by Jorge

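The Sigma property of the gmdistribution object holds the covariance (for one-dimensional data, the variance), whereas the normpdf function expects the standard deviation. The problem is fixed by replacing normpdf(x,u,sigma) with normpdf(x,u,sqrt(sigma)) in the for loop. This is also why the mismatch seemed tied to the 0.35 scaling factor: 0.35^2 = 0.1225, so passing the variance as the standard deviation makes that component far too narrow and tall, while for the component generated with 1*randn the variance and the standard deviation are both 1, so it happened to look right.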
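
As a minimal sketch of the fix, the loop from the question can be rewritten with the square root applied and then cross-checked against the pdf method of the fitted gmdistribution object, which evaluates the mixture density directly. The variable names w, m, sd and PDF_builtin are illustrative and not part of the original code. The sketch assumes one-dimensional data, so each Sigma(:,:,i) is a scalar variance.

```matlab
rng default                                   % reproducible example
data = [0.35*randn(1,100000), 0.5*randn(1,100000)+5, 1*randn(1,100000)+3]';
x = min(data):(max(data)-min(data))/10000:max(data);

% Same fit as in the question
obj = gmdistribution.fit(data, 3, 'Replicates', 5);

% Manual mixture PDF: Sigma stores variances, normpdf wants standard deviations
PDF = zeros(1, length(x));
for i = 1:obj.NumComponents
    w  = obj.ComponentProportion(i);          % mixing weight
    m  = obj.mu(i);                           % component mean
    sd = sqrt(obj.Sigma(:,:,i));              % variance -> standard deviation
    PDF = PDF + w*normpdf(x, m, sd);
end

% Cross-check: pdf(obj, X) expects one observation per row and returns a column
PDF_builtin = pdf(obj, x')';
maxDiff = max(abs(PDF - PDF_builtin));        % should be at round-off level
```

If the two curves agree, pdf(obj, x') can replace the loop entirely, which avoids handling Sigma by hand.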