Mutual information between class labels and features. MATLAB


I hope you can help me find the problem here. I want to implement the algorithm Mutual Information-based best individual feature. As part of the algorithm I need to calculate the Mutual Information (MI) between each n-dimensional feature vector f_j and the class label w = {1,2}, which can be written as:

I(f_j;w) = H(w) - H(w|f_j),

where H(w) = -sum_w P(w) log2 P(w) and the conditional entropy is

H(w|f_j) = -sum_{w=1:2} p(w|f_j) log2 p(w|f_j) = -sum_{w=1:2} sum_{i=1:n} p(w|f_j,i) log2 p(w|f_j,i),

where f_j,i denotes the value of feature j at trial i.

Then p(w|f_j,i) can be computed by Bayes' rule:

p(w|f_j,i) = (p(f_j,i|w) P(w)) / p(f_j,i),

where p(f_j,i) = sum_{w=1:2} p(f_j,i|w) P(w).
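
For concreteness, here is a minimal MATLAB sketch of that Bayes step for a single feature value; the numeric density values and priors below are made up purely for illustration (they are not taken from the code further down):

% hypothetical class-conditional density estimates of f_j,i
cp1 = 0.30;              % p(f_j,i | w=1), illustrative value
cp2 = 0.10;              % p(f_j,i | w=2), illustrative value
priors = [0.5 0.5];      % P(w), here assumed equiprobable

% marginal likelihood p(f_j,i) = sum_w p(f_j,i|w) P(w)
pf = cp1*priors(1) + cp2*priors(2);

% posteriors by Bayes' rule; pwf1 + pwf2 = 1 by construction
pwf1 = cp1*priors(1)/pf;   % p(w=1 | f_j,i)
pwf2 = cp2*priors(2)/pf;   % p(w=2 | f_j,i)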

The class-conditional densities p(f_j,i|w) can be estimated using Parzen windows.
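
In MATLAB, KSDENSITY with its default Gaussian kernel is one way to obtain such a Parzen-window estimate at a query point; a tiny sketch, where the sample and the query value are invented for illustration:

x  = randn(100,1) + 2;    % hypothetical sample of feature j for one class
pt = 1.5;                 % hypothetical query value f_j,i
p  = ksdensity(x, pt);    % estimated class-conditional density at the query point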

So I have implemented the following code in MATLAB, using KSDENSITY to estimate the class-conditional densities.

I need to know whether it is implemented correctly, because I am getting MI values greater than 1, which makes no sense: since H(w) = 1 (see here), the upper bound of the MI should also be 1, right?

Please, any error you find in the code would be of great help.

Best,

N  = length(labels);
nf = size(features,2);
nC1 = sum(labels);            % number of trials in class 1 (assuming labels are 0/1)
nC2 = N - nC1;                % number of trials in class 2
priors = [nC1 nC2]/N;         % class priors P(w)

% class entropy H(w)
Hw = -sum(priors.*log2(priors));

% class-conditional density p(fji|w),
% where fji is the jth feature value at the ith trial,
% estimated with Gaussian Parzen windows (ksdensity)
target    = features(labels==1,:);  % fj | w==1 (in columns)
nontarget = features(labels~=1,:);  % fj | w~=1 (in columns)

MI  = zeros(1,nf);
dMI = zeros(1,N);
for j = 1:nf
    for i = 1:N
        cp1 = ksdensity(target(:,j),    features(i,j));
        cp2 = ksdensity(nontarget(:,j), features(i,j));

        % feature probability p(fji), i.e. the marginal likelihood
        pf = cp1*priors(1) + cp2*priors(2);
        % posterior probabilities by Bayes' rule
        pwf1 = (cp1*priors(1))/pf;
        pwf2 = (cp2*priors(2))/pf;
        % conditional entropy term for this trial
        Hwf = -(pwf1*log2(pwf1+eps) + pwf2*log2(pwf2+eps));
        % mutual information contribution of this trial
        dMI(i) = Hw - Hwf;
    end
    MI(j) = nansum(dMI(:));
end
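
For comparison, here is a sketch of the same loop where I average the per-trial posterior entropies over the N trials (weighting each trial by 1/N as an empirical stand-in for p(f_j,i)) instead of summing them. This is only my reading of the conditional-entropy formula above, not a verified correction, but with this normalisation the estimate of H(w|f_j) stays between 0 and 1, so the MI cannot exceed H(w) = 1 bit:

for j = 1:nf
    Hwf = 0;
    for i = 1:N
        cp1 = ksdensity(target(:,j),    features(i,j));  % p(f_j,i | w=1)
        cp2 = ksdensity(nontarget(:,j), features(i,j));  % p(f_j,i | w=2)
        pf   = cp1*priors(1) + cp2*priors(2);             % p(f_j,i)
        pwf1 = cp1*priors(1)/pf;                          % p(w=1 | f_j,i)
        pwf2 = cp2*priors(2)/pf;                          % p(w=2 | f_j,i)
        % accumulate the average (not the sum) of the per-trial entropies
        Hwf = Hwf - (pwf1*log2(pwf1+eps) + pwf2*log2(pwf2+eps))/N;
    end
    MI(j) = Hw - Hwf;   % bounded above by H(w) = 1 bit
end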