NaN when using constrained minimization in Matlab?

859 views Asked by At

I am trying to maximize a function of 10 variables. I do this by the following piece of code:

b    = [1 2 3 4 1 7 -1 -4 9 1].';     %// initial value of the decision variable
goal = @(b) -sum(F(b));

Because I want to maximize F for a given set of parameters b, of which the last value should be ≥ 0. I implemented the optimization as follows:

%// lower bounds for decision variable
lb = [-inf(1,9) 0];

%// upper bounds for decision variable
ub = inf(1,10);

myfminoptions = optimset('Display','Iter', 'Algorithm','active-set');
theta1 = fmincon(goal, theta0, [],[],[],[], lb,ub, [], myfminoptions);

That is, I find the minimum of the negative of the function, which should be the same as finding the maximum of that function.

My problem is, that for every iteration I get that the first-order optimality is Inf and thus 'Hessian not updated'.

Additionally the value of my function NaN in the first iteration, which I simply don't understand. This might be the cause, but I not sure.

Edit

This is the function that I am using:

function ll = mytobit(theta); 
global x y; 
b=theta(1:size(theta,1)-1); 
s=theta(size(theta,1)); 
ll = (y==0).*log(1-normcdf(x*b/s))+ (y>0).*(-0.5*(log(2*pi)+log(s^2)+(y-x*b).^2/s^2)); 
return;

When calling the function for the theta0, I get N likelihood values (N is the number of rows in X and Y) which I am supposed to.

An example of Y and X:

    Y = [0;0;2047;1890;1975]

    X = [2300, 34, 1156, 0, 1;
         2100, 35, 1225, 0, 1;
         2760, 36, 1296, 1, 0;
         2300, 37, 1369, 1, 0
         2455, 38, 1444, 0, 0]
1

There are 1 answers

0
Matt J On

You should probably be using the 'sqp' algorithm, as opposed to active-set. The sqp algorithm can recover from NaNs produced when s strays to close to zero, where the derivatives are undefined.

Also, you would hopefully not be initializing the s variable near zero in theta0.

Because s is actually supposed to be strictly greater than zero, it may help to re-express it as s=exp(-d/2) where d is a new unconstrained unknown variable. Also, your objective function only depends on b through the quantity B=b/s. So, you could formulate this way

ll = (y==0).*log(1-normcdf(x*B))+ (y>0).*(-0.5*(log(2*pi)-d+(y*exp(-d/2)-x*B).^2));

Now there are no ill-defined regions in the function or its derivatives.