I am trying to maximize a function of 10 variables. I do this by the following piece of code:
b = [1 2 3 4 1 7 -1 -4 9 1].'; %// initial value of the decision variable
goal = @(b) -sum(F(b));
Because I want to maximize F
for a given set of parameters b
, of which the last value should be ≥ 0. I implemented the optimization as follows:
%// lower bounds for decision variable
lb = [-inf(1,9) 0];
%// upper bounds for decision variable
ub = inf(1,10);
myfminoptions = optimset('Display','Iter', 'Algorithm','active-set');
theta1 = fmincon(goal, theta0, [],[],[],[], lb,ub, [], myfminoptions);
That is, I find the minimum of the negative of the function, which should be the same as finding the maximum of that function.
My problem is, that for every iteration I get that the first-order optimality is Inf
and thus 'Hessian not updated'
.
Additionally the value of my function NaN
in the first iteration, which I simply don't understand. This might be the cause, but I not sure.
Edit
This is the function that I am using:
function ll = mytobit(theta);
global x y;
b=theta(1:size(theta,1)-1);
s=theta(size(theta,1));
ll = (y==0).*log(1-normcdf(x*b/s))+ (y>0).*(-0.5*(log(2*pi)+log(s^2)+(y-x*b).^2/s^2));
return;
When calling the function for the theta0, I get N likelihood values (N is the number of rows in X and Y) which I am supposed to.
An example of Y and X:
Y = [0;0;2047;1890;1975]
X = [2300, 34, 1156, 0, 1;
2100, 35, 1225, 0, 1;
2760, 36, 1296, 1, 0;
2300, 37, 1369, 1, 0
2455, 38, 1444, 0, 0]
You should probably be using the 'sqp' algorithm, as opposed to active-set. The sqp algorithm can recover from NaNs produced when s strays to close to zero, where the derivatives are undefined.
Also, you would hopefully not be initializing the s variable near zero in theta0.
Because s is actually supposed to be strictly greater than zero, it may help to re-express it as s=exp(-d/2) where d is a new unconstrained unknown variable. Also, your objective function only depends on b through the quantity B=b/s. So, you could formulate this way
Now there are no ill-defined regions in the function or its derivatives.