For context, I have a small project in MATLAB where I try to replicate an algorithm involving some optimisation with the Newton algorithm. Although my issue is mainly with MATLAB, maybe it's my lacking profound background knowledge what's keeping me from finding a solution, so feel free to redirect me to the appropriate StackExchange site if needed.
The function I need to calculate the gradient vector and Hessian matrix for the optimization is :
function [zi] = Zi(lambda,j)
zi = m(j)*exp(-(lambda*v_tilde(j,:).'));
end
function [z] = Z(lambda)
res = arrayfun(@(x) Zi(lambda,x),1:length(omega));
z = sum(res);
end
function [f] = F(lambda)
f = log(Z(lambda));
end
where omega and v_tilde are Matrices of n d-Dimensional vectors and lambda is the d-Dimensional argument to the function. (right now, m(j) are just selectors (1 or 0), but the algorithm allows to refine these, so they shouldn't be removed.
I use the Derivest Suite to calculate the gradient and Hessian numerically, and, although logically slow for high dimensions, the algorithm as a whole works.
I implemented the same solution using the sym package, so that I could compute the gradient and Hessian in advance for some fix n and d, so they can then be evaluated quickly when needed. This would be the symbolic version:
V_TILDE = sym('v_tilde',[d,n])
syms n k
lambda = sym('lambda',[d,1]);
F = log(M*exp(-(transpose(V_TILDE)*lambda)));
matlabFunction( grad_F, 'File', sprintf('Grad_%d_dim_%d_n.m',d,n_max), 'vars',{a,lambda,V_TILDE});
matlabFunction( hesse_F, 'File', sprintf('Hesse_%d_dim_%d_n.m',d,n_max), 'vars',{a,lambda,V_TILDE});
As n is fix, there is no need to iterate over omega. The gradient and Hessian of this can be obtained through the corresponding functions of sym and then stored as matlabFunctions.
However, when I test both implementations against some values, they don't match, surprisingly though, the values of the hessian matrix match while the values of the gradient don't (the numerical calculation being correct), and the Newton algorithm iterates until the values are just NaN. These are some example values for d=2 and n=8:
Omega:
12.6987 91.3376
95.7507 96.4889
15.7613 97.0593
95.7167 48.5376
70.6046 3.1833
27.6923 4.6171
9.7132 82.3458
95.0222 3.4446
v:
61.2324
52.2271
gNum = HNum = 1.0e+03 *
8.3624 1.4066 -0.5653
-1.1496 -0.5653 1.6826
gSym = HSym = 1.0e+03 *
-52.8700 1.4066 -0.5653
-53.3768 -0.5653 1.6826
As you can see, the values of HNum and HSym match, but the gradients don't.
I'm happy to give any more context information, code snippets, or anything that helps. Thank you in advance!
Edit: As requested, here is a minimal test. The problem is basically that the values of gNum and gSym don't match (longer explanation above):
omega = [[12.6987, 91.3376];[95.7507, 96.4889];[15.7613, 97.0593];
[95.7167, 48.5376];[70.6046, 3.1833];[27.6923, 4.6171];[9.7132, 82.3458];
[95.0222, 3.4446]];
v = [61.2324;52.2271];
gradStr = sprintf('Grad_%d_dim_%d_n',length(omega(1,:)),length(omega));
hesseStr = sprintf('Hesse_%d_dim_%d_n',length(omega(1,:)),length(omega));
g = str2func(gradStr);
H = str2func(hesseStr);
selector = ones(1,length(omega)); %this will change, when n_max>n
vtilde = zeros(length(omega),length(omega(1,:)));
for i = 1:length(omega)
vtilde(i,:) = omega(i,:)-v;
end
lambda = zeros(1,length(omega(1,:))); % start of the optimization
[gNum,~,~] = gradest(@F,lambda)
[HNum,~] = hessian(@F,lambda)
gSym = g(selector,lambda.',omega.')
HSym = H(selector,lambda.',omega.')
Note: The DerivestSuite is a small library (~6 source files) that can be obtained under https://de.mathworks.com/matlabcentral/fileexchange/13490-adaptive-robust-numerical-differentiation