Numeric and symbolic gradients don't match although Hessians do


For context, I have a small project in MATLAB in which I try to replicate an algorithm involving an optimisation with Newton's method. Although my issue is mainly with MATLAB, it may be a gap in my background knowledge that's keeping me from finding a solution, so feel free to redirect me to the appropriate StackExchange site if needed.

The function I need to calculate the gradient vector and Hessian matrix of for the optimisation is:

function [zi] = Zi(lambda,j)
    zi = m(j)*exp(-(lambda*v_tilde(j,:).'));
end

function [z] = Z(lambda)
    res = arrayfun(@(x) Zi(lambda,x), 1:length(omega));
    z = sum(res);
end

function [f] = F(lambda)
    f = log(Z(lambda));
end

where omega and v_tilde are matrices of n d-dimensional vectors and lambda is the d-dimensional argument to the function. (Right now the m(j) are just selectors, 1 or 0, but the algorithm allows refining them, so they shouldn't be removed.) I use the DERIVEST suite to calculate the gradient and Hessian numerically and, although understandably slow in high dimensions, the algorithm as a whole works. I implemented the same computation using the sym package, so that I can compute the gradient and Hessian in advance for some fixed n and d and then evaluate them quickly when needed. This would be the symbolic version:
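For reference, F is a log-sum-exp in -lambda, so its gradient and Hessian have a closed form. Writing the normalised weights as $w_j(\lambda) = m_j\,e^{-\lambda^\top \tilde v_j}/Z(\lambda)$:

```latex
\nabla F(\lambda) = -\sum_{j} w_j(\lambda)\,\tilde v_j,
\qquad
\nabla^2 F(\lambda)
  = \sum_{j} w_j(\lambda)\,\tilde v_j \tilde v_j^{\top}
  - \Bigl(\sum_{j} w_j(\lambda)\,\tilde v_j\Bigr)
    \Bigl(\sum_{j} w_j(\lambda)\,\tilde v_j\Bigr)^{\top}
```

That is, the Hessian is the weighted covariance of the $\tilde v_j$; note that a covariance is unchanged if all $\tilde v_j$ are shifted by the same vector, while the gradient is not. This makes a useful cross-check on both implementations.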

V_TILDE = sym('v_tilde',[d,n]);
lambda  = sym('lambda',[d,1]);
M       = sym('m',[1,n]);      % symbolic selector weights m(j)

F = log(M*exp(-(transpose(V_TILDE)*lambda)));
grad_F  = gradient(F, lambda);
hesse_F = hessian(F, lambda);

matlabFunction(grad_F,  'File', sprintf('Grad_%d_dim_%d_n.m',d,n_max),  'Vars',{M,lambda,V_TILDE});
matlabFunction(hesse_F, 'File', sprintf('Hesse_%d_dim_%d_n.m',d,n_max), 'Vars',{M,lambda,V_TILDE});

As n is fixed, there is no need to iterate over omega. The gradient and Hessian of F can be obtained through the corresponding sym functions and then stored as MATLAB functions via matlabFunction.
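One quick sanity check before exporting the files (a sketch, reusing the names grad_F, M, V_TILDE, lambda, d from the snippet above): at lambda = 0 every exponential equals 1, so the gradient reduces to minus the M-weighted mean of the columns of V_TILDE, which subs should reproduce exactly:

```matlab
% At lambda = 0 every exponential is 1, so
%   grad F(0) = -V_TILDE*M.' / sum(M)
% (minus the M-weighted mean of the columns of V_TILDE).
g0          = subs(grad_F, lambda, zeros(d,1));
g0_expected = -V_TILDE*transpose(M)/sum(M);
simplify(g0 - g0_expected)   % should print a d x 1 zero vector
```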

However, when I test both implementations against some values, they don't match. Surprisingly, the values of the Hessian matrix agree while the values of the gradient don't (the numerical calculation being the correct one), and the Newton iteration runs until the values are just NaN. These are some example values for d=2 and n=8:

Omega:
12.6987   91.3376
95.7507   96.4889
15.7613   97.0593
95.7167   48.5376
70.6046    3.1833
27.6923    4.6171
9.7132   82.3458
95.0222    3.4446

v:
61.2324
52.2271

gNum =          HNum =    1.0e+03 *

  8.3624                 1.4066   -0.5653
 -1.1496                -0.5653    1.6826

gSym =          HSym =    1.0e+03 *

 -52.8700                1.4066   -0.5653
 -53.3768               -0.5653    1.6826

As you can see, the values of HNum and HSym match, but the gradients don't.

I'm happy to give any more context information, code snippets, or anything that helps. Thank you in advance!

Edit: As requested, here is a minimal test. The problem is basically that the values of gNum and gSym don't match (longer explanation above):

omega = [12.6987, 91.3376; 95.7507, 96.4889; 15.7613, 97.0593;
         95.7167, 48.5376; 70.6046,  3.1833; 27.6923,  4.6171;
          9.7132, 82.3458; 95.0222,  3.4446];

v = [61.2324;52.2271];

gradStr = sprintf('Grad_%d_dim_%d_n',length(omega(1,:)),length(omega));
hesseStr = sprintf('Hesse_%d_dim_%d_n',length(omega(1,:)),length(omega));
g = str2func(gradStr);
H = str2func(hesseStr);
selector = ones(1,length(omega)); %this will change, when n_max>n

vtilde = zeros(length(omega), length(omega(1,:)));
for i = 1:length(omega)
    vtilde(i,:) = omega(i,:) - v.';   % v is a column vector, transpose to match the row
end

lambda = zeros(1,length(omega(1,:))); % start of the optimization

[gNum,~,~] = gradest(@F,lambda)
[HNum,~] = hessian(@F,lambda)
gSym = g(selector,lambda.',omega.')
HSym = H(selector,lambda.',omega.')
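As an independent cross-check (a minimal central-difference sketch, not part of DERIVEST), one can differentiate F directly and see which of gNum and gSym it agrees with:

```matlab
% Plain central-difference gradient of F at the current lambda.
h   = 1e-6;                      % step size (assumed adequate here)
gFD = zeros(size(lambda));
for k = 1:numel(lambda)
    e      = zeros(size(lambda));
    e(k)   = h;
    gFD(k) = (F(lambda + e) - F(lambda - e)) / (2*h);
end
gFD                              % compare against gNum and gSym
```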

Note: The DERIVEST suite is a small library (~6 source files) that can be obtained from https://de.mathworks.com/matlabcentral/fileexchange/13490-adaptive-robust-numerical-differentiation
