I'm running gradient descent to find a root of a system of nonlinear equations, and I'm wondering how to detect whether the method is stuck at a local minimum, because I believe that might be the case with the settings I'm using. My initial values are [-2, -1], with a tolerance of 10^-2 and 20 iterations. One thing I had read was that if the residual begins to flatline or decreases incredibly slowly, it could indicate that the method is stuck in a local minimum, though I'm not entirely sure. I have graphed the residual against the iteration number, as well as the values of my iterates at each iteration, and I'm wondering how I can tell whether it's stuck at a local minimum.
import math
import numpy as np

def system(x):
    # Residual vector F(x) of the nonlinear system, stored as a 2x1 column
    F = np.zeros((2,1), dtype=np.float64)
    F[0] = x[0]*x[0] + 2*x[1]*x[1] + math.sin(2*x[0])
    F[1] = x[0]*x[0] + math.cos(x[0]+5*x[1]) - 1.2
    return F

def jacb(x):
    # Analytic Jacobian of system(x): J[i,j] = dF[i]/dx[j]
    J = np.zeros((2,2), dtype=np.float64)
    J[0,0] = 2*(x[0]+math.cos(2*x[0]))
    J[0,1] = 4*x[1]
    J[1,0] = 2*x[0]-math.sin(x[0]+5*x[1])
    J[1,1] = -5*math.sin(x[0]+5*x[1])
    return J

# Start at [-2, -1], tolerance 1e-2, at most 20 iterations
iterates, residuals = GradientDescent('system', 'jacb', np.array([[-2],[-1]]), 1e-2, 20, 0)
FullGradientDescent.py GradientDescentWithMomentum
I usually test with 20 iterations, but I ran 200 here to illustrate the residual slowing down.
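One way to make "the residual is flatlining" concrete is to compare the latest residual with the one from a fixed number of iterations earlier. The sketch below is only an illustration: the helper name `has_stalled`, the window of 10 iterations, and the relative threshold of 1e-3 are all assumptions, not part of the code above.

```python
def has_stalled(residuals, window=10, rel_tol=1e-3):
    """Return True if the residual improved by less than rel_tol
    (relative to its earlier value) over the last `window` iterations.

    `window` and `rel_tol` are illustrative defaults, not tuned values.
    """
    if len(residuals) <= window:
        return False  # not enough history to judge yet
    old = residuals[-window - 1]
    new = residuals[-1]
    return (old - new) <= rel_tol * old


# A residual that halves every iteration is clearly still making progress,
# while one that sits at a constant value is flagged as stalled.
print(has_stalled([0.5 ** k for k in range(20)]))
print(has_stalled([0.2] * 20))
```

A plateau on its own does not prove a local minimum; it can also mean the step size is too small. It is worth checking the gradient norm at the same point before drawing a conclusion.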
Marat suggested using GD with momentum. Code changes:
dn = 0
gamma = 0.8
dn_prev = 0
while (norm(F,2) > tol and n <= max_iterations):
    J = eval(jac)(x,2,fnon,F,*fnonargs)
    residuals.append(norm(F,2))
    dn = gamma * dn_prev+2*(np.matmul(np.transpose(J), F))
    dn_prev = dn
    lamb = 0.01
    x = x - lamb * dn
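The excerpt above depends on the rest of the script, so here is a self-contained sketch of the same momentum update applied to this system. It assumes the objective is ||F(x)||^2, whose gradient is 2 J^T F, and it reuses gamma = 0.8 and lamb = 0.01 from the excerpt. The function name `gd_momentum` and the use of flat (1-D) arrays are choices made for the sketch, not the original implementation.

```python
import numpy as np


def system(x):
    """Residual vector F(x) for a flat array x = [x0, x1]."""
    return np.array([
        x[0] ** 2 + 2 * x[1] ** 2 + np.sin(2 * x[0]),
        x[0] ** 2 + np.cos(x[0] + 5 * x[1]) - 1.2,
    ])


def jacb(x):
    """Analytic Jacobian of system(x)."""
    s = np.sin(x[0] + 5 * x[1])
    return np.array([
        [2 * (x[0] + np.cos(2 * x[0])), 4 * x[1]],
        [2 * x[0] - s, -5 * s],
    ])


def gd_momentum(x0, tol=1e-2, max_iterations=20, lamb=0.01, gamma=0.8):
    """Gradient descent with momentum on ||F(x)||^2 (hypothetical helper)."""
    x = np.asarray(x0, dtype=float)
    dn_prev = np.zeros_like(x)
    residuals = []
    for _ in range(max_iterations):
        F = system(x)
        residuals.append(np.linalg.norm(F))
        if residuals[-1] <= tol:
            break
        # Momentum term plus the gradient of ||F||^2, which is 2 * J^T F
        dn = gamma * dn_prev + 2 * jacb(x).T @ F
        x = x - lamb * dn
        dn_prev = dn
    return x, residuals


x_final, residuals = gd_momentum([-2.0, -1.0])
print(x_final, residuals[-1])
```

Recomputing F at the top of every pass keeps the stopping test and the recorded residual in sync with the current iterate.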
Residual using GD with momentum
lastchance suggested doing a contour plot. It seems to show the behaviour of the algorithm, but it still does not converge?
Rewrite your system as follows:
Create some room to plot contours around the minima:
Solve the system around the potential solutions you have identified:
Confirm that the returned points actually are solutions:
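As a rough illustration of the solve-then-confirm steps, the sketch below refines a hand-picked starting guess with a plain Newton iteration and then checks that F is numerically zero at the result. Newton's method is used here as a stand-in and is not necessarily the solver from the original answer. The starting guess (-0.8, 0.4) and the helper name `newton` are assumptions made for the example.

```python
import numpy as np


def system(x):
    """Residual vector F(x) for a flat array x = [x0, x1]."""
    return np.array([
        x[0] ** 2 + 2 * x[1] ** 2 + np.sin(2 * x[0]),
        x[0] ** 2 + np.cos(x[0] + 5 * x[1]) - 1.2,
    ])


def jacb(x):
    """Analytic Jacobian of system(x)."""
    s = np.sin(x[0] + 5 * x[1])
    return np.array([
        [2 * (x[0] + np.cos(2 * x[0])), 4 * x[1]],
        [2 * x[0] - s, -5 * s],
    ])


def newton(x0, tol=1e-12, max_iterations=50):
    """Plain Newton iteration: solve J(x) d = F(x), then step x <- x - d."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iterations):
        F = system(x)
        if np.linalg.norm(F) <= tol:
            break
        x = x - np.linalg.solve(jacb(x), F)
    return x


# Hypothetical starting guess, chosen by eye near a suspected root
candidate = newton([-0.8, 0.4])

# A point counts as a solution only if F is (numerically) zero there
is_root = np.allclose(system(candidate), 0.0, atol=1e-8)
print(candidate, is_root)
```

The final check matters because a solver can return a point where it merely stopped making progress. Evaluating F there separates real roots from such points.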
Plot the whole thing to confirm solutions:
The first point, (0, 0), is not a solution; the solver fails:
Based on the documentation:
Based on this information, you can confirm that your code did find one of the two roots of your system over the domain; the additional critical point at (0, 0) is not a solution.
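The status of (0, 0) can also be checked directly. Assuming gradient descent minimizes ||F(x)||^2, whose gradient is 2 J^T F, a point where that gradient vanishes while the residual stays above the tolerance is a stationary point of the objective but not a root. The helper `classify` and its tolerances in this sketch are illustrative assumptions.

```python
import numpy as np


def system(x):
    """Residual vector F(x) for a flat array x = [x0, x1]."""
    return np.array([
        x[0] ** 2 + 2 * x[1] ** 2 + np.sin(2 * x[0]),
        x[0] ** 2 + np.cos(x[0] + 5 * x[1]) - 1.2,
    ])


def jacb(x):
    """Analytic Jacobian of system(x)."""
    s = np.sin(x[0] + 5 * x[1])
    return np.array([
        [2 * (x[0] + np.cos(2 * x[0])), 4 * x[1]],
        [2 * x[0] - s, -5 * s],
    ])


def classify(x, res_tol=1e-2, grad_tol=1e-8):
    """Label a point by its residual norm and by the gradient of ||F||^2."""
    F = system(x)
    grad = 2 * jacb(x).T @ F
    if np.linalg.norm(F) <= res_tol:
        return "root"
    if np.linalg.norm(grad) <= grad_tol:
        return "stationary point, not a root"
    return "neither"


origin = np.array([0.0, 0.0])
print(np.linalg.norm(system(origin)))  # residual norm at the origin
print(classify(origin))
```

At the origin F is (0, cos(0) - 1.2), so the residual norm is about 0.2, while the Jacobian is [[2, 0], [0, 0]] and J^T F is exactly zero. The same test can be applied to any iterate where the residual has flatlined.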