I'm trying to compute a numerical subgradient of a convex function. My test subject is the Wolfe function. It doesn't need to be super accurate, so I tried an ordinary central finite difference, (f(x+h) - f(x-h)) / (2h). In code:
delta = 1e-10;
subgradient = zeros(length(xToEvaluate), 1);
for i = 1 : length(xToEvaluate)
    % perturb only the i-th component, once in each direction
    deltaX = xToEvaluate;
    deltaX(i) = xToEvaluate(i) + delta;
    f1 = funct(deltaX);
    deltaX(i) = xToEvaluate(i) - delta;
    f2 = funct(deltaX);
    subgradient(i,1) = (f1 - f2) / (2 * delta);
end
At the exact minimum of the function, at (-1, 0), I get values on the order of 1e-7, so perfectly fine. But as I move to something like (-1, 0.1) or even (-1, 1e-6), I get a subgradient whose second component is about 16.

I'm aware that very small deltas can introduce rounding errors, but the result doesn't get better as I increase delta. My second try was a one-dimensional five-point stencil, but even with deltas of around 1e-3 the weird 16 keeps popping up...
delta = 1e-3;
subgradient = zeros(length(xToEvaluate), 1);
for i = 1 : length(xToEvaluate)
    % four sample points of the five-point stencil along component i
    xPlusTwo = xToEvaluate;
    xPlusOne = xToEvaluate;
    xMinusTwo = xToEvaluate;
    xMinusOne = xToEvaluate;
    xPlusTwo(i) = xToEvaluate(i) + 2*delta;
    xPlusOne(i) = xToEvaluate(i) + delta;
    xMinusTwo(i) = xToEvaluate(i) - 2*delta;
    xMinusOne(i) = xToEvaluate(i) - delta;
    subgradient(i,1) = (-funct(xPlusTwo) + 8*funct(xPlusOne) - 8*funct(xMinusOne) + funct(xMinusTwo)) / (12*delta);
end
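For reference, here is the same five-point-stencil gradient sketched in Python/NumPy (an illustrative translation of the MATLAB above, not the original code; `stencil_gradient` and the quadratic test function are my own stand-ins). On a smooth function the stencil recovers the gradient essentially exactly, which confirms the scheme itself is implemented correctly:

```python
import numpy as np

def stencil_gradient(funct, x, delta=1e-3):
    """Five-point-stencil approximation of the gradient of funct at x,
    same scheme as the MATLAB loop above."""
    grad = np.zeros_like(x, dtype=float)
    for i in range(len(x)):
        e = np.zeros_like(x, dtype=float)
        e[i] = delta  # perturb only component i
        grad[i] = (-funct(x + 2*e) + 8*funct(x + e)
                   - 8*funct(x - e) + funct(x - 2*e)) / (12 * delta)
    return grad

# Sanity check on a smooth function f(x, y) = x^2 + 3y^2 with gradient (2x, 6y);
# the stencil is exact (up to rounding) for polynomials up to degree 4.
g = stencil_gradient(lambda v: v[0]**2 + 3*v[1]**2, np.array([1.0, 2.0]))
# g ≈ [2., 12.]
```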
Anyone got an idea what this is all about?
If you work out the gradient of that Wolfe function, you find that the second component of the gradient for x <= 0 is 16*sign(y): it is zero when y == 0, and ±16 otherwise.

BTW, it doesn't look like the exact minimum lies at [-1 0], but rather at [-0.7598 0] = [-(1/9)^(1/8) 0].
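You can see the effect numerically with just the nonsmooth 16*|y| term (a Python sketch; `term` and `central` are illustrative helpers, not part of the original code). Any central difference whose two samples land on the same side of the kink at y == 0 reports the one-sided slope, 16, no matter how small the step is:

```python
def term(y):
    # the 16*|y| piece of the gradient's second component, per the answer above
    return 16.0 * abs(y)

def central(y, h):
    # plain central difference in y
    return (term(y + h) - term(y - h)) / (2.0 * h)

d1 = central(0.1, 1e-6)    # ~16: both samples sit on the same side of the kink
d2 = central(1e-6, 1e-10)  # ~16: y > h, so the kink at 0 is still not straddled
d3 = central(0.0, 1e-6)    # 0: the two samples are symmetric and cancel exactly
```

This matches what the questioner observed: ~0 exactly at y == 0, and about 16 at y = 0.1 or even y = 1e-6, because shrinking delta never helps once both samples are on one side of the kink.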