Why does SciPy minimize return different solutions when minimizing sum of squared error versus root mean squared error?


I am fitting a curve to data using scipy.optimize.minimize. To do this, I have defined an objective function which returns either the sum of squared error (SSE) or the root mean squared error (RMSE), depending on which one I specify as the output. My belief is that the choice of metric should not significantly affect the minimum that is located, since RMSE is just a monotonic transformation of SSE and should therefore lead to the same minimizer; however, this is not the case (see attached image). To ensure the difference is not caused by the minimization method, I have explicitly set the method to BFGS (one of the three methods minimize defaults to). It should be noted that when I use SSE as my objective function metric, I receive three runtime warnings, pasted here:

C:\Users\Admin\AppData\Local\Temp\ipykernel_33104\472201149.py:23: RuntimeWarning: overflow encountered in square
  SSE = np.sum(np.abs(y - y_pred)**2)

C:\Users\Admin\AppData\Roaming\Python\Python39\site-packages\scipy\optimize\_numdiff.py:576: RuntimeWarning: invalid value encountered in subtract
  df = fun(x) - f0

C:\Users\Admin\AppData\Local\Temp\ipykernel_33104\472201149.py:23: RuntimeWarning: overflow encountered in square
  SSE = np.sum(np.abs(y - y_pred)**2)

Below is an example of my code.

import numpy as np
from scipy.optimize import minimize

# Generate data
A = -6.899                      # Parameter 1 (scale)
B = 0.0221                      # Parameter 2 (concavity)
C = 9.909                       # Parameter 3 (intercept)

x = np.linspace(0, 100, 101)    # A bunch of evenly spaced points

y = A*np.exp(B*x) + C

# Function to calculate SSE or RMSE between function and data
def loss_func(parameters, x, y):
    A, B, C = parameters

    y_pred = A*np.exp(B*x) + C
    SSE = np.sum(np.abs(y - y_pred)**2)
    RMSE = np.sqrt(SSE / len(y))
    return RMSE                    # Change this to SSE or RMSE as desired

guess = [1, 1, 1]

# Optimize the function
sol = minimize(loss_func, guess, args=(x, y), method='BFGS', bounds=None, constraints=None)
print("Optimized parameters:", sol.x)

Does anyone know why the tuned parameters that minimize my objective function differ depending on which metric I use? I am looking for answers that explain any errors I may be making, nuances of scipy.optimize.minimize, or differences between SSE and RMSE that could cause the located minima to differ.

Left: Plot of fitted curve using RMSE in objective function. Right: Plot of fitted curve using SSE in objective function.

Edit:

After further investigation into the solution of my minimization, I observed the output below when using SSE as my objective function metric, which indicates that convergence is not reached at all in this case. However, using RMSE as the metric also results in a "success: False" message. My question now becomes: why am I not reaching a successful solution, and why does using RMSE still give results closer to the target than using SSE?

      fun: 54279.23221060659
 hess_inv: array([[ 9.99899697e-01, -1.00146569e-02,  2.07889631e-88],
       [-1.00146569e-02,  1.00303413e-04,  2.07564554e-86],
       [ 2.07889631e-88,  2.07564554e-86,  1.00000000e+00]])
      jac: array([  1668.19189453, 117747.58496094,   3464.14257812])
  message: 'Desired error not necessarily achieved due to precision loss.'
     nfev: 34
      nit: 1
     njev: 6
   status: 2
  success: False
        x: array([ 0.98988469, -0.00994935,  1.        ])

1 Answer

Answer from lastchance:

You are starting with an extremely poor initial guess. If A = B = C = 1 then, for x = 100, you are predicting a y value of exp(100) + 1. My calculator didn't like that at all.

For the same x value you are comparing against a data value of -6.899*exp(2.21) + 9.909, which is about -52.98.

Then you are squaring this enormous difference. And adding similar ones.
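To put rough numbers on this (a quick check using the same expressions as the question's code):

import numpy as np

# Prediction at x = 100 with the initial guess A = B = C = 1
y_pred_100 = 1*np.exp(1*100) + 1                    # about 2.7e+43
# Data value at x = 100 from the true parameters
y_data_100 = -6.899*np.exp(0.0221*100) + 9.909      # about -52.98
# A single squared residual at that point
print((y_data_100 - y_pred_100)**2)                 # about 7.2e+86

With intermediate values of that size, the finite-difference gradient and the first line-search steps become enormous, which is presumably where your overflow and invalid-value warnings come from.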

Start with the initial guess [0, 0, 0] and both metrics will reproduce your original values of A, B, C.
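For example, re-running the question's code unchanged apart from the starting point (a minimal sketch; it assumes loss_func, x and y are defined exactly as above):

guess = [0, 0, 0]
sol = minimize(loss_func, guess, args=(x, y), method='BFGS')
print("Optimized parameters:", sol.x)
# Should land close to [-6.899, 0.0221, 9.909] whether loss_func returns SSE or RMSE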

Both the initial guess and bounds play an important role in whether your minimisation works. Unfortunately, both are problem dependent.
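As an illustration of the bounds point (not part of your original code: BFGS ignores bounds, so this sketch uses L-BFGS-B, and the bound values are just plausible guesses for this particular problem):

# Capping B keeps exp(B*x) from blowing up; the other bounds are loose
bounds = [(-100, 100), (-0.1, 0.1), (-100, 100)]
sol = minimize(loss_func, [0, 0, 0], args=(x, y), method='L-BFGS-B', bounds=bounds)
print("Optimized parameters:", sol.x)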