Performing a simple ridge regression


I'm trying to find the minimizing w \in \mathbb{R}^n in:

\min_w \|Xw - y\|^2 + \lambda \|w\|^2

for X \in \mathbb{R}^{m \times n} and y \in \mathbb{R}^{m}.

According to Wikipedia, the explicit solution is given by:

w = (X^T X + \lambda I)^{-1} X^T y

So I implemented this in Python:

w = np.linalg.solve(X.T @ X + lam * np.eye(n), X.T @ y)

But this somehow gives me a vastly different w than the professional implementation:

from sklearn.linear_model import Ridge
clf = Ridge(alpha=lam)
clf.fit(X,y)
w = clf.coef_

Can someone spot what I am doing wrong? Thanks!

Answer by Taha Akbari (accepted):

Note that sklearn's Ridge class has fit_intercept set to True by default, which means it fits a model of the form:

y \approx w_1 x_1 + \dots + w_n x_n + b

The closed-form solution you used has no intercept term b in its computation, so the two fits give different results.

There are two ways to make them agree:

  1. Set fit_intercept to False when fitting the model, i.e.:

clf = Ridge(alpha=lam, fit_intercept=False)

  2. Add an extra column of ones to the feature matrix, i.e.:

X = np.concatenate([X, np.ones((m, 1))], axis=1)
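Option 2 can be sketched as follows; the values of m, n, and lam are illustrative. Note that in the closed-form solution the appended ones-column's weight (the intercept) is also penalized by λ, which is why this only approximates sklearn's default behavior:

```python
import numpy as np

rng = np.random.default_rng(1)
m, n, lam = 50, 5, 0.7  # illustrative sizes and penalty
X = rng.standard_normal((m, n))
y = rng.standard_normal(m)

# Append a column of ones so the last weight plays the role of an intercept b
Xb = np.concatenate([X, np.ones((m, 1))], axis=1)

# Closed-form ridge on the augmented matrix; note lam also penalizes b here
wb = np.linalg.solve(Xb.T @ Xb + lam * np.eye(n + 1), Xb.T @ y)
w, b = wb[:-1], wb[-1]
```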

With option 1, the mathematical formula and sklearn's Ridge produce the same w. With option 2, keep in mind that the closed-form formula penalizes the intercept weight as well, whereas sklearn leaves the intercept unpenalized, so the two agree only approximately (exactly as λ → 0).
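For option 1, the agreement can be checked directly with a small script (the sizes and lam below are illustrative):

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
m, n, lam = 50, 5, 0.7  # illustrative sizes and penalty
X = rng.standard_normal((m, n))
y = rng.standard_normal(m)

# Closed-form ridge solution: w = (X^T X + lam I)^{-1} X^T y
w_closed = np.linalg.solve(X.T @ X + lam * np.eye(n), X.T @ y)

# sklearn with the intercept disabled minimizes the same objective
clf = Ridge(alpha=lam, fit_intercept=False)
clf.fit(X, y)

print(np.allclose(w_closed, clf.coef_))  # True
```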