Why do I need to use ravel() in this case?


I am confused about why I need to call ravel() on y before fitting the data with SGDRegressor.

This is the code:

from sklearn.linear_model import SGDRegressor
sgd_reg = SGDRegressor(max_iter=1000, tol=1e-3, penalty=None, eta0=0.1)
sgd_reg.fit(X, y.ravel())

These are the shapes of X and y:

>>> X.shape
(100, 1)

>>> y.shape
(100, 1)

>>> y.ravel().shape
(100,)

Answer by Arne (best answer)

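Think of y as a two-dimensional matrix that happens to have only one column: its shape is (100, 1). The fit method of SGDRegressor expects the target y to be a flat, one-dimensional array of shape (n_samples,), which here is (100,). That is why ravel() is used: it converts the 2-D column into a 1-D array. If you pass the column vector directly, recent scikit-learn versions typically flatten it for you but emit a DataConversionWarning that recommends ravel(), so calling it yourself keeps the code explicit and free of warnings.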
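As a minimal sketch of the point above: the data below is made up (the question does not show how X and y were created), and random_state is added only to make the run repeatable. It shows the shape of y before and after ravel() and fits the regressor on the flattened target.

```python
import numpy as np
from sklearn.linear_model import SGDRegressor

# Hypothetical data, not from the question: 100 samples, 1 feature,
# roughly y = 4 + 3x plus Gaussian noise.
rng = np.random.default_rng(0)
X = 2 * rng.random((100, 1))                   # shape (100, 1)
y = 4 + 3 * X + rng.standard_normal((100, 1))  # column vector, shape (100, 1)

# Flatten the single-column target into a 1-D array.
y_flat = y.ravel()                             # shape (100,)

sgd_reg = SGDRegressor(max_iter=1000, tol=1e-3, penalty=None,
                       eta0=0.1, random_state=0)
sgd_reg.fit(X, y_flat)

print(y.shape, y_flat.shape)  # (100, 1) (100,)
print(sgd_reg.coef_.shape)    # (1,)
```

Note that X stays two-dimensional: scikit-learn expects the features as (n_samples, n_features) and only the target as a flat array.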

It's common in machine learning papers and textbooks to write y as a matrix with a single column, because that simplifies the notation when matrices are multiplied. But you could also write it as a simple one-dimensional vector. You could say it makes no difference, because the values vary along only one axis in either case, but mathematically and in the Python implementation (NumPy arrays here) a (100, 1) matrix and a (100,) vector are two different objects.
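To see that the two really are different objects in NumPy, here is a small illustrative sketch (the array values are arbitrary). Both arrays hold the same numbers, yet they differ in number of dimensions and behave differently under broadcasting.

```python
import numpy as np

y_col = np.arange(3.0).reshape(-1, 1)  # column "matrix", shape (3, 1)
y_flat = y_col.ravel()                 # flat vector, shape (3,)

print(y_col.ndim, y_flat.ndim)         # 2 1

# Subtracting two flat vectors works element by element.
same = y_flat - y_flat                 # shape (3,)

# Mixing the two shapes broadcasts to a full 3x3 matrix instead.
diff = y_col - y_flat                  # shape (3, 3)

print(same.shape, diff.shape)          # (3,) (3, 3)
```

This kind of silent broadcasting is one reason it pays to keep the target in the shape the estimator expects.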