Logistic regression: ValueError: Unknown label type: 'continuous'


I want to use logistic regression to fit and plot a curve for a dataset stored in an Excel file, and to get its slope coefficients. However, when I run the code below, I get the error "ValueError: Unknown label type: 'continuous'".

I read in similar questions that the y values should be of type int, but I don't want to convert them because the y values lie between 0.44 and 1.66...

Is there a solution for this kind of case, or should I try another regression model?

Thanks a lot in advance

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import PolynomialFeatures
import seaborn as sns
from sklearn.linear_model import LogisticRegression


df = pd.read_excel('Fatigue2.xlsx',sheet_name='Sheet4')

X = df[['Strain1', 'Temperature1']]
y = df['Cycles1']

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=101)

#poly = PolynomialFeatures(degree=2)
#X_ = poly.fit_transform(X_train)

LR = LogisticRegression()
LR.fit(X_train,y_train)

g = sns.lmplot(x='Cycles1', y='Strain1', hue = 'Temperature1', data=df, fit_reg= False)
g.set(xscale='log', yscale ='log')
g.set_axis_labels("Cycles (log N)", "Strain")

print ('Coefficients : ', LR.coef_, 'Intercept :', LR.intercept_)

About the data, I have 97 values in total in an Excel sheet:

Cycles1   Strain1    Temperature1

27631     1.66         650
...       ...          650
6496220   0.44         650

There are 2 answers

1
amaraz (BEST ANSWER)

LogisticRegression from sklearn is a classifier, i.e. it expects that the response variable is categorical.
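To illustrate the point, here is a minimal sketch that fits LogisticRegression once on a continuous target and once on a categorical one. The data are made up for the illustration (they are not the values from the Excel sheet), and try_fit is a hypothetical helper name.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Made-up data for illustration only; not the values from the question.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(20, 2))
y_float = np.linspace(0.44, 1.66, 20)      # continuous target (floats)
y_class = (y_float > 1.0).astype(int)      # categorical target (0 or 1)


def try_fit(target):
    """Hypothetical helper: return None on success, else the error message."""
    try:
        LogisticRegression().fit(X, target)
        return None
    except ValueError as exc:
        return str(exc)


float_error = try_fit(y_float)   # the classifier rejects a continuous target
class_error = try_fit(y_class)   # the classifier accepts discrete labels

print("continuous target :", float_error)
print("categorical target:", class_error)
```

The first call should fail with the "Unknown label type" message from the question, while the second one fits without an error.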

Your task is a regression problem. Moreover, the plot does not seem to show the asymptotic behavior of a logit curve on the right. You may get better results with a polynomial regression.
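As a rough sketch of what a polynomial regression could look like here, the example below chains PolynomialFeatures with LinearRegression. Several things are assumptions and not part of the question: the data are synthetic stand-ins for the Excel sheet (only the column names are reused), the three temperature levels are invented, and fitting log10 of Cycles1 is one possible choice suggested by the log-scaled plot.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic stand-in for the Excel sheet; every number here is invented.
rng = np.random.default_rng(101)
n = 97
strain = rng.uniform(0.44, 1.66, size=n)
temperature = rng.choice([550.0, 600.0, 650.0], size=n)
log_cycles = (7.0 - 2.5 * np.log10(strain) - 0.002 * temperature
              + rng.normal(0.0, 0.05, size=n))
df = pd.DataFrame({"Strain1": strain,
                   "Temperature1": temperature,
                   "Cycles1": 10 ** log_cycles})

X = df[["Strain1", "Temperature1"]]
y = np.log10(df["Cycles1"])   # assumption: model the logarithm of the cycles

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=101)

# Degree-2 polynomial terms followed by ordinary least squares.
model = make_pipeline(PolynomialFeatures(degree=2, include_bias=False),
                      LinearRegression())
model.fit(X_train, y_train)

r2 = model.score(X_test, y_test)
reg = model.named_steps["linearregression"]
print("R^2 on test data:", r2)
print("Coefficients:", reg.coef_, "Intercept:", reg.intercept_)
```

With two input columns and degree 2, the regressor ends up with five coefficients (the two original features, their squares, and their product). Whether degree 2 is enough for the real data would have to be checked on the actual sheet.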

0
Ish Beniwal

Based on the docs for type_of_target(y):

Determine the type of data indicated by the target.

Note that this type is the most specific type that can be inferred. For example:

  • binary is more specific but compatible with multiclass.
  • multiclass of integers is more specific but compatible with continuous.
  • multilabel-indicator is more specific but compatible with multiclass-multioutput.

Parameters

y : array-like

Returns

target_type : string

One of:

  • 'continuous': y is an array-like of floats that are not all integers, and is 1d or a column vector.
  • ...

Change y to y.astype(int).
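A small sketch of how type_of_target classifies different targets, and of what an integer cast does to fractional values. The arrays are made-up examples; only 1.66, 0.44, 27631 and 6496220 are taken from the question.

```python
import numpy as np
from sklearn.utils.multiclass import type_of_target

strain_like = np.array([1.66, 1.02, 0.44])     # floats with fractional parts
whole_floats = np.array([1.0, 2.0, 3.0])       # floats that are all integers
ints = np.array([27631, 150000, 6496220])      # plain integers

kinds = {
    "strain_like": type_of_target(strain_like),
    "whole_floats": type_of_target(whole_floats),
    "ints": type_of_target(ints),
}
print(kinds)

# Casting to int truncates the fractional part, so distinct values can
# collapse into the same class label.
truncated = strain_like.astype(int)
print(truncated, type_of_target(truncated))
```

Note that after the cast a classifier treats every distinct integer as a separate class, so values such as 1.66 and 1.02 both become class 1. That is worth keeping in mind before applying astype(int) to a target that is really continuous.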