I want to use logistic regression to predict and plot a curve from an Excel dataset and get its slope coefficients. However, when I run the code (see below), I get the error "ValueError: Unknown label type: 'continuous'".
I read in similar questions that the y values should be of type 'int', but I don't want to convert them because the y numbers are between 1.66 and 0.44...
Is there a solution for this kind of case, or should I try another regression model?
Thanks a lot in advance
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import PolynomialFeatures
import seaborn as sns
from sklearn.linear_model import LogisticRegression
df = pd.read_excel('Fatigue2.xlsx', sheet_name='Sheet4')
X = df[['Strain1', 'Temperature1']]
y = df['Cycles1']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=101)
#poly = PolynomialFeatures(degree=2)
#X_ = poly.fit_transform(X_train)
LR = LogisticRegression()
LR.fit(X_train,y_train)
g = sns.lmplot(x='Cycles1', y='Strain1', hue='Temperature1', data=df, fit_reg=False)
g.set(xscale='log', yscale='log')
g.set_axis_labels("Cycles (log N)", "Strain")
print('Coefficients : ', LR.coef_, 'Intercept :', LR.intercept_)
About the data, I have 97 values in total in an Excel sheet:
Cycles1 Strain1 Temperature1
27631 1.66 650
... ... 650
6496220 0.44 650
LogisticRegression from sklearn is a classifier, i.e. it expects the response variable to be categorical. Your task is a regression problem: the target is continuous. Moreover, the plot does not seem to have the asymptotic behavior of a logit on the right. You may get better results using a polynomial regression, as described here.