How do I do Polynomial regression right on difficult data?


I'm having issues with my polynomial regression. For some reason my line gets drawn wrong. I've looked at other similar questions on Stack Overflow, but cannot find a solution that works for me. I'm working with price (y) and number of reviews (X).

Has anyone encountered something similar or successfully done polynomial regression on similar looking data?

Here is my code:

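import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

# Defining independent and dependent variables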
X = cph_listings_df[['reviews_per_month']].values.reshape(-1, 1)
y = cph_listings_df['price'].values.reshape(-1, 1)

# split into test and training
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

# scale features
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Fitting Linear Regression to the dataset
lin_reg = LinearRegression()
lin_reg.fit(X, y)

# Fitting Polynomial Regression to the dataset
poly_model = PolynomialFeatures(degree=2)
X_poly = poly_model.fit_transform(X)  # transform X into polynomial features
pol_reg = LinearRegression()  # instance of LinearRegression
pol_reg.fit(X_poly, y)  # train the model
y_predict = pol_reg.predict(X_poly)  # predict on the data the model was trained on
plt.scatter(X, y, color='red')  # red = actual data points
plt.plot(X, y_predict, color='blue')  # blue = fitted values, joined in row order
plt.title('Polynomial Regression')
plt.xlabel('Number of reviews')
plt.ylabel('Price')
plt.show()

Here is the graph:

[Image: my polynomial regression plot]

Here is what my teacher told me I need to get (line drawn by myself):

[Image: the curve I am supposed to get]

I've tried to sort the X values before running the code, but nothing changed for me.
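For reference, one thing that is certain about matplotlib is that plt.plot joins points in the order they appear in the arrays, so X and the predictions have to be ordered together, not X alone. A common way around this is to evaluate the fitted model on an evenly spaced grid of x values and plot that grid. Below is a minimal sketch of that idea. The helper names (fit_polynomial, curve_on_grid, plot_fit) and the synthetic data are my own assumptions for illustration, not the Copenhagen listings, and I can't say for sure this is the cause of the plot above.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures


def fit_polynomial(X, y, degree=2):
    """Fit a polynomial regression of the given degree to X (n, 1) and y (n,)."""
    model = make_pipeline(PolynomialFeatures(degree=degree), LinearRegression())
    model.fit(X, y)
    return model


def curve_on_grid(model, X, n_points=200):
    """Evaluate the fitted model on evenly spaced x values, which are already in order."""
    x_grid = np.linspace(X.min(), X.max(), n_points).reshape(-1, 1)
    return x_grid.ravel(), model.predict(x_grid).ravel()


def plot_fit(X, y, x_grid, y_curve):
    """Scatter the raw points and draw the fitted curve as a single smooth line."""
    import matplotlib.pyplot as plt  # imported lazily so the fit itself needs no display

    plt.scatter(X, y, color="red", s=10)     # red = actual data points
    plt.plot(x_grid, y_curve, color="blue")  # blue = fitted curve on the ordered grid
    plt.title("Polynomial Regression")
    plt.xlabel("Number of reviews")
    plt.ylabel("Price")
    plt.show()


# Synthetic stand-in data (an assumption for illustration only).
rng = np.random.default_rng(0)
X_demo = rng.uniform(0, 10, size=(300, 1))
y_demo = 1000 - 120 * X_demo[:, 0] + 8 * X_demo[:, 0] ** 2 + rng.normal(0, 50, size=300)

model = fit_polynomial(X_demo, y_demo, degree=2)
x_grid, y_curve = curve_on_grid(model, X_demo)
# plot_fit(X_demo, y_demo, x_grid, y_curve)  # uncomment to draw the figure
```

With the real data, the same calls would take the X and y arrays defined in the question in place of X_demo and y_demo. If a grid is not wanted, ordering both arrays with the same index (for example order = X[:, 0].argsort(), then plotting X[order] against y_predict[order]) should give an equivalent line.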
