How do I use a smoothed line fitted with patsy cr in production?


I smooth a set of features using patsy's cr (natural cubic regression splines), but I am confused by something that looks very basic. Here is sample code that takes the raw data points and produces the corresponding points smoothed by patsy.

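import numpy as np
import matplotlib.pyplot as plt
from patsy import cr
from sklearn.linear_model import LinearRegression

# df is the training DataFrame; feature is the name of the column to smooth
x = df[feature]
y = np.log(df['varTarget'])

x_val = 100  # a single new value, as it would arrive in production
#y_val = np.log(df_val['varTarget'])

# Natural cubic regression spline basis with 10 columns
x_basis = cr(x, df=10, constraints="center", lower_bound=x.min(), upper_bound=x.max())
# This call raises the error shown below: cr tries to choose knots from x_val alone
x_basis_val = cr(x_val, df=10, constraints="center", lower_bound=x.min(), upper_bound=x.max())

# Fit model to the data
# this model uses an input x_basis with 10 columns created through cr
model = LinearRegression().fit(x_basis, y)

# Get estimates
y_hat = model.predict(x_basis)
y_hat_val = model.predict(x_basis_val)

# Raw points in blue, smoothed points in red
plt.figure(figsize=(17,7))
plt.scatter(x, y, s=4, color="tab:blue")
plt.scatter(x, y_hat, s=8, color="tab:red")
plt.show()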

and the plot:

[image: scatter plot of the raw points (blue) and the smoothed points (red)]

So the linear regression model fitted on the spline basis expects an input with 10 columns, which cr creates. Suppose that in production I get a new value x = 100. How can I get a smoothed value for this new x from the smoothed line that has already been fitted?

When I try with a single value, I get the following error:

Unable to compute n_inner_knots(=4) + 2 distinct knots: 1 data value(s) found between lower_bound(=30.023212890625) and upper_bound(=998.42234375).
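
One possible way around this, sketched here under assumptions rather than as a definitive fix: calling cr directly on the new value makes patsy choose knots (and the centering constraint) from that value alone, which is what the error complains about. If the training basis is built through a patsy formula instead, the knots, bounds and constraint are stored in the matrix's design_info and can be reused with build_design_matrices. The synthetic x and y below are made-up stand-ins for df[feature] and np.log(df['varTarget']), and smooth is a hypothetical helper name. With no explicit bounds, cr uses the min and max of the training data, which matches the lower_bound and upper_bound passed above.

```python
import numpy as np
from patsy import dmatrix, build_design_matrices
from sklearn.linear_model import LinearRegression

# Synthetic stand-in for the real training data (assumption, not the asker's data).
rng = np.random.default_rng(0)
x = rng.uniform(30.0, 1000.0, size=500)
y = np.log(50.0 + 0.1 * x) + rng.normal(scale=0.05, size=x.size)

# Build the training basis through a formula so that patsy memorizes the knots,
# bounds and centering constraint in train_basis.design_info.
# "- 1" drops patsy's intercept column; LinearRegression fits its own intercept.
train_basis = dmatrix("cr(x, df=10, constraints='center') - 1", {"x": x})
model = LinearRegression().fit(np.asarray(train_basis), y)


def smooth(new_x):
    """Evaluate the fitted smooth at new x values, reusing the training knots."""
    new_x = np.atleast_1d(np.asarray(new_x, dtype=float))
    (new_basis,) = build_design_matrices([train_basis.design_info], {"x": new_x})
    return model.predict(np.asarray(new_basis))


# A single production value now works, because no knots are recomputed.
y_hat_val = smooth(100)
```

In production, train_basis.design_info and model would need to be kept around (or rebuilt from the same training data) so that new values are always mapped onto the same 10 basis columns the regression was fitted on.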
