How to perform linear regression with numpy.polyfit and print error statistics?

Question

Asked by funkymickey on 19 August 2020 at 18:03 (5.9k views)

I am figuring out how to use the np.polyfit function and the documentation confuses me. In particular, I am trying to perform linear regression and print related statistics like the sum of squared errors (SSE). Can someone provide clear and concise explanations, possibly with a minimal working example?

There is 1 answer

**Dave** · Accepted Answer · 2020-08-19T21:03:48+00:00

np.polyfit returns an array containing the coefficients of the least-squares best-fitting polynomial of degree deg, ordered from the highest power to the lowest. To fit a line, use deg=1. You can also get the residual (the sum of squared errors) by passing full=True as an argument to polyfit. Note that with this argument, polyfit also returns some other information about the fit, which we can just discard.

Altogether, then, we might have something like

import matplotlib.pyplot as plt
import numpy as np

# Generate some toy data.
x = np.random.rand(25)
y = 2 * x + 0.5 + np.random.normal(scale=0.05, size=x.size)

# Fit the trend line.
(m, b), (SSE,), *_ = np.polyfit(x, y, deg=1, full=True)

# Plot the original data.
plt.scatter(x, y, color='k')

# Plot the trend line.
line_x = np.linspace(0, 1, 200)
plt.plot(line_x, m * line_x + b, color='r')

plt.title(f'slope = {round(m, 3)}, int = {round(b, 3)}, SSE = {round(SSE, 3)}')
plt.show()

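The *_ notation in the call to polyfit just tells Python to collect however many additional values the function returns into a throwaway variable. With full=True these are the rank of the coefficient matrix, its singular values, and the rcond cutoff; the documentation covers them if you're interested. We have to unpack the SSE as (SSE,) because polyfit returns it as a one-element array. This code produces a scatter plot of the data in black with the fitted line in red, and the slope, intercept, and SSE in the title.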
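If you would rather print the statistics than put them in a plot title, the same fit can be unpacked name by name. The sketch below is only an illustration: the fixed seed, the variable names, and the extra statistics (RMSE and R squared, both derived from the SSE) are my additions, not part of the answer above.

```python
import numpy as np

# Toy data as above; the fixed seed is an assumption, added for reproducibility.
rng = np.random.default_rng(0)
x = rng.random(25)
y = 2 * x + 0.5 + rng.normal(scale=0.05, size=x.size)

# With full=True, polyfit returns five values instead of only the coefficients.
coeffs, residuals, rank, singular_values, rcond = np.polyfit(x, y, deg=1, full=True)
m, b = coeffs          # highest power first: slope, then intercept
sse = residuals[0]     # residuals is a one-element array for 1-D y

# Cross-check: recompute the SSE directly from the fitted line.
y_hat = m * x + b
sse_manual = np.sum((y - y_hat) ** 2)

# Statistics derived from the SSE (not returned by polyfit itself).
rmse = np.sqrt(sse / x.size)
sst = np.sum((y - y.mean()) ** 2)   # total sum of squares
r_squared = 1 - sse / sst

print(f"slope = {m:.3f}, intercept = {b:.3f}")
print(f"SSE = {sse:.4f} (manual check: {sse_manual:.4f})")
print(f"RMSE = {rmse:.4f}, R^2 = {r_squared:.4f}")
```

Note that residuals can come back as an empty array when the fit is rank deficient or there are too few points, so indexing it with [0] is only safe for a well-posed fit like this one.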

You might also like to know about np.polyval, which takes a sequence of polynomial coefficients (such as the array returned by polyfit) and evaluates the corresponding polynomial at the given input points.
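As a small sketch of how polyfit and polyval fit together, here is a fit to data that lies exactly on a line. The data points and variable names are made up for illustration.

```python
import numpy as np

# Points that lie exactly on y = 2x + 1 (illustrative data).
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.0, 3.0, 5.0, 7.0])

# Without full=True, polyfit returns only the coefficient array.
coeffs = np.polyfit(x, y, deg=1)

# polyval evaluates the polynomial described by coeffs at each input point.
y_fit = np.polyval(coeffs, x)

# The SSE can be computed from the fitted values as well.
sse = np.sum((y - y_fit) ** 2)

# polyval also accepts new points, e.g. to predict at x = 4.
y_new = np.polyval(coeffs, 4.0)

print(coeffs, sse, y_new)
```

Using polyval avoids writing out m * x + b by hand, which matters more once deg is greater than 1.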
