Polynomial Regression


What is Polynomial Regression?

Polynomial regression is a form of regression analysis in which the relationship between the independent variable x and the dependent variable y is modeled as an nth-degree polynomial in x. The fitted curve describes a nonlinear relationship between x and the conditional mean of y, denoted E(y | x). Because the model is still linear in its unknown coefficients, it can be estimated with the same least-squares method used for ordinary linear regression.
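Written out, the definition above takes the following form. The coefficient symbols and the error term are notational assumptions; the original text only names x, y, and E(y | x).

```latex
% Polynomial regression model of degree n for a single predictor x.
% \beta_0, ..., \beta_n are the unknown coefficients; \varepsilon is a
% zero-mean error term (assumed notation).
y = \beta_0 + \beta_1 x + \beta_2 x^2 + \cdots + \beta_n x^n + \varepsilon,
\qquad
E(y \mid x) = \beta_0 + \beta_1 x + \beta_2 x^2 + \cdots + \beta_n x^n
```

The curve is nonlinear in x, but each coefficient enters only as a multiplier of a known quantity (a power of x), which is why the fit reduces to linear regression on the features x, x^2, ..., x^n.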

Why use Polynomial Regression?

Polynomial regression is useful when the relationship between the independent and dependent variables is curved, so that a straight-line model fits the data poorly. Raising the polynomial degree lets the model capture more complex relationships, although a degree that is too high tends to fit noise rather than the underlying trend.
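As a rough illustration of this point, the sketch below fits a straight line and a cubic to the same curved data and compares their R² scores. The data-generating function, the sample size, and the helper name `fit_r2` are all assumptions made for this illustration, and the scores are computed on the training data, so they measure goodness of fit rather than predictive accuracy.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.preprocessing import PolynomialFeatures

# Synthetic curved data (assumed for illustration): a cubic trend plus noise
rng = np.random.default_rng(0)
x = rng.uniform(-3, 3, size=(200, 1))
y = 0.5 * x[:, 0] ** 3 - x[:, 0] + rng.normal(scale=0.5, size=200)


def fit_r2(degree):
    """Fit a polynomial of the given degree and return its in-sample R^2."""
    features = PolynomialFeatures(degree=degree, include_bias=False).fit_transform(x)
    model = LinearRegression().fit(features, y)
    return r2_score(y, model.predict(features))


r2_linear = fit_r2(1)  # straight line
r2_cubic = fit_r2(3)   # cubic polynomial

print(f"degree 1 R^2: {r2_linear:.3f}")
print(f"degree 3 R^2: {r2_cubic:.3f}")
```

On data like this the cubic fit should score noticeably higher than the straight line. Because in-sample R² can never decrease as the degree grows, a held-out set or cross-validation is a safer way to choose the degree in practice.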

Polynomial Regression example:

Here’s a simple example of how to perform polynomial regression using Python and the scikit-learn library:

import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

# Generate sample data: a cubic trend plus Gaussian noise
np.random.seed(42)  # fixed seed so the run is reproducible
x = np.random.rand(20, 1)
y = 2 * (x ** 3) - 6 * (x ** 2) + 3 * x + np.random.randn(20, 1) * 0.1

# Transform the data to include polynomial features
poly_features = PolynomialFeatures(degree=3, include_bias=False)
x_poly = poly_features.fit_transform(x)

# Perform linear regression on the transformed data
lin_reg = LinearRegression()
lin_reg.fit(x_poly, y)

# Visualize the polynomial regression fit
plt.scatter(x, y, color='blue')
x_new = np.linspace(0, 1, 100).reshape(100, 1)
x_new_poly = poly_features.transform(x_new)
y_new = lin_reg.predict(x_new_poly)
plt.plot(x_new, y_new, color='red', linewidth=2)
plt.xlabel('x')
plt.ylabel('y')
plt.title('Polynomial Regression')
plt.show()

In this example, we generate 20 noisy samples from the cubic 2x^3 - 6x^2 + 3x, expand x into the features x, x^2, and x^3 with PolynomialFeatures, and then fit an ordinary linear regression on those features. The resulting plot shows the original data points in blue and the fitted polynomial curve in red.
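The same two steps can also be bundled into a single estimator. The following is a minimal sketch, assuming scikit-learn's `make_pipeline` helper and a larger sample of 200 points (an assumption chosen to make the fit more stable than with 20 points); it is one possible arrangement, not the only way to structure the code.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Same cubic as in the example above, with more samples (assumed: 200)
rng = np.random.default_rng(42)
x = rng.random((200, 1))
y = 2 * x**3 - 6 * x**2 + 3 * x + rng.normal(scale=0.1, size=(200, 1))

# Chain the feature expansion and the regression into one estimator
model = make_pipeline(
    PolynomialFeatures(degree=3, include_bias=False),
    LinearRegression(),
)
model.fit(x, y)

# predict() accepts raw x values; the pipeline applies the transform itself
x_new = np.linspace(0, 1, 5).reshape(-1, 1)
y_pred = model.predict(x_new)
y_true = 2 * x_new**3 - 6 * x_new**2 + 3 * x_new

# Fitted coefficients for x, x^2, x^3 (make_pipeline names each step
# after its lowercased class name)
coefs = model.named_steps["linearregression"].coef_
print("coefficients:", coefs.ravel())
print("max abs error on grid:", np.max(np.abs(y_pred - y_true)))
```

With a pipeline there is no separate transformed array to keep in sync, which removes the risk of forgetting to transform new inputs before calling `predict`. The individual coefficients can still differ noticeably from 3, -6, and 2, because the powers of x are strongly correlated on the interval [0, 1], even when the predicted curve is close to the true one.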
