Linear Regression

What is Linear Regression?

Linear Regression is a statistical method that models the relationship between a dependent variable (y) and one or more independent variables (X). Simple linear regression uses a single independent variable; multiple linear regression uses two or more. In both cases, the goal is to find the linear equation that best fits the observed data.
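
Written out, the linear equation for p independent variables could look like the following. The symbols are a common convention assumed here rather than taken from the text above: the betas are the coefficients to be estimated and epsilon is the error term.

```latex
y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_p x_p + \varepsilon
```

Simple linear regression is the case p = 1, which reduces to a straight line with intercept \beta_0 and slope \beta_1.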

How does Linear Regression work?

Linear Regression works by estimating the coefficients of the linear equation that minimize the sum of the squared differences between the actual and predicted values of the dependent variable. These differences are called residuals, and the procedure is known as least squares estimation (more precisely, ordinary least squares).
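
As a rough sketch of least squares estimation, the snippet below fits a line using NumPy alone. The synthetic data, the seed, and the variable names are illustrative assumptions; np.linalg.lstsq returns the coefficients that minimize the sum of squared differences between actual and predicted values.

```python
import numpy as np

# Illustrative data (assumed for this sketch): y = 3x + 4 plus small Gaussian noise
rng = np.random.default_rng(0)
x = rng.random(100)
y = 3 * x + 4 + rng.normal(scale=0.2, size=100)

# Design matrix with a column of ones so the intercept is estimated too
A = np.column_stack([np.ones_like(x), x])

# Least squares: choose the coefficients that minimize ||A @ beta - y||^2
beta, *_ = np.linalg.lstsq(A, y, rcond=None)
intercept, slope = beta

# Sum of squared differences between actual and predicted values
sse = np.sum((y - A @ beta) ** 2)
print(f"intercept={intercept:.3f}, slope={slope:.3f}, SSE={sse:.3f}")
```

The estimated intercept and slope should land close to the values used to generate the data (4 and 3), and any other choice of coefficients gives a larger sum of squared errors on the same data.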

Example of Linear Regression in Python:

To perform linear regression using Python, you can use the scikit-learn library. Here’s a simple example:

import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

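# Generate sample data: y = 3x + 4 plus Gaussian noise (seeded so results are reproducible)
rng = np.random.default_rng(42)
X = rng.random((100, 1))
y = 3 * X + 4 + rng.normal(scale=0.2, size=(100, 1))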

# Split the data into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Perform linear regression
lin_reg = LinearRegression()
lin_reg.fit(X_train, y_train)

# Evaluate the model on the held-out test set
y_pred = lin_reg.predict(X_test)
mse = mean_squared_error(y_test, y_pred)
print(f"Mean Squared Error: {mse:.4f}")

Resources on Linear Regression: