Logistic Regression

What is Logistic Regression?

Logistic Regression is a statistical method for modeling how one or more independent variables relate to a dichotomous (binary) outcome, that is, an outcome with only two possible values (1 / 0, Yes / No, True / False). Given a set of independent variables, it estimates the probability of the positive outcome, and that probability is then used to predict which of the two values will occur.

How does Logistic Regression work?

Logistic Regression models the probability of the binary outcome as a function of the independent variables. It first combines the variables into a weighted sum (a linear score z), then passes that score through the logistic function, also known as the sigmoid function: σ(z) = 1 / (1 + e^(-z)). The sigmoid maps any real-valued input to a value between 0 and 1, which is interpreted as the probability of the positive outcome.
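As a rough illustration of the mapping described above, here is a minimal sketch in plain Python. The function names `sigmoid` and `predict_probability`, and the weights and bias, are hypothetical values chosen for the example; they are not fitted to any data.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic (sigmoid) function: maps any real number into the range 0 to 1."""
    # Two branches keep exp() from overflowing for large negative or positive z.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict_probability(features, weights, bias):
    """Probability of the positive outcome for one sample."""
    # Linear score: weighted sum of the independent variables plus a bias term.
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return sigmoid(z)


# Hypothetical coefficients, for illustration only (not learned from data).
weights = [0.8, -0.4]
bias = -0.2

p = predict_probability([2.0, 1.0], weights, bias)  # linear score z = 1.0
print(f"P(positive outcome) = {p:.3f}")  # prints 0.731
```

A score of 0 gives a probability of exactly 0.5, large positive scores approach 1, and large negative scores approach 0.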

Example of Logistic Regression in Python

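To perform logistic regression in Python, you can use the LogisticRegression class from the scikit-learn library. Here’s a simple example that turns the three-class Iris dataset into a binary problem, trains a model, and reports its accuracy on a held-out test set: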

from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Load the Iris dataset
iris = load_iris()
X, y = iris.data, (iris.target == 2).astype(int)  # Target is 1 for the third class (virginica), 0 otherwise

# Split the data into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Perform logistic regression
log_reg = LogisticRegression(solver='liblinear')
log_reg.fit(X_train, y_train)

# Test the logistic regression model
y_pred = log_reg.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy: {accuracy:.2f}")
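The example above reports only hard 0/1 predictions. As a hedged sketch of how those predictions relate to the probabilities produced by the sigmoid, the self-contained block below refits the same kind of model and compares scikit-learn's `predict_proba` output with a sigmoid applied by hand to the linear scores from `decision_function`. The variable names (`model`, `proba`, `scores`, `manual`) are illustrative choices, and the comparison assumes the binary setup used in this article.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Same binary setup as the main example: third Iris class versus the rest.
iris = load_iris()
X, y = iris.data, (iris.target == 2).astype(int)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

model = LogisticRegression(solver="liblinear")
model.fit(X_train, y_train)

# predict_proba returns one column per class; column 1 is P(y = 1).
proba = model.predict_proba(X_test)[:, 1]

# decision_function returns the linear score; applying the sigmoid by hand
# should reproduce the same probabilities for a binary model.
scores = model.decision_function(X_test)
manual = 1.0 / (1.0 + np.exp(-scores))

print("Probabilities match:", np.allclose(proba, manual))
print("First five probabilities:", np.round(proba[:5], 3))
```

Inspecting the probabilities rather than only the labels makes it possible to choose a decision threshold other than the default when the two kinds of error have different costs.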

Resources on Logistic Regression