# Chi-squared Test

## What is the Chi-squared Test?

The Chi-squared test of independence is a statistical hypothesis test used to determine whether two categorical variables are associated. It compares the observed frequencies in a contingency table with the frequencies that would be expected if the variables were independent; a large discrepancy between the two is evidence against independence. The test is commonly used for feature selection in machine learning, where it helps identify the categorical features most strongly associated with the class label in a classification task.
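To make the comparison of observed and expected frequencies concrete, the following minimal sketch computes the statistic by hand in plain Python. It assumes the standard definitions: the expected count of a cell is (row total × column total) / grand total, and the statistic is the sum of (observed − expected)² / expected over all cells. The table reuses the sample values from the SciPy example in the next section; the variable names are illustrative only.

```python
# Hand computation of the Chi-squared statistic for a contingency table.
# Rows and columns are the categories of the two variables.
observed = [[10, 20, 30],
            [20, 30, 20]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count under independence: row total * column total / grand total
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Statistic: sum over all cells of (observed - expected)^2 / expected
chi2_stat = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

# Degrees of freedom: (number of rows - 1) * (number of columns - 1)
dof = (len(row_totals) - 1) * (len(col_totals) - 1)

print("Chi-squared statistic:", round(chi2_stat, 4))  # 6.6032
print("Degrees of freedom:", dof)  # 2
```

For this table the hand-computed statistic should agree with SciPy's result, since `chi2_contingency` applies its continuity correction only when there is a single degree of freedom (a 2×2 table).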

## Example of using the Chi-squared Test in Python

Here’s a simple example of performing a Chi-squared test using the `scipy` library in Python:

```python
import numpy as np
from scipy.stats import chi2_contingency

# Sample contingency table
observed = np.array([[10, 20, 30], [20, 30, 20]])

# Perform the Chi-squared test
chi2, p_value, dof, expected = chi2_contingency(observed)

print("Chi-squared statistic:", chi2)
print("P-value:", p_value)
print("Degrees of freedom:", dof)
print("Expected frequencies:", expected)
```

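This example uses the `chi2_contingency` function from the `scipy.stats` module to perform a Chi-squared test on a 2×3 contingency table. The function returns the test statistic, the p-value, the degrees of freedom (here (2 − 1) × (3 − 1) = 2), and the expected frequencies under independence. A small p-value suggests that the two variables are associated.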
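As a rough illustration of how the output might be interpreted, the sketch below turns the p-value into a decision and checks the expected counts. Two things here are assumptions rather than part of the original example: the significance level `alpha = 0.05`, and the common rule of thumb that every expected count should be at least 5 for the Chi-squared approximation to be reliable.

```python
import numpy as np
from scipy.stats import chi2_contingency

# Same sample contingency table as above
observed = np.array([[10, 20, 30], [20, 30, 20]])
chi2, p_value, dof, expected = chi2_contingency(observed)

# Assumed significance level; pick one appropriate for your analysis
alpha = 0.05

# Reject the null hypothesis of independence when the p-value is below alpha
reject_independence = p_value < alpha

# Rule-of-thumb check (an assumption, not a SciPy requirement):
# the approximation is usually considered reliable if all expected counts >= 5
min_expected = expected.min()
approximation_ok = min_expected >= 5

if reject_independence:
    print(f"p = {p_value:.4f} < {alpha}: evidence of an association")
else:
    print(f"p = {p_value:.4f} >= {alpha}: no evidence of an association")
print("Smallest expected count:", round(float(min_expected), 2))
print("Expected counts large enough:", bool(approximation_ok))
```

With this table the p-value is roughly 0.037, so at the assumed 0.05 level the hypothesis of independence would be rejected. At a stricter level such as 0.01 it would not be.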