How to Use If-Else Function in Pandas DataFrame

In this blog, we look at the if-else statement, one of the most widely used conditional statements in programming, and show how data scientists and software engineers can apply the same true-or-false logic to the columns of a Pandas DataFrame.

As a data scientist or software engineer, you are probably familiar with the concept of conditional statements. One of the most common conditional statements used in programming is the if-else statement. The if-else statement is used to execute a block of code if a certain condition is true, and another block of code if the condition is false.
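As a quick refresher, a plain Python if-else statement might look like the sketch below. The function name `classify` and the default threshold of 50 are assumptions chosen for illustration.

```python
# Hypothetical helper for illustration; the name and threshold are assumptions.
def classify(score, threshold=50):
    if score >= threshold:
        # This block runs when the condition is true
        return 'Pass'
    else:
        # This block runs when the condition is false
        return 'Fail'

print(classify(70))
print(classify(40))
```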

In this article, we will discuss how to apply if-else logic to a Pandas DataFrame. Pandas is a popular data analysis library in Python that provides various functions to manipulate and analyze data.

What is Pandas DataFrame?

Before we dive into if-else logic, let’s first understand what a Pandas DataFrame is. A Pandas DataFrame is a two-dimensional, size-mutable, tabular data structure with columns of potentially different types. It is similar to a spreadsheet or SQL table, where each column represents a variable, and each row represents an observation.
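The description above can be made concrete with a minimal sketch. The variable name `people` and the sample values are assumptions for illustration only.

```python
import pandas as pd

# Two columns of different types: text names and integer scores
people = pd.DataFrame({
    'Name': ['Alice', 'Bob'],
    'Score': [70, 40],
})

# Size-mutable: a new column can be added after creation
people['Passed'] = people['Score'] >= 50

print(people.shape)   # (number of rows, number of columns)
print(people.dtypes)  # one data type per column
```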

Pandas provides various functions to manipulate and analyze data in a DataFrame. It does not have a dedicated if-else function. Instead, you express conditional logic by applying a function that contains an if-else expression to a column, which lets you derive a new column of values from an existing one.

How to Use If-Else Function in Pandas DataFrame

To apply if-else logic in a Pandas DataFrame, you can use the apply() method along with a lambda function. Called on a single column (a Series), apply() runs a function on each value. Called on a DataFrame, it runs the function along an axis, that is, on each column or each row. A lambda function is a short, anonymous function. Here, it takes in a value and returns a result based on a certain condition.

Let’s take an example to understand how this works. Suppose you have a DataFrame with two columns, 'Name' and 'Score', and you want to add a third column 'Result', which will contain the value 'Pass' if the score is greater than or equal to 50 and 'Fail' otherwise.

import pandas as pd

# Create a DataFrame
df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie', 'Dave', 'Eve'],
                   'Score': [70, 40, 60, 80, 30]})

# Apply an if-else expression to each score to create a new column 'Result'
df['Result'] = df['Score'].apply(lambda x: 'Pass' if x >= 50 else 'Fail')

# Print the DataFrame
print(df)

Output:

      Name  Score Result
0    Alice     70   Pass
1      Bob     40   Fail
2  Charlie     60   Pass
3     Dave     80   Pass
4      Eve     30   Fail

In the example above, we created a new column 'Result' by applying a lambda function to the 'Score' column. The lambda checks whether each score is greater than or equal to 50. If the condition is true, it returns 'Pass'; otherwise, it returns 'Fail'.
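When there are more than two outcomes, a named function with if, elif, and else can be passed to apply() in place of the lambda. The sketch below is one way to do this; the function name `grade`, the 'Grade' column, and the boundaries 75 and 50 are hypothetical values chosen for illustration.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie', 'Dave', 'Eve'],
                   'Score': [70, 40, 60, 80, 30]})

def grade(score):
    # Hypothetical grade boundaries, chosen for illustration only
    if score >= 75:
        return 'High'
    elif score >= 50:
        return 'Medium'
    else:
        return 'Low'

# apply() calls grade() once for each value in the 'Score' column
df['Grade'] = df['Score'].apply(grade)
print(df)
```

A named function is usually easier to read and test than a lambda once the logic grows beyond a single condition.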

Another way to use if-else without a lambda function is as follows:

# Applying if-else statement to categorize students
df['Result'] = ['Pass' if score >= 50 else 'Fail' for score in df['Score']]

This list comprehension produces identical results, and it may be more accessible for readers who are not familiar with Python’s lambda functions.
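A further option, not covered in the examples above, is NumPy's where() function, which evaluates the condition on the whole column at once instead of one value at a time. The sketch below assumes NumPy is installed alongside Pandas and rebuilds the same 'Result' column.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie', 'Dave', 'Eve'],
                   'Score': [70, 40, 60, 80, 30]})

# np.where(condition, value_if_true, value_if_false) works element-wise
df['Result'] = np.where(df['Score'] >= 50, 'Pass', 'Fail')
print(df)
```

Because the comparison runs on the entire column in one step, this approach tends to scale better than calling a Python function per row, although the difference is unlikely to matter on a five-row example.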

Conclusion

If-else logic is a simple but powerful way to derive new values in a Pandas DataFrame from a condition on existing ones. It is useful for everyday data analysis and manipulation tasks, such as labeling or categorizing rows.

In this article, we have discussed how to apply if-else logic to a Pandas DataFrame, using apply() with a lambda function and a list comprehension. We have used a simple example to illustrate how each approach works. We hope this article has been informative and helpful, and we encourage you to explore the various functions provided by Pandas to manipulate and analyze data.

