# How to Sum Two Columns in a Pandas DataFrame

As a data scientist or software engineer, you may often need to perform calculations on data stored in a Pandas DataFrame. One common task is to sum two columns in a DataFrame. In this article, we will discuss different ways to achieve this using Pandas.

## What is a Pandas DataFrame?

A Pandas DataFrame is a two-dimensional labeled data structure with columns of potentially different types. It is similar to a spreadsheet or a SQL table, but with more powerful features. Pandas is an open-source library in Python, widely used for data manipulation and analysis.

## How to Sum Two Columns in a Pandas DataFrame

Suppose we have a DataFrame with two columns, `column1` and `column2`, and we want to create a new column `sum` that contains the sum of these two columns. Here are three methods to achieve this:

### Method 1: Using the + Operator

The simplest way to add two columns in a Pandas DataFrame is to use the `+` operator. We can create a new column `sum` by adding the two columns together, like this:

```
import pandas as pd
df = pd.DataFrame({'column1': [1, 2, 3], 'column2': [4, 5, 6]})
# add 2 columns using + operator
df['sum'] = df['column1'] + df['column2']
print(df)
```

Output:

```
   column1  column2  sum
0        1        4    5
1        2        5    7
2        3        6    9
```

In this example, we create a DataFrame with two columns `column1` and `column2`, each containing three values. We then add these two columns together using the `+` operator and assign the result to a new column `sum`.
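The `+` operator also has a method form, `Series.add()`, which can be handy when a column contains missing values. The following is a minimal sketch, assuming a hypothetical DataFrame in which `column2` has one `NaN`; the column names `sum_plus` and `sum_add` are illustrative only.

```python
import numpy as np
import pandas as pd

# Hypothetical data: column2 has one missing value
df = pd.DataFrame({'column1': [1, 2, 3], 'column2': [4, np.nan, 6]})

# The + operator propagates NaN: the second row of the result is NaN
df['sum_plus'] = df['column1'] + df['column2']

# Series.add with fill_value=0 treats a value missing on one side as 0
df['sum_add'] = df['column1'].add(df['column2'], fill_value=0)

print(df)
```

With `fill_value=0`, a row is still `NaN` in the result only when both inputs are missing for that row.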

### Method 2: Using the sum() Function

Another way to add two columns in a Pandas DataFrame is to use the `sum()` method. We can create a new column `sum` by selecting the two columns and calling `sum()` on them, like this:

```
import pandas as pd
df = pd.DataFrame({'column1': [1, 2, 3], 'column2': [4, 5, 6]})
# add 2 columns using sum()
df['sum'] = df[['column1', 'column2']].sum(axis=1)
print(df)
```

Output:

```
   column1  column2  sum
0        1        4    5
1        2        5    7
2        3        6    9
```

In this example, we create a DataFrame with two columns `column1` and `column2`, each containing three values. We then select these two columns using `df[['column1', 'column2']]`, call `sum()` with `axis=1` so that the values are added across the columns within each row, and assign the result to a new column `sum`.
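The way `sum()` treats missing values is controlled by its `skipna` and `min_count` parameters. The sketch below assumes a hypothetical DataFrame with some `NaN` entries; the result column names are illustrative only.

```python
import numpy as np
import pandas as pd

# Hypothetical data with missing values in both columns
df = pd.DataFrame({
    'column1': [1.0, np.nan, np.nan],
    'column2': [4.0, 5.0, np.nan],
})
cols = ['column1', 'column2']

# Default (skipna=True): NaN is skipped, so an all-NaN row sums to 0.0
df['sum_default'] = df[cols].sum(axis=1)

# skipna=False: any NaN in the row makes the result NaN
df['sum_strict'] = df[cols].sum(axis=1, skipna=False)

# min_count=1: require at least one non-missing value, else NaN
df['sum_min1'] = df[cols].sum(axis=1, min_count=1)

print(df)
```

Which behavior is appropriate depends on whether a missing value should count as zero or should mark the whole row as unknown.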

### Method 3: Using the apply() Function

A third way to add two columns in a Pandas DataFrame is to use the `apply()` method. We can create a new column `sum` by applying a lambda function that adds the two values in each row, like this:

```
import pandas as pd
df = pd.DataFrame({'column1': [1, 2, 3], 'column2': [4, 5, 6]})
# add 2 columns using apply()
df['sum'] = df.apply(lambda row: row['column1'] + row['column2'], axis=1)
print(df)
```

Output:

```
   column1  column2  sum
0        1        4    5
1        2        5    7
2        3        6    9
```

In this example, we create a DataFrame with two columns `column1` and `column2`, each containing three values. We then apply a lambda function to each row of the DataFrame using the `apply()` method with `axis=1`. The lambda function takes a row as input and returns the sum of the two columns. We assign the result to a new column `sum`.
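As a quick sanity check, the three approaches can be run side by side on the same data and compared. This is a small sketch using the same sample DataFrame; the variable names `via_plus`, `via_sum`, `via_apply`, and `all_equal` are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({'column1': [1, 2, 3], 'column2': [4, 5, 6]})

# Compute the row-wise sum with each of the three methods
via_plus = df['column1'] + df['column2']
via_sum = df[['column1', 'column2']].sum(axis=1)
via_apply = df.apply(lambda row: row['column1'] + row['column2'], axis=1)

# Compare the resulting values as plain Python lists
all_equal = via_plus.tolist() == via_sum.tolist() == via_apply.tolist()
print(all_equal)
```

On complete numeric data like this, the three results match; they can diverge once missing values are involved, because `sum()` skips `NaN` by default while `+` does not.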

## Conclusion

In this article, we discussed three ways to add two columns in a Pandas DataFrame: the `+` operator, the `sum()` method, and the `apply()` method. All three produce the same result on complete numeric data, but they differ in readability, performance, and flexibility. The `+` operator is the simplest and most intuitive. It is vectorized, so it is also typically the fastest, but a missing value (`NaN`) in either column makes the result `NaN` for that row. The `sum()` method extends naturally to more than two columns and skips missing values by default, at the cost of slightly more typing. The `apply()` method is the most flexible and can handle arbitrary row-wise logic, but it calls a Python function once per row and is usually much slower than the other two on large datasets. As a data scientist or software engineer, you should choose the method that best fits your needs and context.
