How to Add New Rows to a Pandas Dataframe

In this blog, we explore the process of appending new rows to a pandas dataframe, a common operation for data scientists handling diverse data sources. Learn how to accomplish this task efficiently using fundamental pandas functions.

As a data scientist, you may encounter situations where you need to add new rows to a pandas dataframe. This can be a common task when working with data from various sources, and it can be easily achieved using some simple pandas functions. In this article, we will discuss the steps required to add new rows to a pandas dataframe.

What Is a Pandas Dataframe?

A pandas dataframe is a two-dimensional labeled data structure with columns of potentially different types. It is similar to a spreadsheet or SQL table, where the rows represent observations and the columns represent variables. A dataframe can be created from various data sources such as CSV files, Excel spreadsheets, SQL databases, and more. Pandas is a popular data manipulation library in Python that provides powerful tools for data cleaning, analysis, and visualization.
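To make "columns of potentially different types" concrete, the short sketch below builds a tiny dataframe and inspects its shape and per-column dtypes. The column names and values are made up for this illustration.

```python
import pandas as pd

# Hypothetical example data: a text column, an integer column, and a float column
example = pd.DataFrame({
    'Name': ['Alice', 'Bob'],
    'Age': [25, 30],
    'Height': [1.65, 1.80],
})

n_rows, n_cols = example.shape  # (number of rows, number of columns)
print(n_rows, n_cols)
print(example.dtypes)           # each column carries its own dtype
```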

How to Add New Rows to a Pandas Dataframe

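Adding new rows to a pandas dataframe is a straightforward process. We can use the loc indexer or the pd.concat function to add a new row to an existing dataframe. loc is a label-based indexer that lets us access a group of rows and columns by their labels or a boolean array, and assigning to a row label that does not exist yet creates a new row. pd.concat joins dataframes together, which lets us add a new row to the end of the DataFrame. Older tutorials use the DataFrame.append method for this, but it was deprecated in pandas 1.4 and removed in pandas 2.0. Here are the steps: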

Step 1: Create a New Row

To add a new row to a pandas dataframe, we first need to create a new row. We can do this by creating a dictionary with the column names as the keys and the new values as the values. Let’s say we have a dataframe as follows:

import pandas as pd

# Load data into a DataFrame
data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Marie'],
    'Age': [25, 30, 35, 40, 20],
    'City': ['New York', 'Los Angeles', 'San Francisco', 'Chicago', 'Washington']
}

df = pd.DataFrame(data)
print(df)

Output:

      Name  Age           City
0    Alice   25       New York
1      Bob   30    Los Angeles
2  Charlie   35  San Francisco
3    David   40        Chicago
4    Marie   20     Washington

And we want to add the following row to the existing dataframe:

Name = 'John'
Age = 35
City = 'New York'

We can create a dictionary with these values, using keys that match the dataframe's column names exactly (including capitalization):

new_row = {'Name': 'John', 'Age': 35, 'City': 'New York'}
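Because the dictionary keys need to match the column names exactly, a quick check before inserting can catch mismatches such as wrong capitalization. This is a minimal sketch on a shortened version of the example data; the helper name `missing_or_extra_keys` is hypothetical and not part of pandas.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob'],
    'Age': [25, 30],
    'City': ['New York', 'Los Angeles'],
})

def missing_or_extra_keys(row, frame):
    """Return (missing, extra) column names for a candidate row dictionary."""
    missing = [col for col in frame.columns if col not in row]
    extra = [key for key in row if key not in frame.columns]
    return missing, extra

good_row = {'Name': 'John', 'Age': 35, 'City': 'New York'}
bad_row = {'name': 'John', 'age': 35, 'city': 'New York'}  # lowercase keys

print(missing_or_extra_keys(good_row, df))  # both lists are empty
print(missing_or_extra_keys(bad_row, df))   # every column missing, every key extra
```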

Step 2: Add the New Row

Use the loc Indexer

Once we have a new row, we can use loc to add it to the pandas dataframe. loc accesses a group of rows and columns by their labels. When used to read data, it returns the selected rows and columns; when we assign to a row label that does not exist yet, pandas creates that row. Here is the syntax to add a new row to a pandas dataframe using loc:

df.loc[len(df)] = new_row

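In this example, we use the len function to get the number of rows in df and use that number as the label of the new row. loc accepts a row label and, optionally, a column label; here we pass only a row label, so the assignment fills the entire row. Because a default index runs from 0 to len(df) - 1, len(df) is the next unused label, and the new row lands at the end of the dataframe. If the index is not the default integer sequence (for example, after filtering or dropping rows), len(df) may match an existing label, in which case that row is overwritten instead of a new one being added. The new_row dictionary contains the values for the new row.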
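One caveat is worth illustrating: len(df) is only a safe "next" label when the index is the default 0, 1, 2, ... sequence. The sketch below uses made-up data to show what can happen after a row has been dropped, along with one possible workaround (resetting the index first). Passing the values as a list in column order is a choice made for this example.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie'], 'Age': [25, 30, 35]})

# Drop the first row: the remaining labels are 1 and 2, but len() is 2
filtered = df.drop(index=0)

# Label 2 already exists, so this overwrites Charlie instead of adding a row
overwritten = filtered.copy()
overwritten.loc[len(overwritten)] = ['John', 35]
print(len(overwritten))  # still 2 rows

# Resetting the index restores labels 0..n-1, so len() is an unused label again
appended = filtered.reset_index(drop=True)
appended.loc[len(appended)] = ['John', 35]
print(len(appended))  # 3 rows
```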

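Use the concat Function

Alternatively, we can wrap the new row in a one-row dataframe and concatenate it with the existing one:

df = pd.concat([df, pd.DataFrame([new_row])], ignore_index=True)

Here we are using the pd.concat function to add a new row to the DataFrame df. By setting the ignore_index parameter to True, you ensure that the result gets a fresh, continuous index (0, 1, 2, ...) instead of keeping the original labels of each piece. Older code often uses df.append(new_row, ignore_index=True) for the same purpose, but the append method was deprecated in pandas 1.4 and removed in pandas 2.0, so pd.concat is the approach that works on current versions. Use either loc or pd.concat, not both, or the row will be added twice.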
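When several rows need to be added, pd.concat can take them all in a single call, which generally avoids the cost of rebuilding the dataframe once per row in a loop. A minimal sketch, with made-up rows:

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob'],
    'Age': [25, 30],
    'City': ['New York', 'Los Angeles'],
})

# Hypothetical new rows, one dictionary per row
new_rows = [
    {'Name': 'John', 'Age': 35, 'City': 'New York'},
    {'Name': 'Emma', 'Age': 28, 'City': 'Boston'},
]

# Build one dataframe from all new rows, then concatenate once
df = pd.concat([df, pd.DataFrame(new_rows)], ignore_index=True)
print(df)
```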

Step 3: Verify the New Row

After adding the new row, we can verify that it has been added to the pandas dataframe using the tail function. The tail function returns the last n rows of the dataframe, where n is an integer parameter. We can use this function to check that the new row has been added to the end of the dataframe. Here is an example:

print(df.tail(1))

Output:

   Name  Age      City
5  John   35  New York

This will print the last row of the dataframe df, which should be the new row that we just added.
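Beyond reading the printed output, the check can also be done programmatically. The following is a minimal sketch on a shortened version of the example data, assuming the row is added at the end with a default integer index:

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob'],
    'Age': [25, 30],
    'City': ['New York', 'Los Angeles'],
})
rows_before = len(df)

new_row = {'Name': 'John', 'Age': 35, 'City': 'New York'}
df = pd.concat([df, pd.DataFrame([new_row])], ignore_index=True)

# The dataframe should have grown by exactly one row
assert len(df) == rows_before + 1

# The last row should hold the values we just added
last_row = df.iloc[-1].to_dict()
assert last_row == new_row
print(last_row)
```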

Conclusion

Adding new rows to a pandas dataframe is a simple process that can be done using loc or pd.concat. We first need to create a new row using a dictionary with the column names as the keys and the new values as the values. We can then assign it with loc, or concatenate it with pd.concat, to add the new row to the end of the dataframe. Finally, we can verify that the new row has been added using the tail function.

Pandas is a powerful library that provides a wide range of functions for data manipulation, analysis, and visualization. Understanding how to add new rows to a pandas dataframe is an essential skill for any data scientist or software engineer who works with data. By following these simple steps, you can easily add new rows to pandas dataframes and manipulate your data with ease.

