How to Sort a Pandas DataFrame by One or Multiple Columns

As a data scientist or software engineer, you may find yourself working with large datasets that need to be sorted before analysis. Pandas is a popular Python library for data manipulation and analysis, and it provides several methods for sorting data.

In this article, we will explore how to sort a Pandas DataFrame by one or more columns.

What is a Pandas DataFrame?

A Pandas DataFrame is a two-dimensional table-like data structure that is used to store and manipulate data in Python. It is similar to a spreadsheet or a SQL table and can be used to perform various operations such as filtering, grouping, and sorting data.

How to Sort a Pandas DataFrame by One Column

Sorting a Pandas DataFrame by one column can be done using the sort_values() method, which sorts the DataFrame based on the values of one or more columns.

Suppose we have a DataFrame df with the following data:

Name   Age  City
Alice  25   New York
Bob    30   San Diego
John   20   Chicago
Mary   22   Houston

To sort the DataFrame by the Age column in ascending order, we can use the following code:


import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'John', 'Mary'],
    'Age': [25, 30, 20, 22],
    'City': ['New York', 'San Diego', 'Chicago', 'Houston']
})

df = df.sort_values('Age')
print(df)

This will result in the following DataFrame:

Name   Age  City
John   20   Chicago
Mary   22   Houston
Alice  25   New York
Bob    30   San Diego

To sort the DataFrame by the Age column in descending order instead, we can use the following statement:

df = df.sort_values('Age', ascending=False)

This will result in the following DataFrame:

Name   Age  City
Bob    30   San Diego
Alice  25   New York
Mary   22   Houston
John   20   Chicago

Note that we set the ascending parameter to False to sort the values in descending order.
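Two details of sort_values() are easy to miss: it returns a new DataFrame rather than changing the original, and the sorted rows keep their original index labels. The following is a minimal sketch that reuses the example data; the variable names by_age_desc and renumbered are chosen here for illustration, and the ignore_index parameter assumes pandas 1.0 or later.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'John', 'Mary'],
    'Age': [25, 30, 20, 22],
    'City': ['New York', 'San Diego', 'Chicago', 'Houston']
})

# sort_values() returns a new DataFrame; df itself is left unchanged.
by_age_desc = df.sort_values('Age', ascending=False)

# The sorted rows keep the index labels they had before sorting.
original_labels = by_age_desc.index.tolist()

# ignore_index=True (assumes pandas 1.0+) renumbers the result from 0.
renumbered = df.sort_values('Age', ascending=False, ignore_index=True)

print(by_age_desc)
print(renumbered)
```

Assigning the result back to df, as the statements in this article do, is the usual way to keep the sorted version.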

In addition to sorting by one column, we can also sort by multiple columns. To sort by multiple columns, we pass a list of column names to the sort_values() method. The DataFrame will be sorted by the first column in the list, and then by the second column in the list if there are ties.

df = df.sort_values(['City', 'Age'])

This will result in the following DataFrame:

Name   Age  City
John   20   Chicago
Mary   22   Houston
Alice  25   New York
Bob    30   San Diego

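The DataFrame is first sorted by the City column in ascending order; rows that share the same city are then ordered by the Age column, also ascending. In this example every city is unique, so the Age column does not affect the result.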
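To see the second column actually break ties, the data needs repeated values in the first column. The sketch below uses a hypothetical DataFrame named people (not the article's data) with repeated cities, and passes a list to ascending so that each column gets its own sort direction.

```python
import pandas as pd

# Hypothetical data with repeated cities so that the second key matters.
people = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'John', 'Mary', 'Zoe'],
    'Age': [25, 30, 20, 22, 35],
    'City': ['Chicago', 'Houston', 'Chicago', 'Houston', 'Chicago']
})

# City ascending; within each city, Age descending.
# The ascending list must have one entry per column being sorted.
result = people.sort_values(['City', 'Age'], ascending=[True, False])
print(result)
```

Here the Chicago rows come first, ordered from oldest to youngest, followed by the Houston rows in the same age order.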

Conclusion

Sorting a Pandas DataFrame by one or more columns is a common task in data analysis. In this article, we explored how to use the sort_values() method to sort a DataFrame by a single column in ascending or descending order, and by multiple columns by passing a list of column names.

