How to reorder columns in Pandas

Learn how to change the order of columns in a DataFrame


There are several ways to change the order of columns in a pandas DataFrame; which one to choose depends on how many columns you have and the transformation you want to perform.

If your DataFrame has only a few columns, or you need to specify a custom column order, you can select the columns in the order you want and assign the result back (note the double brackets, which pass a list of column names):

import pandas as pd

data = pd.DataFrame({'a': [1, 2, 3], 'b': [10, 20, 30], 'c': [100, 200, 300], 'd': [4, 5, 6]})

data = data[['b', 'd', 'c', 'a']]

data

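You can also accomplish the same thing with the DataFrame's reindex method, which may give a slight performance boost over the solution above in some cases. Note that reindex does not raise an error for a name that is not in the DataFrame; it adds that column filled with NaN, so double-check your spelling: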

data = data.reindex(columns=['b', 'd', 'c', 'a'])
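The practical difference between the two approaches shows up when a column name is missing or misspelled. The sketch below uses a small throwaway frame (the name `df` and the nonexistent column 'z' are illustrative and not part of the example above) to show how each one reacts:

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3], 'b': [10, 20, 30]})

# Bracket selection raises KeyError when a requested column does not exist
try:
    df[['b', 'z', 'a']]
    bracket_failed = False
except KeyError:
    bracket_failed = True

# reindex keeps going and creates the unknown column, filled with NaN
reindexed = df.reindex(columns=['b', 'z', 'a'])
```

If a typo in a column name should stop your script, bracket selection is the safer choice; if you deliberately want placeholder columns, reindex does that for you.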

While this is manageable for datasets with only a few columns, for larger datasets, manually writing out all column names can be cumbersome. If you need to rearrange all columns into a custom order outside of simple sorting, there’s not really a good way around this. However, if you just need to move a single column in your dataset, the solution is much simpler. Here is one way to go about it:

# move column 'd' to the beginning
data = data[['d'] + [col for col in data.columns if col != 'd']]

# move column 'd' to the end
data = data[[col for col in data.columns if col != 'd'] + ['d']]
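The same list-comprehension pattern extends to moving several columns at once. Here is a minimal sketch, assuming a hypothetical helper called `move_to_front` (not a pandas function) that places the given columns first and leaves the rest in their original order:

```python
import pandas as pd

def move_to_front(df, cols):
    """Return a new DataFrame with `cols` first; other columns keep their order."""
    rest = [c for c in df.columns if c not in cols]
    return df[list(cols) + rest]

df = pd.DataFrame({'a': [1], 'b': [2], 'c': [3], 'd': [4]})

# 'd' and 'b' lead, followed by the untouched columns 'a' and 'c'
result = move_to_front(df, ['d', 'b'])
```

Because the helper returns a new DataFrame, the original `df` keeps its column order until you reassign it.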

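To move a column to a particular index, you can use pop() and insert(). Note that both methods modify the DataFrame in-place: pop() removes the column and returns it as a Series, and insert() puts it back at the given position. The pop() step is required because insert() raises a ValueError if a column with that name already exists.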

# move column 'd' to be second from the left (index 1)
col = data.pop('d')               # removes 'd' from data and returns it as a Series
data.insert(1, col.name, col)     # re-inserts it at position 1
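If you would rather not mutate the DataFrame, the same result can be had by reordering the list of column names and selecting with it. This is a sketch only; `move_column` is a hypothetical helper name, not part of pandas:

```python
import pandas as pd

def move_column(df, name, position):
    """Return a new DataFrame with column `name` placed at index `position`."""
    cols = [c for c in df.columns if c != name]
    cols.insert(position, name)  # list.insert, not DataFrame.insert
    return df[cols]

df = pd.DataFrame({'a': [1], 'b': [2], 'c': [3], 'd': [4]})

# 'd' becomes the second column; df itself is left unchanged
moved = move_column(df, 'd', 1)
```

This trades the in-place update for a new object, which can be easier to reason about in longer method chains.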

In cases where you simply want to sort columns by name, you can use reindex from above with the axis parameter:

data = data.reindex(sorted(data.columns), axis=1)
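An alternative for name-based sorting is `sort_index` with `axis=1`, which also accepts `ascending=False` for reverse order. A short sketch on an illustrative frame (the name `df` and its columns are made up for this example):

```python
import pandas as pd

df = pd.DataFrame({'c': [1], 'a': [2], 'b': [3]})

# Sort columns alphabetically by label
ascending = df.sort_index(axis=1)

# Sort columns in reverse alphabetical order
descending = df.sort_index(axis=1, ascending=False)
```

Both calls return a new DataFrame and leave `df` as it was.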

In summary, if your DataFrame has only a few columns, you can simply reassign or reindex them in the desired order. For DataFrames with many columns, you may be better off choosing a solution that doesn’t require writing out all column names, such as moving a single column or sorting by name.
