How to count rows in Pandas

How to count the number of rows in your DataFrame

You can count the number of rows in a pandas DataFrame using len() or DataFrame.shape. Here’s a quick example:

import pandas as pd

data = pd.DataFrame({'a': [1, 2, 3, 4, 5], 'b': [10, 20, 30, 40, 50]})

# three different ways to count rows
len(data)
len(data.index)
data.shape[0]

All three commands above return the row count. If you’re looking to shave milliseconds off your computation time, len(data.index) is the fastest of the three, but the difference is negligible in most cases since all three are constant-time operations. Columns can be counted the same way, using len(data.columns) or data.shape[1].
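As a small check of the points above, the sketch below confirms that the three row-count approaches agree and shows the column-count equivalents. It reuses the same toy DataFrame; the names n_rows and n_cols are only illustrative.

```python
import pandas as pd

data = pd.DataFrame({'a': [1, 2, 3, 4, 5], 'b': [10, 20, 30, 40, 50]})

# all three row-count approaches give the same number
n_rows = len(data)
assert n_rows == len(data.index) == data.shape[0]

# columns can be counted the same way
n_cols = len(data.columns)
assert n_cols == data.shape[1]

print(n_rows, n_cols)  # 5 2
```

Note that data.shape is a (rows, columns) tuple, so both counts are available from a single attribute.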

If you want only the number of non-null entries, use DataFrame.count(). This method does not count None, NaN, NaT, and optionally numpy.inf, so if you need true row counts, stick with the options outlined above. Because not every column necessarily contains the same number of non-null values, count() returns a Series with one value per column:

import pandas as pd
import numpy as np

data = pd.DataFrame({'a': [1, np.nan, 3, 4, 5], 'b': [10, 20, 30, 40, 50]})

data.count()

To count non-null entries per row, you can use data.count(axis=1) or data.count(axis='columns').
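To make the difference between the two axes concrete, here is a minimal sketch using the same example DataFrame, with one missing value in column 'a'. The names per_row and per_col are hypothetical and used only for illustration.

```python
import pandas as pd
import numpy as np

data = pd.DataFrame({'a': [1, np.nan, 3, 4, 5], 'b': [10, 20, 30, 40, 50]})

# non-null values in each row; the second row has one missing value
per_row = data.count(axis=1)
print(per_row.tolist())  # [2, 1, 2, 2, 2]

# non-null values in each column (the default, axis=0)
per_col = data.count()
print(per_col.tolist())  # [4, 5]
```

The per-column result is shorter than the true row count for column 'a', which is why count() should not be used as a substitute for len() or shape.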

Finally, if you’d like to count rows per group (for example, per distinct value of a column), you can use DataFrameGroupBy.size() or DataFrameGroupBy.count(). Here, size() returns a Series of true row counts per group, while count() returns a DataFrame of non-null value counts per group and column:

import pandas as pd
import numpy as np

data = pd.DataFrame({'a': [1, np.nan, 3, 4, 5], 'b': [10, 20, 30, 40, 50], 'c': "X X X Y Y".split()})

data.groupby('c').size()

data.groupby('c').count()
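The sketch below runs the same two calls and inspects the results, to show where size() and count() diverge. It uses the same example data; the names sizes and counts are only illustrative.

```python
import pandas as pd
import numpy as np

data = pd.DataFrame({'a': [1, np.nan, 3, 4, 5],
                     'b': [10, 20, 30, 40, 50],
                     'c': "X X X Y Y".split()})

# Series: number of rows in each group, missing values included
sizes = data.groupby('c').size()
print(sizes['X'], sizes['Y'])  # 3 2

# DataFrame: non-null values per group, one column per remaining column
counts = data.groupby('c').count()
print(counts.loc['X', 'a'], counts.loc['X', 'b'])  # 2 3
```

Group X has three rows, but only two non-null values in column 'a', so the two methods disagree for that group and column.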

In summary, len() and DataFrame.shape are usually the go-to options for counting rows in Pandas. DataFrame.count() is useful when you need to count non-null values in each column.
