How to Format Thousand Separators for Integers in a Pandas DataFrame

As a data scientist or software engineer, you may often need to format numerical data in a pandas DataFrame. One common task is adding thousand separators to large integers to make them more readable.

In this article, we will show you how to format thousand separators for integers in a pandas DataFrame.

What is a Thousand Separator?

A thousand separator is a symbol used to separate groups of digits in large numbers to make them more readable. In many countries, a comma (,) is used as a thousand separator, while others use a period (.) or a space. For example, the number 1000000 can be written as 1,000,000 (comma-separated), 1.000.000 (period-separated), or 1 000 000 (space-separated).
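As a quick illustration of the three styles above, here is a minimal sketch in plain Python. The variable names are only illustrative. Python's format specification produces the comma style directly; the period and space styles are obtained here by swapping the comma afterwards, which is only safe for integers because they contain no decimal point.

```python
n = 1000000

# Comma-separated: supported directly by the "," format specifier
comma_style = f"{n:,}"

# Period- and space-separated: replace the comma after formatting
# (safe for integers only, since there is no decimal point to clash with)
period_style = comma_style.replace(",", ".")
space_style = comma_style.replace(",", " ")

print(comma_style)   # 1,000,000
print(period_style)  # 1.000.000
print(space_style)   # 1 000 000
```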

Formatting Thousand Separators in a Pandas DataFrame

Pandas is a popular data manipulation library in Python, commonly used by data scientists and software engineers. In pandas, we can format numerical data using the DataFrame.map() method, which applies a function to each element of a DataFrame and returns a new DataFrame with the results.

Note: DataFrame.applymap() was deprecated in pandas 2.1.0 in favor of DataFrame.map(), so we use map() here. On older pandas versions, use applymap() instead.
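If the same code has to run on both older and newer pandas releases, one possible approach is a small wrapper that picks whichever method is available. This is only a sketch: the helper name elementwise is hypothetical and not part of pandas.

```python
import pandas as pd


def elementwise(df, func):
    """Hypothetical helper: apply func to every element of df.

    Uses DataFrame.map where it exists and falls back to
    DataFrame.applymap on older pandas releases.
    """
    if hasattr(pd.DataFrame, "map"):
        return df.map(func)
    return df.applymap(func)


df = pd.DataFrame({"A": [1000, 2000000], "B": [4000, 5000000]})
result = elementwise(df, lambda x: f"{x:,}")
print(result)
```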

To format thousand separators for integers in a pandas DataFrame, we can define a function that takes a number as input and returns a string representation of the number with thousand separators.

def format_int_with_commas(x):
    """
    Formats an integer with commas as thousand separators.
    """
    return f"{x:,}"

In this function, we use Python’s f-string formatting syntax to format the number with commas as thousand separators ({x:,}). The comma after the colon is a format specifier that tells Python to insert a comma between each group of three digits.
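To see what the specifier does on its own, the short sketch below formats a few plain Python integers outside of pandas. The values are arbitrary; the built-in format() function and str.format() accept the same specifier as the f-string.

```python
x = 1234567

# Three equivalent spellings of the same format specifier
a = f"{x:,}"
b = format(x, ",")
c = "{:,}".format(x)
print(a, b, c)  # 1,234,567 1,234,567 1,234,567

# Negative numbers keep their sign; short numbers are left unchanged
print(f"{-9876543:,}")  # -9,876,543
print(f"{999:,}")       # 999

# An underscore is the only other grouping character the specifier accepts
print(f"{x:_}")         # 1_234_567
```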

We can then apply this function to each element of the DataFrame using the map() method.

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'A': [1000, 2000000, 300000000],
    'B': [4000, 5000000, 600000000],
    'C': [7000, 8000000, 900000000]
})

# Apply the format_int_with_commas function to each element of the DataFrame
df = df.map(format_int_with_commas)

print(df)

Output:

             A            B            C
0        1,000        4,000        7,000
1    2,000,000    5,000,000    8,000,000
2  300,000,000  600,000,000  900,000,000

In this example, we create a sample DataFrame with three columns (A, B, and C) containing large integers. We then apply the format_int_with_commas function to each element of the DataFrame using the map() method. The resulting DataFrame contains the same values, but with thousand separators added for readability.
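One consequence worth keeping in mind is that the formatted values are strings, so they can no longer be summed or compared as numbers. If the separators are only needed for display, a possible alternative is to leave the data numeric and format it at print time. The sketch below assumes the formatters argument of DataFrame.to_string() is suitable for your use case, and it formats column by column with Series.map so that it also runs on older pandas versions.

```python
import pandas as pd

df = pd.DataFrame({"A": [1000, 2000000], "B": [4000, 5000000]})

# Formatting the data itself turns every value into a string
formatted = df.apply(lambda col: col.map("{:,}".format))
print(type(formatted.loc[0, "A"]))  # <class 'str'>

# Display-only formatting: df itself stays numeric
text = df.to_string(formatters={"A": "{:,}".format, "B": "{:,}".format})
print(text)

# Arithmetic on the original DataFrame still works
total = df["A"].sum()
print(total)  # 2001000
```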

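Note that map() applies the function to every element of the DataFrame, so the same format string also works for float values. It raises a ValueError for string values, however, so non-numeric columns must be excluded. To format only a specific column or a subset of columns, select them first and call map() on the selection, for example df['A'].map(format_int_with_commas) for a single column or df[['A', 'B']].map(format_int_with_commas) for several.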
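Real data often mixes text columns, floats, and missing values. The following sketch shows one way to format only selected columns while handling those cases. The helper format_number, the column names, and the choices of two decimal places for floats and an empty string for missing values are all assumptions made for illustration.

```python
import pandas as pd


def format_number(x):
    """Hypothetical helper: add thousand separators to a single value."""
    if pd.isna(x):
        return ""              # assumption: show missing values as empty text
    if isinstance(x, float):
        return f"{x:,.2f}"     # assumption: two decimal places for floats
    return f"{x:,}"


df = pd.DataFrame({
    "name": ["a", "b", "c"],
    "units": [1200, 3400000, 56],
    "amount": [1234.5, None, 1000000.0],
})

# Format only the numeric columns; the text column is left untouched
out = df.copy()
for col in ["units", "amount"]:
    out[col] = df[col].map(format_number)

print(out)
```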

Conclusion

Formatting numerical data with thousand separators is a common task in data analysis and visualization. In this article, we showed you how to format thousand separators for integers in a pandas DataFrame using the map() method and a custom function. By using this technique, you can improve the readability of your data and make it easier to understand.

