How to Format Numbers in a Python Pandas DataFrame as Currency in Thousands or Millions

As a data scientist or software engineer, you may often work with large datasets that contain numerical values. It is important to be able to quickly and easily format these values in a way that is easy to read and understand. One common formatting requirement is to display numbers as currency in thousands or millions. In this article, we will explore how to format numbers in a Python pandas DataFrame as currency in thousands or millions.

Table of Contents

  1. Introduction
  2. What is Python Pandas?
  3. Formatting Numbers in a Pandas DataFrame
  4. Formatting Numbers in Thousands
  5. Formatting Numbers in Millions
  6. Conclusion

What is Python Pandas?

Before we dive into formatting numbers in a pandas DataFrame, let’s first discuss what pandas is and why it is useful. Pandas is a popular open-source Python library used for data manipulation and analysis. It provides powerful data structures for working with structured data, such as tables or databases. Pandas is often used in data science and machine learning projects, as well as in finance, economics, and other fields that require data analysis.

Formatting Numbers in a Pandas DataFrame

In pandas, a DataFrame is a two-dimensional tabular data structure with labeled axes (rows and columns). A DataFrame can contain various types of data, including numerical values, strings, and categorical data. When working with numerical data in a DataFrame, it is often useful to format the values as currency.

To format numbers in a pandas DataFrame as currency, we can use the map() method of a column (a pandas Series) together with Python's str.format() method. Series.map() applies a function to each element of the column, while str.format() inserts a value into a format string and returns the resulting string.
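As a minimal sketch of how these two pieces fit together, the snippet below first calls a bound str.format method on a single number and then passes the same callable to Series.map(). The variable names (to_thousands, values, formatted) are illustrative and not part of pandas.

```python
import pandas as pd

# A bound str.format method is an ordinary callable that takes the value to format.
to_thousands = "${:,.0f}K".format
single = to_thousands(1250.0)
print(single)  # $1,250K

# Series.map calls that callable once per element and returns a new Series.
values = pd.Series([1.0, 25.0, 1250.0])
formatted = values.map(to_thousands)
print(formatted.tolist())  # ['$1K', '$25K', '$1,250K']
```

Passing "...".format directly avoids writing a lambda; any function that accepts one value and returns a string would work the same way.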

Formatting Numbers in Thousands

To format numbers in thousands, we can divide each value in the column by 1000 and then use the map() method to apply the format() function to each element. Here, format() is called on a format string, which specifies how the value should be rendered. In this case, we will use the format string "${:,.0f}K", which prefixes a dollar sign, formats the value with comma thousands separators and no decimal places, and appends the letter “K” to indicate thousands.

import pandas as pd

# create a sample DataFrame with numerical values
data = {"Value": [1000, 2000, 3000, 4000]}
df = pd.DataFrame(data)

# format the values as currency in thousands
df["Value"] = df["Value"] / 1000
df["Value"] = df["Value"].map("${:,.0f}K".format)

print(df)

Output:

  Value
0   $1K
1   $2K
2   $3K
3   $4K
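One thing to keep in mind is that the formatted column now holds strings, so it can no longer be summed or sorted numerically. A possible variation, sketched below, keeps the numeric column untouched and writes the display text to a separate column. The column name "Value_display" is a hypothetical choice for this example.

```python
import pandas as pd

df = pd.DataFrame({"Value": [1000, 2000, 3000, 4000]})

# Keep "Value" numeric and store the display text in a separate column.
# "Value_display" is a hypothetical column name chosen for this sketch.
df["Value_display"] = (df["Value"] / 1000).map("${:,.0f}K".format)

# The original column still supports arithmetic.
total = df["Value"].sum()
labels = df["Value_display"].tolist()

print(total)   # 10000
print(labels)  # ['$1K', '$2K', '$3K', '$4K']
```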

Formatting Numbers in Millions

To format numbers in millions, we can divide each value in the column by 1,000,000 and then use the map() method to apply the format() function to each element. In this case, we will use the format string "${:,.2f}M", which prefixes a dollar sign, formats the value with comma thousands separators and two decimal places, and appends the letter “M” to indicate millions.

import pandas as pd

# create a sample DataFrame with numerical values
data = {"Value": [1000000, 2000000, 3000000, 4000000]}
df = pd.DataFrame(data)

# format the values as currency in millions
df["Value"] = df["Value"] / 1000000
df["Value"] = df["Value"].map("${:,.2f}M".format)

print(df)

Output:

    Value
0  $1.00M
1  $2.00M
2  $3.00M
3  $4.00M
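When a column mixes magnitudes, the two format strings above can be combined in a small helper that picks the suffix per value. The function below, format_currency_short, is a hypothetical helper written for this illustration and not a pandas API; the thresholds and the empty string returned for missing values are assumptions you may want to change.

```python
import pandas as pd

def format_currency_short(value):
    """Hypothetical helper: format a number as $..K or $..M based on its magnitude."""
    if pd.isna(value):
        return ""  # assumption: show missing values as an empty string
    if abs(value) >= 1_000_000:
        return "${:,.2f}M".format(value / 1_000_000)
    if abs(value) >= 1_000:
        return "${:,.0f}K".format(value / 1_000)
    return "${:,.0f}".format(value)

df = pd.DataFrame({"Value": [950, 12000, 2500000, 4000000]})
formatted = df["Value"].map(format_currency_short)

print(formatted.tolist())  # ['$950', '$12K', '$2.50M', '$4.00M']
```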

Conclusion

Formatting numbers as currency in thousands or millions is a common requirement when working with numerical data in a pandas DataFrame. In this article, we have shown how to use the map() method and the format() function to format numerical values as currency in thousands or millions. By following these examples, you can easily apply this formatting to your own pandas DataFrames, making your data more accessible and easier to understand.

