How to Set Decimal Precision of a Pandas Dataframe Column with Decimal Datatype

As a data scientist or software engineer, you often work with numerical data where decimal values must survive calculations without unintended rounding or truncation. Python's decimal module provides the Decimal datatype for exact decimal arithmetic, and Pandas can hold Decimal objects in a dataframe column, which makes it possible to control precision explicitly.

In this article, we will explore how to set the decimal precision of a Pandas dataframe column with a datatype of Decimal. We will cover the following topics:

  • What is the Decimal datatype?
  • How to create a Pandas dataframe with Decimal datatype columns.
  • How to set the decimal precision of a Pandas dataframe column with a Decimal datatype.
  • How to apply the decimal precision setting to all Decimal datatype columns in a Pandas dataframe.

Table of Contents

  1. What is the Decimal datatype?
  2. How to create a Pandas dataframe with Decimal datatype columns
  3. How to set the decimal precision of a Pandas dataframe column with a Decimal datatype
  4. How to apply the decimal precision setting to all Decimal datatype columns in a Pandas dataframe
  5. Common Errors and How to Handle Them
  6. Conclusion

What is the Decimal datatype?

The Decimal datatype, provided by Python's decimal module, represents decimal numbers exactly instead of approximating them in binary. Unlike the float datatype, which is a binary floating-point type with roughly 15 to 17 significant digits, Decimal has user-configurable precision. The default context keeps 28 significant digits in the results of arithmetic operations, and you can raise or lower that limit. A Decimal created directly from a string keeps every digit you supply.

To use the Decimal datatype in Python, you need to import the decimal module. The following code snippet demonstrates how to create a Decimal object:

from decimal import Decimal

num = Decimal('3.1415926535897932384626433832')
print(num)

The output of the above code will be:

3.1415926535897932384626433832
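To make the contrast with float concrete, here is a minimal sketch. The variable names are illustrative, and the precision shown assumes the default decimal context has not been changed.

```python
from decimal import Decimal, getcontext

# Binary floats cannot represent 0.1 or 0.2 exactly, so the sum drifts.
float_sum = 0.1 + 0.2
# Decimals built from strings hold the exact decimal digits.
decimal_sum = Decimal('0.1') + Decimal('0.2')

print(float_sum == 0.3)               # False
print(decimal_sum == Decimal('0.3'))  # True

# The context precision limits the digits kept in arithmetic results,
# not the digits of a value created directly from a string.
default_prec = getcontext().prec
third = Decimal(1) / Decimal(3)
print(default_prec)
print(third)
```

The division result is cut off at the context precision, while the string-built values above keep exactly the digits they were given.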

How to create a Pandas dataframe with Decimal datatype columns

Pandas has no dedicated Decimal dtype. To create a column of Decimal values, pass Decimal objects when building the dataframe; Pandas stores them in a column with the generic object dtype. The following code snippet demonstrates how to create a Pandas dataframe with two columns, one holding Decimal objects and the other holding floating-point values:

import pandas as pd
from decimal import Decimal

data = {
    'DecimalCol': [Decimal('3.1415'), Decimal('6.2832'), Decimal('9.4247')],
    'FloatCol': [3.1415, 6.2832, 9.4247]
}

df = pd.DataFrame(data)
print(df)

The output of the above code will be:

   DecimalCol  FloatCol
0      3.1415    3.1415
1      6.2832    6.2832
2      9.4247    9.4247
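The printed table looks the same for both columns, so it can help to inspect how each one is stored. The following sketch rebuilds the same dataframe and checks the column dtypes; the helper variable names are illustrative.

```python
import pandas as pd
from decimal import Decimal

df = pd.DataFrame({
    'DecimalCol': [Decimal('3.1415'), Decimal('6.2832'), Decimal('9.4247')],
    'FloatCol': [3.1415, 6.2832, 9.4247],
})

# The Decimal column is stored with the generic object dtype.
decimal_dtype = df['DecimalCol'].dtype
float_dtype = df['FloatCol'].dtype
print(decimal_dtype, float_dtype)

# Each element of the object column is still a Decimal instance.
all_decimal = df['DecimalCol'].map(lambda x: isinstance(x, Decimal)).all()
print(all_decimal)
```

Because the column is an object column, vectorized numeric methods may not behave as they do on float columns, which is why the examples in this article use apply.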

How to set the decimal precision of a Pandas dataframe column with a Decimal datatype

Method 1: Using apply with round()

To set the decimal precision of a Pandas dataframe column with a Decimal datatype, you can use Python's built-in round() function. Called with a number of decimal places, round() returns a new Decimal rounded to that many places, using the rounding mode of the active decimal context (ROUND_HALF_EVEN by default). You can then assign the rounded values back to the column.

The following code snippet demonstrates how to set the decimal precision of the DecimalCol column in the dataframe to 2 decimal places:

df['DecimalCol'] = df['DecimalCol'].apply(lambda x: round(x, 2))
print(df)

The output of the above code will be:

   DecimalCol  FloatCol
0        3.14    3.1415
1        6.28    6.2832
2        9.42    9.4247

As you can see, the DecimalCol column now has a precision of 2 decimal places.
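One detail worth checking is how ties are rounded. The sketch below assumes the default decimal context, whose rounding mode is ROUND_HALF_EVEN; the sample values are made up for illustration.

```python
from decimal import Decimal

# Both values sit exactly halfway between two 2-decimal candidates.
tie_values = [Decimal('2.665'), Decimal('2.675')]

# Under the default context, round() sends ties to the even neighbour.
rounded = [round(v, 2) for v in tie_values]
print(rounded)  # [Decimal('2.66'), Decimal('2.68')]
```

If you need ties to always round up, as in many financial rules, round() with the default context is not enough and an explicit rounding mode is required.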

Method 2: Applying the decimal module

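The decimal module provides precise control over decimal arithmetic. The quantize() method rounds a Decimal to the exponent of the value you pass in and lets you choose the rounding mode explicitly.

from decimal import Decimal, ROUND_HALF_UP

# The exponent of this value (two decimal places) sets the target precision
precision = Decimal('0.01')

df['DecimalCol'] = df['DecimalCol'].apply(lambda x: x.quantize(precision, rounding=ROUND_HALF_UP))
print(df)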

Output:

  DecimalCol  FloatCol
0       3.14    3.1415
1       6.28    6.2832
2       9.42    9.4247
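The choice of rounding mode only shows up on values that are not already close to a two-decimal number. Here is a small sketch comparing two modes on a hypothetical prices series (the name and values are assumptions for illustration):

```python
import pandas as pd
from decimal import Decimal, ROUND_HALF_UP, ROUND_DOWN

prices = pd.Series([Decimal('2.665'), Decimal('2.675'), Decimal('9.4247')])
two_places = Decimal('0.01')  # its exponent sets the number of decimal places

# Ties go away from zero with ROUND_HALF_UP.
half_up = prices.apply(lambda x: x.quantize(two_places, rounding=ROUND_HALF_UP))
# ROUND_DOWN simply truncates the extra digits.
truncated = prices.apply(lambda x: x.quantize(two_places, rounding=ROUND_DOWN))

print(half_up.tolist())    # [Decimal('2.67'), Decimal('2.68'), Decimal('9.42')]
print(truncated.tolist())  # [Decimal('2.66'), Decimal('2.67'), Decimal('9.42')]
```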

How to apply the decimal precision setting to all Decimal datatype columns in a Pandas dataframe

If you have multiple columns with Decimal datatype in a Pandas dataframe, you may want to apply the decimal precision setting to all of them at once. You can achieve this by looping through the columns, checking whether the first value of each column is a Decimal, and applying the round() function to every column that passes the check.

The following code snippet demonstrates how to apply the decimal precision setting to all Decimal datatype columns in a Pandas dataframe:

for col in df.columns:
    if isinstance(df[col].iloc[0], Decimal):
        df[col] = df[col].apply(lambda x: round(x, 2))

print(df)

The output of the above code will be the same as the output of the previous example.
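Checking only the first value can misfire on empty or mixed columns. As a possible alternative, the sketch below wraps the loop in a hypothetical helper, quantize_decimal_columns, that inspects every value and returns a new dataframe instead of modifying the original. The function name, column names, and sample data are assumptions, not part of Pandas.

```python
import pandas as pd
from decimal import Decimal

def quantize_decimal_columns(frame, places=2):
    """Return a copy with every all-Decimal object column rounded to `places`."""
    exponent = Decimal(1).scaleb(-places)  # places=2 gives Decimal('0.01')
    result = frame.copy()
    for col in result.columns:
        values = result[col]
        # Decimal objects only live in object columns; skip everything else.
        if values.dtype != object or values.empty:
            continue
        if values.map(lambda x: isinstance(x, Decimal)).all():
            result[col] = values.map(lambda x: x.quantize(exponent))
    return result

df = pd.DataFrame({
    'PriceA': [Decimal('3.1415'), Decimal('6.2832')],
    'PriceB': [Decimal('1.005'), Decimal('2.999')],
    'FloatCol': [3.1415, 6.2832],
})

rounded_df = quantize_decimal_columns(df, places=2)
print(rounded_df)
```

Returning a copy keeps the full-precision values available in the original dataframe, which is usually safer when the rounded values are only needed for reporting.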

Common Errors and How to Handle Them

Error 1: Type Mismatch

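Decimal() raises decimal.InvalidOperation when it is given a string that is not a valid number, and arithmetic between a Decimal and a float raises TypeError. Make sure the column contains only numeric values before converting it, and convert every operand to Decimal before combining them.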
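A minimal sketch of both failure modes follows. The to_decimal helper is hypothetical, and returning None for bad input is just one possible policy.

```python
from decimal import Decimal, InvalidOperation

def to_decimal(value):
    """Convert value to Decimal, returning None when it is not numeric."""
    try:
        return Decimal(str(value))
    except InvalidOperation:
        return None

converted = [to_decimal(v) for v in ['3.1415', 'n/a', 7]]
print(converted)  # [Decimal('3.1415'), None, Decimal('7')]

# Decimal and float cannot be mixed in arithmetic.
try:
    Decimal('1.10') + 2.5
    mixed_ok = True
except TypeError:
    mixed_ok = False
print(mixed_ok)  # False
```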

Error 2: Rounding Issues

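Be cautious when converting floating-point numbers. Constructing a Decimal directly from a float copies the float's binary approximation, so a value such as 2.675 may round differently than expected. Convert floats through str() first, and choose the precision and rounding mode deliberately.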
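The following sketch shows how the conversion path affects the rounded result. It assumes the default decimal context; the variable names are illustrative.

```python
from decimal import Decimal

value = 2.675  # a float; its nearest binary value is slightly below 2.675

from_float = Decimal(value)        # exact binary value, with a long tail of digits
from_string = Decimal(str(value))  # the digits as written: Decimal('2.675')

print(from_float)
print(from_string)
print(round(from_float, 2), round(from_string, 2))  # 2.67 2.68
```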

Error 3: NaN Values

Handle missing values explicitly to avoid unexpected behavior when applying precision adjustments. For example, round(None, 2) raises TypeError, so skip or fill missing entries before rounding.
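One way to do this is to round only finite Decimal values and pass everything else through untouched. The round_if_finite helper and the sample column below are hypothetical.

```python
import pandas as pd
from decimal import Decimal

col = pd.Series(
    [Decimal('3.1415'), None, Decimal('NaN'), Decimal('9.4247')],
    dtype=object,
)

def round_if_finite(x, places=2):
    """Round finite Decimals; leave None, NaN, and infinities unchanged."""
    if isinstance(x, Decimal) and x.is_finite():
        return round(x, places)
    return x

cleaned = col.apply(round_if_finite)
print(cleaned.tolist())
```

Whether missing values should be kept, dropped, or replaced with a default is a data decision; this sketch only keeps them from raising errors during rounding.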

Conclusion

In this article, we have learned how to set the decimal precision of a Pandas dataframe column with a Decimal datatype. We have covered the Decimal datatype, how to create a Pandas dataframe with Decimal datatype columns, how to set the decimal precision of a Pandas dataframe column with a Decimal datatype, and how to apply the decimal precision setting to all Decimal datatype columns in a Pandas dataframe.

By following the steps outlined in this article, you can ensure that the decimal data in your Pandas dataframe is accurately represented with the desired precision.

