What is “AssertionError: Torch not compiled with CUDA enabled”?

If you’re a data scientist or a software engineer working with deep learning frameworks, you may have encountered the error message “AssertionError: Torch not compiled with CUDA enabled.” In programming, an assertion is a statement that a condition the programmer expects to hold is true. When that condition does not hold, an AssertionError is raised.

In this context, the error means that Torch was asked to use CUDA (Compute Unified Device Architecture), but the installed Torch build was not compiled with CUDA support. This usually happens when a CPU-only build of Torch is installed and your code tries to move a tensor or model to the GPU.
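
To make the mechanism concrete, here is a minimal sketch of how such a check behaves. The functions `lazy_init_cuda` and `try_cuda`, and the `compiled_with_cuda` flag, are hypothetical stand-ins and not PyTorch's actual internals. The point is only that a CUDA call on a CPU-only build fails a precondition, and that the resulting AssertionError can be caught like any other exception.

```python
def lazy_init_cuda(compiled_with_cuda: bool) -> str:
    """Hypothetical stand-in for the check a framework runs before its first CUDA call."""
    if not compiled_with_cuda:
        # A CPU-only build has no CUDA bindings, so the precondition fails.
        raise AssertionError("Torch not compiled with CUDA enabled")
    return "CUDA initialized"


def try_cuda(compiled_with_cuda: bool) -> str:
    """Attempt GPU setup and fall back to the CPU if the build lacks CUDA."""
    try:
        return lazy_init_cuda(compiled_with_cuda)
    except AssertionError as err:
        return f"Falling back to CPU: {err}"


print(try_cuda(compiled_with_cuda=False))
```

Catching the error only hides the symptom. The rest of this article is about fixing the cause.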

Even if your hardware supports CUDA, a deep learning framework that was not compiled or installed with CUDA support cannot use the GPU. That means you miss out on performance gains that can drastically speed up your deep learning work.

As they say, the devil is in the details. Understanding this error is the first step; resolving it comes next. This article will explore CUDA’s importance for deep learning, how to verify that your Torch installation is CUDA-compatible, and how to overcome the “AssertionError: Torch not compiled with CUDA enabled” error.

What is CUDA?

Compute Unified Device Architecture (CUDA) is a parallel computing platform and application programming interface (API) developed by NVIDIA. It allows developers to use the power of NVIDIA GPUs (graphics processing units) for general-purpose computing, including deep learning.

Deep learning algorithms require a vast number of computations, and GPUs, with their high degree of parallelism, handle these workloads far better than Central Processing Units (CPUs). What makes GPUs better suited? Unlike CPUs, which have a few cores optimized for sequential processing, GPUs have thousands of smaller cores designed to handle many tasks simultaneously.

CUDA provides developers with the tools and functionalities needed to harness the raw computational power of NVIDIA’s GPUs. It allows developers to direct specific computing tasks to the more efficient GPU rather than the CPU. Developers can write code that is executed on the GPU, shifting the workload from being CPU-intensive to being GPU-intensive, allowing for much faster execution. Popular deep learning frameworks like PyTorch and TensorFlow have built-in CUDA support, enabling coders to train complex models on GPUs with relative ease, dramatically reducing processing time and boosting overall efficiency.

Before deep-diving into the particulars of the AssertionError, it’s important to double-check your system setup. Start with confirming hardware compatibility, ensuring necessary software packages and drivers are properly installed, and that your system components interact correctly. From here, you can determine if the error you’re receiving is from your environment or the deep learning framework you have compiled or downloaded.

Confirming GPU Drivers are Installed

For CUDA to work correctly, the NVIDIA driver must be installed: the CUDA API talks to the driver, which in turn talks to the GPU itself. To confirm that your GPU drivers are installed and functioning correctly, use the nvidia-smi command.

nvidia-smi (System Management Interface) is a tool included in the NVIDIA driver package. This utility allows users to monitor and manage various attributes of their GPU. If you see output similar to the example below, you can query the GPU and the GPU drivers are installed properly.

$ nvidia-smi
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.103                Driver Version: 537.13       CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 3090        On  | 00000000:0A:00.0  On |                  N/A |
|  0%   44C    P5              36W / 350W |   3277MiB / 24576MiB |     34%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A        20      G   /Xwayland                                 N/A      |
|    0   N/A  N/A        22      G   /Xwayland                                 N/A      |
|    0   N/A  N/A        23      G   /Xwayland                                 N/A      |
+---------------------------------------------------------------------------------------+

To use it, open a terminal, type nvidia-smi, and press “Enter.” If your GPU drivers are correctly installed, this command will report the GPU model, utilization, memory usage, active GPU processes, and driver version, among other details. In the top right corner of the output, you’ll see a CUDA version. Take note of it, as this is the highest CUDA version the driver supports. In other words, any CUDA version equal to or lower than this version will work with this driver.
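
The same check can be scripted. The sketch below reads the “CUDA Version” field from the nvidia-smi header and compares it with the CUDA version you intend to use. The helper names (`parse_driver_cuda_version`, `driver_supports`, `query_driver`) are made up for this example, and the parsing assumes the header layout shown above, which may differ between driver releases.

```python
import re
import shutil
import subprocess


def parse_driver_cuda_version(smi_output: str):
    """Return the 'CUDA Version' from nvidia-smi's header as (major, minor), or None."""
    match = re.search(r"CUDA Version:\s*(\d+)\.(\d+)", smi_output)
    return (int(match.group(1)), int(match.group(2))) if match else None


def driver_supports(driver_max, wanted) -> bool:
    """True if the wanted CUDA version is equal to or lower than the driver's maximum."""
    return wanted <= driver_max


def query_driver():
    """Run nvidia-smi if it is on PATH; return None when it is missing or fails."""
    if shutil.which("nvidia-smi") is None:
        return None
    result = subprocess.run(["nvidia-smi"], capture_output=True, text=True)
    return parse_driver_cuda_version(result.stdout) if result.returncode == 0 else None


# Header line taken from the example output above.
sample_header = "| NVIDIA-SMI 535.103 Driver Version: 537.13 CUDA Version: 12.2 |"
driver_max = parse_driver_cuda_version(sample_header)
print(driver_max, driver_supports(driver_max, (11, 8)))
```

On a real machine, call `query_driver()` instead of parsing the sample string. A return value of None suggests the driver is missing or not working.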

If you cannot query nvidia-smi, download Nvidia drivers here to prepare for installation. If you need to install both Nvidia drivers and CUDA, it is much easier to use the instructions in the next section to install CUDA, as the installation process will also automatically install the necessary drivers.

If you’re having issues installing Nvidia drivers due to an older version existing on your system, you can forcefully purge all Nvidia drivers using this command: sudo apt-get --purge remove "*nvidia*".

Confirming CUDA/Related Libraries are Installed/Compatible

With your GPU being detected, the next step is to ensure your system correctly detects CUDA. In addition to ensuring CUDA’s presence on your system, it’s crucial to ensure the CUDA version installed is compatible with your GPU’s architecture. Each CUDA version is designed to exploit features specific to different GPU architectures, known as Compute Capability (sm versions). The CUDA version must support your GPU’s “Compute Capability” to function.

For instance, Nvidia’s 30 series cards (Ampere architecture) cannot operate under CUDA 10, as the compute capabilities for the architecture are supported starting at CUDA 11. As an interesting note, newer CUDA versions are backward compatible, allowing programs compiled against older CUDA versions to run on newer ones, but not the other way around. In other words, if we had Tensorflow or PyTorch installed with CUDA 10 support in our example, we would essentially be telling our 30 series GPU to run in a CUDA 10 environment, which it cannot do.
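
As a rough illustration of this rule, the sketch below looks up the first CUDA release that supports a given compute capability. The table is a small illustrative subset rather than a complete reference, and `toolkit_supports_gpu` is a name invented for this example. NVIDIA's own support table remains the authority.

```python
# Illustrative subset only: first CUDA toolkit release that supports each
# compute capability. Consult NVIDIA's support table for the full list.
MIN_CUDA_FOR_CAPABILITY = {
    (7, 0): (9, 0),    # Volta
    (7, 5): (10, 0),   # Turing
    (8, 0): (11, 0),   # Ampere (A100)
    (8, 6): (11, 1),   # Ampere (GeForce RTX 30 series)
}


def toolkit_supports_gpu(capability, cuda_version):
    """Return True/False if the capability is in the table, else None (unknown)."""
    minimum = MIN_CUDA_FOR_CAPABILITY.get(capability)
    if minimum is None:
        return None
    return cuda_version >= minimum


# With PyTorch installed and a GPU visible, torch.cuda.get_device_capability(0)
# returns this (major, minor) tuple; here it is hard-coded for an RTX 3090.
rtx_3090 = (8, 6)
print(toolkit_supports_gpu(rtx_3090, (10, 2)))
print(toolkit_supports_gpu(rtx_3090, (11, 8)))
```

The first call prints False and the second prints True, matching the 30 series example above.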

Take the CUDA version you plan to use and check that your GPU architecture is compatible with it using the CUDA version support table here.

Choosing your CUDA Version

The next check we’ll do is to ensure your deep learning framework supports your CUDA version. This is especially important if you use prebuilt binaries (e.g., installing via pip or another online source). If you would like to build PyTorch or Tensorflow from source, these guides can be found in the additional resources at the bottom of this blog post. Building these frameworks from source can give you the best performance for your system and allows you more flexibility in which CUDA version the deep learning framework uses.

Checking CUDA compatibility with your framework can be done by referencing the deep learning framework’s compatibility table:

  • For Tensorflow: https://www.tensorflow.org/install/source#gpu

    • Note: Tensorflow requires the installation of cuDNN and CUDA (TensorRT is optional)
  • For PyTorch: https://pytorch.org/get-started/locally/

    • Older versions can be found here.

    • Note: You may notice that PyTorch installations always have an additional package installed that contains the specific CUDA library that the PyTorch version was built with. PyTorch will always opt to use this version over the default CUDA version on your system. Just make sure your driver supports the CUDA version!

Once you know which CUDA version to use, proceed to the following sections.

Removing All Nvidia Libraries (Optional)

If you’re having issues installing CUDA-related libraries due to previous installations, you can purge these libraries:

  • For CUDA-related libraries: sudo apt-get --purge remove "*cublas*" "cuda*" "nsight*"

  • For the Nvidia driver: sudo apt-get --purge remove "*nvidia*"

Verifying CUDA on the System

You can check the presence of CUDA primarily through the command line. Run this command: nvcc --version. ‘nvcc’ stands for the NVIDIA CUDA Compiler. It should return details about the installed CUDA compilation tools, including the version number. If the system responds that the command isn’t recognized, it could indicate that CUDA is not installed or correctly set up. If you are confident that you have installed CUDA, but the command is not working, make sure you add CUDA to your system path:

  1. Enter nano ~/.bashrc in your terminal (sudo is not needed to edit your own ~/.bashrc).

  2. Add these lines into ~/.bashrc:

    a. export PATH=/usr/local/cuda/bin:$PATH

    b. export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH

  3. Source your new ~/.bashrc file to apply changes: source ~/.bashrc.
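
The PATH check can also be done from Python. This sketch looks for nvcc on PATH and extracts the release number from its output. `parse_nvcc_release` and `installed_toolkit_release` are illustrative helper names, and the sample string mimics the format nvcc prints, with an illustrative build number.

```python
import re
import shutil
import subprocess


def parse_nvcc_release(nvcc_output: str):
    """Extract the toolkit release, e.g. '12.2', from `nvcc --version` output."""
    match = re.search(r"release\s+(\d+\.\d+)", nvcc_output)
    return match.group(1) if match else None


def installed_toolkit_release():
    """Return the nvcc release string, or None if nvcc is not on PATH."""
    nvcc = shutil.which("nvcc")
    if nvcc is None:
        return None  # CUDA is missing or /usr/local/cuda/bin is not on PATH
    result = subprocess.run([nvcc, "--version"], capture_output=True, text=True)
    return parse_nvcc_release(result.stdout)


# Sample line in the format nvcc prints; the exact build number is illustrative.
sample = "Cuda compilation tools, release 12.2, V12.2.140"
print(parse_nvcc_release(sample))
print(installed_toolkit_release())
```

If the second line prints None even though CUDA is installed, the PATH export above has probably not taken effect in the current shell.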

If you still need to install CUDA and/or Nvidia drivers, click here. If you would like an older CUDA version, check here. From experience, I always recommend the “deb (network)” installation option since upgrading and package maintenance are done automatically via apt upgrade.

Verifying cuDNN on the System

cuDNN, or CUDA Deep Neural Network library, is a crucial component for running deep learning frameworks. This GPU-accelerated library designed by NVIDIA is highly optimized for deep neural network computations and is typically used with the CUDA software stack.

The simplest way to check the cuDNN version installed on your system is through the command line. The version number is defined in a header file: cudnn_version.h for cuDNN 8 and later, or cudnn.h for older releases. Find this file and view the version information:

cat /usr/local/cuda/include/cudnn_version.h | grep CUDNN_MAJOR -A 2

Replace /usr/local/cuda with your CUDA installation path if it is installed in a non-default location, and use cudnn.h instead of cudnn_version.h for cuDNN 7 and earlier.

The command will output a few lines containing the cuDNN version information:

#define CUDNN_MAJOR 8

#define CUDNN_MINOR 0

#define CUDNN_PATCHLEVEL 4
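
For scripted checks, the same three #define lines can be read from Python. This is a sketch under the assumption that the headers live in /usr/local/cuda/include. It tries cudnn_version.h first (used by cuDNN 8 and later) and then cudnn.h. The function names are invented for this example.

```python
import re
from pathlib import Path


def parse_cudnn_version(header_text: str):
    """Build 'major.minor.patch' from the #define lines of a cuDNN header."""
    parts = []
    for name in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        match = re.search(rf"#define\s+{name}\s+(\d+)", header_text)
        if match is None:
            return None
        parts.append(match.group(1))
    return ".".join(parts)


def find_cudnn_version(include_dir="/usr/local/cuda/include"):
    """Check cudnn_version.h first (cuDNN 8+), then cudnn.h (older releases)."""
    for filename in ("cudnn_version.h", "cudnn.h"):
        header = Path(include_dir) / filename
        if header.is_file():
            version = parse_cudnn_version(header.read_text(errors="ignore"))
            if version:
                return version
    return None


# The three lines from the example output above.
sample = "#define CUDNN_MAJOR 8\n#define CUDNN_MINOR 0\n#define CUDNN_PATCHLEVEL 4\n"
print(parse_cudnn_version(sample))
```

For the sample lines this prints 8.0.4. Call `find_cudnn_version()` to inspect the headers on your own system.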

It’s also essential to ensure your cuDNN version is compatible with your CUDA version. You can reference compatibility by checking the cuDNN Support Matrix here.

If you do not have cuDNN installed on your system, click here for installation steps.

If your system has passed all the checks above, the problem most likely lies with the variant of the prebuilt framework binary you have installed. The checks below ensure your framework matches your system’s installed libraries.

Checking Framework for CUDA support

Prebuilt binaries are compiled and linked to specific CUDA versions and libraries. Therefore, your system must have those available for that particular framework’s version. For convenience:

  • For Tensorflow: https://www.tensorflow.org/install/source#gpu

    • Note: Tensorflow requires the installation of cuDNN and CUDA (TensorRT is optional)
  • For PyTorch: https://pytorch.org/get-started/locally/

    • Older versions can be found here.

    • Note: You may notice that PyTorch installations always have an additional package installed that contains the specific CUDA library that the PyTorch version was built with. PyTorch will always opt to use this version over the default CUDA version on your system. Just make sure your driver supports the CUDA version!

Torch

To check if your Torch installation has CUDA support, you can run the following command in a Python shell:

import torch

print(torch.cuda.is_available())

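If the output of this command is True, then Torch has been compiled with CUDA support and can see a GPU, so you should be able to use it. If the output is False, either Torch was built without CUDA support or it cannot reach a working GPU and driver. Printing torch.version.cuda tells the two cases apart: it is None for a CPU-only build. For a CPU-only build, you will need to either install the correct binary with the correct CUDA version or recompile Torch with your current CUDA version.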
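
A slightly fuller diagnostic combines torch.cuda.is_available() with torch.version.cuda, which reports the CUDA version the binary was built against. The `diagnose` helper and its messages are illustrative. Only the two torch attributes are part of PyTorch, and the import is guarded so the script still runs where PyTorch is absent.

```python
import importlib.util


def diagnose(built_cuda_version, cuda_available: bool) -> str:
    """Interpret torch.version.cuda and torch.cuda.is_available() together."""
    if built_cuda_version is None:
        return "CPU-only build: install a CUDA-enabled PyTorch binary"
    if not cuda_available:
        return (f"Built for CUDA {built_cuda_version}, but no usable GPU was found: "
                "check the driver with nvidia-smi")
    return f"CUDA {built_cuda_version} build with a usable GPU"


if importlib.util.find_spec("torch") is not None:
    import torch
    # torch.version.cuda is None on CPU-only builds.
    print(diagnose(torch.version.cuda, torch.cuda.is_available()))
else:
    print("PyTorch is not installed in this environment")
```

The first message corresponds to the AssertionError discussed in this article. The second points back to the driver and CUDA checks in the earlier sections.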

TensorFlow

Just as with PyTorch, you can also check CUDA’s availability on your system using TensorFlow.

To verify, type this into a Python shell:

import tensorflow as tf

print("CUDA Available: ", tf.test.is_gpu_available(cuda_only=False, min_cuda_compute_capability=None))

If your system is correctly set up with CUDA, this command should return “CUDA Available: True.” If not, it will return “CUDA Available: False,” indicating that TensorFlow does not have access to CUDA on your system.

Note: tf.test.is_gpu_available() is deprecated and will be removed in a future version. Use tf.config.list_physical_devices('GPU') instead.
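
A minimal sketch using the non-deprecated calls might look like the following. `summarize_gpus` is a helper invented for this example, while tf.test.is_built_with_cuda() and tf.config.list_physical_devices() are TensorFlow functions. The import is guarded so the script still runs where TensorFlow is not installed.

```python
import importlib.util


def summarize_gpus(built_with_cuda: bool, gpu_names) -> str:
    """Turn TensorFlow's build flag and visible GPU list into a one-line status."""
    if not built_with_cuda:
        return "TensorFlow build has no CUDA support"
    if not gpu_names:
        return "CUDA build, but no GPU is visible"
    return f"CUDA build with {len(gpu_names)} GPU(s): {', '.join(gpu_names)}"


if importlib.util.find_spec("tensorflow") is not None:
    import tensorflow as tf
    gpus = [device.name for device in tf.config.list_physical_devices("GPU")]
    print(summarize_gpus(tf.test.is_built_with_cuda(), gpus))
else:
    print("TensorFlow is not installed in this environment")
```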

Installing the Correct Package Variant

If you have a previous installation of the deep learning framework you wish to install, make sure you pip uninstall the package. To ensure you are installing the correct prebuilt binary for your package, adhere to the following pip commands:

  • For Tensorflow: https://www.tensorflow.org/install/pip#package_location

    • Select the correct .whl file that describes your system. Right-click on the link of the .whl and use it in your pip install command.

    • e.g., pip install https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow-2.13.0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl

  • For PyTorch: https://pytorch.org/get-started/locally/

    • Older versions can be found here.

    • Use the commands listed on the page, depending on your environment.

After you have installed the correct variant with CUDA support, you should be able to use your GPU without encountering the “AssertionError: Torch not compiled with CUDA enabled” error.
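
One quick way to see which variant ended up installed is the local tag that many prebuilt PyTorch wheels carry in their version string, such as +cu118 or +cpu. The sketch below splits that tag off. The version strings are illustrative examples and the helper names are invented. Not every build carries a tag, so a missing tag is inconclusive and torch.version.cuda remains the more reliable check.

```python
def wheel_variant(version: str):
    """Return the local build tag of a PyTorch version string, if it has one.

    '2.1.0+cu118' -> 'cu118', '2.1.0+cpu' -> 'cpu', '2.1.0' -> None.
    """
    _, sep, local = version.partition("+")
    return local if sep else None


def is_cpu_only(version: str) -> bool:
    """True only when the version string is explicitly tagged as a CPU build."""
    return wheel_variant(version) == "cpu"


# Illustrative version strings; on a real system pass torch.__version__.
for example in ("2.1.0+cu118", "2.1.0+cpu", "2.1.0"):
    print(example, wheel_variant(example), is_cpu_only(example))
```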

Conclusion

In this article, we’ve covered the crucial role CUDA plays in deep learning, how to verify CUDA support in Torch, and how to fix the “AssertionError: Torch not compiled with CUDA enabled” error.

Compiling libraries like Torch or TensorFlow from source can provide several tangible benefits. It allows you to potentially enhance performance by creating binaries optimized for your specific hardware configuration. Additionally, it enables using the latest, possibly unreleased, features and improvements directly from the library’s repository. Furthermore, it offers the flexibility to customize the library in terms of features, functionality, and links to alternative library versions like CUDA, cuDNN, and much more.

Admittedly, building from source is more complex and time-consuming than installing pre-compiled binaries. Still, the granular control and potential performance gains can often outweigh these challenges, especially in a research or production environment.

By following the outlined steps, you’re strengthening your foundation to unlock the true potential of GPUs for deep learning. The ability to train deep learning models with CUDA support can significantly elevate computation speed compared to a CPU alone. This acceleration enables quicker iterations and experimentations, leading to more efficient and effective deep learning model development. Embracing these practices could be transformational in your deep learning journey.

Additional Resources

