How to Set Environment Variables in Jupyter Notebooks: A Guide for Data Scientists

As data scientists, we're all familiar with the power and flexibility offered by Jupyter Notebooks. They provide an interactive environment that allows us to develop, document, and execute code, visualize data, and share our work with others. However, there may be times when you need to set environment variables in your Jupyter Notebook to access external resources, manage configuration settings, or store sensitive information securely.

In this blog post, we’ll explore several ways to set environment variables in Jupyter Notebooks, ensuring your projects run smoothly and securely.

Table of Contents

  1. Understanding Environment Variables
  2. Setting Environment Variables within the Notebook
  3. Using a Configuration File
  4. Running Jupyter Notebooks with a Custom Environment
  5. Storing Sensitive Information Securely
  6. Conclusion

Understanding Environment Variables

Environment variables are key-value pairs that the operating system maintains for each running process; a new process normally inherits a copy of the environment of the process that started it. They are commonly used to store configuration information, such as API keys, file paths, or database credentials, and can be accessed programmatically by applications. In the context of Jupyter Notebooks, environment variables can be used to store information that your code needs to run correctly or to access external resources.
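
As a minimal sketch of these ideas, the snippet below reads a variable with a fallback value and shows a child process inheriting the parent's environment. The names DEMO_GREETING and DEMO_NOT_SET are made-up placeholders, not variables that any tool expects.

```python
import os
import subprocess
import sys

# DEMO_GREETING is a hypothetical variable name used only for illustration.
os.environ["DEMO_GREETING"] = "hello"

# .get() returns a fallback instead of raising KeyError when a variable is unset.
missing = os.environ.get("DEMO_NOT_SET", "fallback-value")

# A child process started from this one receives a copy of its environment.
child_output = subprocess.run(
    [sys.executable, "-c", "import os; print(os.environ['DEMO_GREETING'])"],
    capture_output=True,
    text=True,
    check=True,
).stdout.strip()

print(missing)
print(child_output)
```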

Setting Environment Variables within the Notebook

One of the simplest ways to set environment variables in a Jupyter Notebook is to use Python’s os module. The os.environ object is a dictionary-like mapping of environment variable names to their values. You can add, modify, or remove environment variables directly using this object.

Here’s an example of how to set an environment variable within a Jupyter Notebook:

import os

os.environ['MY_API_KEY'] = 'your-api-key-here'

# Access the environment variable later in your code
api_key = os.environ['MY_API_KEY']

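Keep in mind that environment variables set this way exist only in the current kernel process. If you restart the kernel or open a new Notebook (which runs its own kernel), the variables you set previously will be lost and the cell has to be run again. Note also that a value typed directly into a cell, as in the example above, is saved with the notebook file, so avoid hard-coding real secrets this way.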
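
Since os.environ behaves like a dictionary, modifying and removing variables works the same way as adding them. The following is a small sketch; DEMO_SETTING is a hypothetical name chosen for illustration.

```python
import os

# DEMO_SETTING is a hypothetical name used only for illustration.
os.environ["DEMO_SETTING"] = "first"   # add
os.environ["DEMO_SETTING"] = "second"  # modify (values must be strings)
current = os.environ["DEMO_SETTING"]

# pop() removes the variable; the default avoids a KeyError if it is already gone.
removed = os.environ.pop("DEMO_SETTING", None)
still_set = "DEMO_SETTING" in os.environ

print(current, removed, still_set)
```

Inside a notebook cell, IPython's %env magic (for example, %env DEMO_SETTING=second) is another way to set a variable for the running kernel.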

Using a Configuration File

A more organized approach to managing environment variables is to use a configuration file. This file can store all your environment variables in one place, making it easy to manage and maintain.

One popular format for configuration files is the .env file, which stores environment variables as key-value pairs. Here’s an example of a .env file:

DATABASE_URL=postgres://user:password@localhost/db_name
API_KEY=my-api-key

To load environment variables from a .env file, you can use the python-dotenv library, which can be installed using pip:

pip install python-dotenv

Then, in your Jupyter Notebook, you can load the environment variables like this:

from dotenv import load_dotenv
import os

load_dotenv()

# Access the environment variables from the .env file
database_url = os.environ.get('DATABASE_URL')
api_key = os.environ.get('API_KEY')

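Because the values live in the .env file rather than in any single notebook, every notebook that calls load_dotenv() from the same project directory can use them. This approach keeps your environment variables organized and separate from your code, making it easy to share your code without exposing sensitive information, provided the .env file itself is kept out of version control (for example, by listing it in .gitignore).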
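
To make the mechanism less opaque, here is a simplified, standard-library-only sketch of what loading a .env file involves. It is a stand-in written for illustration, not python-dotenv's actual implementation, and it ignores features such as multi-line values and variable expansion. The helper name load_env_file and the variable DEMO_DB_URL are assumptions made for this example.

```python
import os
import tempfile
from pathlib import Path


def load_env_file(path):
    """Simplified stand-in for a .env loader; not python-dotenv's implementation."""
    loaded = {}
    for raw_line in Path(path).read_text().splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and anything that is not KEY=VALUE.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key, value = key.strip(), value.strip().strip("'\"")
        # setdefault keeps a value that is already set in the environment.
        os.environ.setdefault(key, value)
        loaded[key] = os.environ[key]
    return loaded


# Write a throwaway .env file so the example runs on its own.
with tempfile.TemporaryDirectory() as tmp_dir:
    env_path = Path(tmp_dir) / ".env"
    env_path.write_text(
        "# demo file\nDEMO_DB_URL=postgres://user:password@localhost/db_name\n"
    )
    loaded = load_env_file(env_path)

print(loaded)
```

Using setdefault mirrors the default behavior of load_dotenv(), which leaves variables that are already set untouched unless you pass override=True.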

Running Jupyter Notebooks with a Custom Environment

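Another way to set environment variables for your Jupyter Notebook is to start the Notebook server from a shell in which they are already defined. The virtual environment isolates your project's packages; the variables themselves come from the shell session. You can create a virtual environment, activate it, and then export the required environment variables in that same shell before running the Jupyter Notebook server.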

Here’s an example of how to create a virtual environment and set environment variables using the virtualenv package:

pip install virtualenv
virtualenv my_project_env
source my_project_env/bin/activate

export MY_API_KEY='your-api-key-here'

# Run Jupyter Notebook with the custom environment
jupyter notebook

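Because the Notebook server and the kernels it starts inherit the environment of the shell that launched them, every variable exported before the jupyter notebook command will be available to your Jupyter Notebooks. The export lasts only for that shell session, so it has to be repeated (or added to a startup script) when you open a new terminal.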
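
On the notebook side, it can help to check early that the expected variable actually arrived, rather than failing later with a confusing error. The helper below is one possible sketch; require_env is a hypothetical name, not part of Jupyter or the standard library.

```python
import os


def require_env(name):
    """Return the value of an environment variable or raise a clear error."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(
            f"{name} is not set; export it in the shell before starting Jupyter."
        )
    return value


# MY_API_KEY matches the export shown earlier; setdefault only supplies a
# placeholder so this sketch also runs where nothing was exported.
os.environ.setdefault("MY_API_KEY", "your-api-key-here")
api_key = require_env("MY_API_KEY")
```

Calling require_env at the top of a notebook turns a missing export into an immediate, readable failure.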

Storing Sensitive Information Securely

If you need to store sensitive information like API keys or database credentials, it’s crucial to keep them secure. Instead of storing sensitive information in plain text in environment variables or configuration files, consider using a secrets management tool like Vault by HashiCorp or AWS Secrets Manager.

These tools store sensitive information securely and provide access control, audit logging, and other security features. You can then programmatically retrieve the secrets in your Jupyter Notebook using the tool’s API.
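
As a rough sketch of what programmatic retrieval can look like, the example below wraps the get_secret_value call of an AWS Secrets Manager client. It assumes the secret is stored as a JSON string, and it uses an in-memory fake client so it runs without AWS credentials. The names fetch_secret and FakeSecretsClient, the secret ID "demo/notebook", and the key API_KEY are all hypothetical.

```python
import json


def fetch_secret(client, secret_id):
    """Fetch a JSON-encoded secret through a Secrets Manager-style client."""
    response = client.get_secret_value(SecretId=secret_id)
    return json.loads(response["SecretString"])


class FakeSecretsClient:
    """In-memory stand-in so the sketch runs without AWS credentials."""

    def __init__(self, secrets):
        self._secrets = secrets

    def get_secret_value(self, SecretId):
        return {"SecretString": json.dumps(self._secrets[SecretId])}


# "demo/notebook" and the key name are hypothetical.
fake_client = FakeSecretsClient({"demo/notebook": {"API_KEY": "not-a-real-key"}})
secret = fetch_secret(fake_client, "demo/notebook")

# With real credentials, the client would instead come from boto3:
#   import boto3
#   client = boto3.client("secretsmanager")
#   secret = fetch_secret(client, "demo/notebook")
```

Passing the client in as an argument keeps the notebook code the same whether it talks to the real service or to a test double.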

Conclusion

In this blog post, we’ve explored several ways to set environment variables in Jupyter Notebooks, including using the os module, loading variables from a configuration file, running the Notebook server with a custom environment, and securely storing sensitive information using secrets management tools. By properly managing your environment variables, you can ensure your projects run smoothly and securely, allowing you to focus on developing and sharing your data science work with confidence.

