How to Read Data from Google Sheets using Colaboratory Google
As a software engineer, you know how important it is to have access to data for your projects. Google Sheets is a popular online spreadsheet platform that many people use to store and share data. In this blog post, we’ll show you how to read data from Google Sheets using Colaboratory, a free Jupyter notebook environment from Google.
Table of Contents
- Step 1: Open Colab
- Step 2: Mount Google Drive
- Step 3: Read Data from Google Sheet
- Step 4: Analyze the Data
- Alternative Approach: Using gspread with Colab
Colaboratory, or Colab for short, is an online platform provided by Google that allows you to write and run Python code in a web browser. One of the key benefits of Colab is that it integrates with Google Drive, allowing you to easily access and manipulate data stored in your Drive.
In this tutorial, we’ll show you how to read data from a Google Sheet stored in Google Drive using Colab. We’ll assume that you already have a Google account and that you have some data stored in a Google Sheet.
Step 1: Open Colab
To get started, open Colab in your web browser. You can do this by going to https://colab.research.google.com/ and signing in with your Google account.
Once you’re signed in, you’ll see the Colab home page. From here, you can create a new notebook or open an existing one.
Step 2: Mount Google Drive
Before we can read data from Google Drive, we need to mount our Drive in Colab. To do this, run the following code:
from google.colab import drive
This code will prompt you to authorize Colab to access your Google Drive. Follow the instructions to grant access, and then enter the authorization code that is provided.
Once you’ve done this, your Google Drive will be mounted in Colab and you’ll be able to access your files.
Step 3: Read Data from Google Sheet
Now that we’ve mounted our Google Drive, we can read data from a Google Sheet. To do this, we’ll use the
pandas library, which is a popular Python library for data manipulation.
First, let’s import the
import pandas as pd
Next, we’ll use the
read_csv function from
pandas to read the data from our Google Sheet. The
read_csv function is used to read data from a CSV file, but it can also be used to read data from a Google Sheet if we provide the URL of the sheet.
To get the URL of our Google Sheet, we’ll need to open the sheet in Google Drive and copy the URL from the address bar. The URL should look something like this:
ABC123XYZ456 with the ID of your Google Sheet.
Now, let’s use the
read_csv function to read the data from our Google Sheet:
url = 'https://docs.google.com/spreadsheets/d/ABC123XYZ456/export?format=csv'
df = pd.read_csv(url)
In this code, we’ve provided the URL of our Google Sheet to the
read_csv function. We’ve also specified the
format parameter as
csv, which tells Google Drive to export the sheet as a CSV file.
read_csv function returns a DataFrame object, which is a two-dimensional tabular data structure with labeled axes. We’ve assigned this object to the variable
Step 4: Analyze the Data
Now that we’ve read our data into a DataFrame, we can analyze it using the tools provided by
For example, we can use the
head function to display the first few rows of our data:
We can also use the
describe function to get a summary of our data:
These are just a few examples of the many ways that you can analyze your data using
Alternative Approach: Using gspread with Colab
While the previous method demonstrated the use of pandas for data retrieval from Google Drive in Colab, an alternative approach involves leveraging the gspread library. This library provides direct integration with Google Sheets, offering a streamlined process for accessing data stored in Google Drive.
# Install the required libraries
!pip install gspread google-auth
# Import libraries
from google.colab import auth
from oauth2client.client import GoogleCredentials
from google.auth import default
# Authenticate and create a client
creds, _ = default()
gc = gspread.authorize(creds)
# Open the Google Sheets file (replace 'your_spreadsheet_name' with your file's name)
spreadsheet = gc.open("your_spreadsheet_name").sheet1
# Now you can work with the 'worksheet' object, for example, fetching the data
data = spreadsheet.get_all_values()
- Install Libraries: Begin by installing the necessary libraries, gspread, and google-auth.
- Import Libraries: Import the required libraries, including gspread for Google Sheets integration.
- Authenticate and Create a Client: Use the authentication code to authenticate the user and create a gspread client for accessing Google Sheets.
- Open Google Sheets File: Replace ‘your_spreadsheet_name’ with the actual name of your Google Sheets file. This step opens the specified sheet and allows you to work with its data.
- Work with Worksheet Object: Once the sheet is open, you can perform various operations on the ‘worksheet’ object. In the provided example, the
get_all_values()method is used to fetch all the data from the sheet.
This alternative method offers a more direct and specialized approach for working with Google Sheets, providing a viable option based on the specific needs of your project.
In this blog post, we’ve shown you how to read data from Google Sheets using Colaboratory. By following these steps, you should now be able to access and manipulate data stored in your Google Sheets.
Colaboratory is a powerful tool for data analysis and machine learning, and its integration with Google Drive makes it even more useful.
About Saturn Cloud
Saturn Cloud is your all-in-one solution for data science & ML development, deployment, and data pipelines in the cloud. Spin up a notebook with 4TB of RAM, add a GPU, connect to a distributed cluster of workers, and more. Join today and get 150 hours of free compute per month.