Uploading Files to S3 via cURL Using Presigned URLs: A Guide

Data scientists often need to upload files to Amazon S3 for data storage and management. While there are several ways to accomplish this, one efficient method is using cURL with presigned URLs. This blog post will guide you through the process, step by step.

Table of Contents

  1. What is a Presigned URL?
  2. Why Use cURL?
  3. Prerequisites
  4. Step-by-Step
  5. Common Errors and Troubleshooting
  6. Conclusion

What is a Presigned URL?

A presigned URL is a URL that you generate to grant temporary access to a specific object in your S3 bucket. It lets someone upload or download that object without having AWS security credentials of their own. The URL is generated using your security credentials and includes a signature that authenticates the request until the URL expires.
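
To make the idea concrete, here is a minimal sketch that pulls apart the query string of a Signature Version 4 presigned URL. The URL below is made up for illustration (the bucket, access key ID, and signature are not real), and the helper name describe_presigned_url is hypothetical.

```python
from urllib.parse import urlsplit, parse_qs

# Illustrative URL only: the bucket, key, credential, and signature are made up.
SAMPLE_URL = (
    "https://example-bucket.s3.amazonaws.com/test.txt"
    "?X-Amz-Algorithm=AWS4-HMAC-SHA256"
    "&X-Amz-Credential=AKIAEXAMPLE%2F20231222%2Fus-east-1%2Fs3%2Faws4_request"
    "&X-Amz-Date=20231222T223000Z"
    "&X-Amz-Expires=3600"
    "&X-Amz-SignedHeaders=host"
    "&X-Amz-Signature=0123abcd"
)


def describe_presigned_url(url):
    """Return the host, object key, and signing parameters found in the URL."""
    parts = urlsplit(url)
    # parse_qs returns a list per parameter; each one appears once here.
    params = {name: values[0] for name, values in parse_qs(parts.query).items()}
    return {
        "host": parts.netloc,
        # Assumes a virtual-hosted-style URL, where the path is the object key.
        "key": parts.path.lstrip("/"),
        "algorithm": params.get("X-Amz-Algorithm"),
        "credential_scope": params.get("X-Amz-Credential"),
        "signed_at": params.get("X-Amz-Date"),
        "expires_in_seconds": int(params["X-Amz-Expires"]),
        "signature": params.get("X-Amz-Signature"),
    }


info = describe_presigned_url(SAMPLE_URL)
for name, value in info.items():
    print(f"{name}: {value}")
```

Note that the secret access key never appears in the URL: only the access key ID, the signing time, the lifetime, and the resulting signature do.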

Why Use cURL?

cURL is a command-line tool for transferring data over a wide range of protocols, including HTTP, HTTPS, FTP, and SFTP. It is well suited to automating file uploads in scripts and to restricted environments where a full AWS SDK might not be available.

Prerequisites

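Before we start, ensure you have the following: an AWS account with an S3 bucket and credentials that are allowed to write to it, Python with the boto3 package installed (to generate the URL), cURL (to upload the file), and the AWS CLI (to verify the upload).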

Step-by-Step

Step 1: Generate a Presigned URL

Open a new text file, copy in the following code, and save it as presign.py:

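import boto3
from botocore.config import Config

# Create an S3 client that signs requests with Signature Version 4.
# Replace the placeholder credentials, or omit both credential arguments
# to let boto3 use its default credential chain.
s3 = boto3.client(
    's3',
    aws_access_key_id='your-key-id',
    aws_secret_access_key='your-secret-access-key',
    config=Config(signature_version='s3v4'),
)

bucket = input("Enter your Bucket Name: ")
key = input("Enter your desired filename/key for this upload: ")

print("Generating pre-signed url...")

# The URL permits a PUT to this bucket/key and expires after 3600 seconds (1 hour).
print(s3.generate_presigned_url('put_object', Params={'Bucket': bucket, 'Key': key}, ExpiresIn=3600, HttpMethod='PUT'))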
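
If you want to reuse this logic outside an interactive script, the same call can be wrapped in a small function. The following is only a sketch: make_upload_url and MAX_EXPIRES_SECONDS are hypothetical names, and the function expects you to pass in an S3 client that you have already created with boto3. The upper bound reflects the seven-day maximum lifetime of Signature Version 4 presigned URLs.

```python
# Signature Version 4 presigned URLs can be valid for at most seven days.
MAX_EXPIRES_SECONDS = 7 * 24 * 3600


def make_upload_url(s3_client, bucket, key, expires_in=3600):
    """Return a presigned PUT URL for bucket/key, valid for expires_in seconds.

    s3_client is expected to be a boto3 S3 client (an assumption of this sketch).
    """
    if not bucket or not key:
        raise ValueError("bucket and key must be non-empty")
    if not 1 <= expires_in <= MAX_EXPIRES_SECONDS:
        raise ValueError(
            f"expires_in must be between 1 and {MAX_EXPIRES_SECONDS} seconds"
        )
    return s3_client.generate_presigned_url(
        "put_object",
        Params={"Bucket": bucket, "Key": key},
        ExpiresIn=expires_in,
        HttpMethod="PUT",
    )


# Example usage (requires boto3 and configured AWS credentials):
#   import boto3
#   from botocore.config import Config
#   s3 = boto3.client("s3", config=Config(signature_version="s3v4"))
#   print(make_upload_url(s3, "your-bucket-name", "test.txt"))
```

Passing the client in as an argument keeps credential handling in one place and makes the function easy to exercise with a stand-in client.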

Run the script and enter your bucket name and the desired filename/key when prompted:

python presign.py

Output: the script prints the generated presigned URL to the terminal.

Step 2: Upload File Using cURL

After generating the presigned URL, you can use cURL to upload a file. Here’s the command:

curl --request PUT --upload-file test.txt "your-presigned-url"

Replace "your-presigned-url" with the presigned URL you generated in the previous step. Keep the quotes around it: the URL contains & characters, which the shell would otherwise interpret.
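
For environments where cURL itself is unavailable, the same request can be made from Python's standard library. This is a sketch of what the cURL command does (a single HTTP PUT with the file as the request body), not a replacement for it; split_target and put_file are hypothetical helper names.

```python
import http.client
import os
from urllib.parse import urlsplit


def split_target(presigned_url):
    """Split a URL into (scheme, host[:port], path-plus-query)."""
    parts = urlsplit(presigned_url)
    target = parts.path or "/"
    if parts.query:
        # The query string carries the signature, so it must be sent unchanged.
        target += "?" + parts.query
    return parts.scheme, parts.netloc, target


def put_file(presigned_url, path):
    """Upload the file at path with one HTTP PUT and return the status code."""
    scheme, netloc, target = split_target(presigned_url)
    if scheme == "https":
        conn = http.client.HTTPSConnection(netloc)
    else:
        conn = http.client.HTTPConnection(netloc)
    try:
        with open(path, "rb") as f:
            # An explicit Content-Length lets http.client stream the file
            # instead of falling back to chunked transfer encoding.
            headers = {"Content-Length": str(os.path.getsize(path))}
            conn.request("PUT", target, body=f, headers=headers)
            response = conn.getresponse()
            response.read()  # drain the body so the connection closes cleanly
            return response.status
    finally:
        conn.close()


# Example usage (needs a real presigned URL):
#   status = put_file("your-presigned-url", "test.txt")
#   print(status)  # S3 answers a successful PUT with 200
```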

Step 3: Verify the Upload

To verify the upload, list the objects in your S3 bucket with the AWS CLI (replace saturn2 with your bucket name):

aws s3 ls s3://saturn2/

You should see your uploaded file in the list.

2023-12-22 22:32:08         14 test.txt

Common Errors and Troubleshooting

Expired URLs

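Presigned URLs have a limited lifespan, set by ExpiresIn when the URL is generated (3600 seconds in the script above). S3 rejects a request made with an expired URL with an HTTP 403 response, so generate URLs shortly before use and handle expiry gracefully by generating a fresh URL and retrying.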
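
One way to avoid sending a request that is bound to fail is to check the expiry before uploading. The sketch below assumes a Signature Version 4 URL, which carries its signing time in X-Amz-Date and its lifetime in X-Amz-Expires; the function names and the 30-second safety margin are illustrative choices. If the URL was signed with temporary credentials, it can stop working earlier than this calculation suggests, when those credentials expire.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlsplit, parse_qs


def presigned_url_expiry(url):
    """Return the UTC time at which a SigV4 presigned URL stops being valid."""
    query = parse_qs(urlsplit(url).query)
    signed_at = datetime.strptime(
        query["X-Amz-Date"][0], "%Y%m%dT%H%M%SZ"
    ).replace(tzinfo=timezone.utc)
    return signed_at + timedelta(seconds=int(query["X-Amz-Expires"][0]))


def is_expired(url, now=None, margin_seconds=30):
    """Return True if the URL has expired or will within margin_seconds."""
    if now is None:
        now = datetime.now(timezone.utc)
    return now >= presigned_url_expiry(url) - timedelta(seconds=margin_seconds)


# Made-up URL for illustration: signed at 22:30:00 UTC with a one-hour lifetime.
url = (
    "https://example-bucket.s3.amazonaws.com/test.txt"
    "?X-Amz-Date=20231222T223000Z&X-Amz-Expires=3600&X-Amz-Signature=0123abcd"
)
print(presigned_url_expiry(url).isoformat())
```

A caller would typically check is_expired(url) right before uploading and request a new URL when it returns True.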

Invalid Signatures

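A signature error means the request S3 received does not match what was signed. Common causes include incorrect AWS credentials in the script that generated the URL, a URL that was altered or truncated in transit (for example, pasted into the shell without quotes), and using a different HTTP method than the one the URL was signed for.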

Permission Issues

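A presigned URL carries the permissions of the credentials that signed it, so ensure those credentials are allowed to perform the operation (s3:PutObject on the target bucket and key for uploads). Review your IAM policies if S3 returns an access-denied error.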
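
As a rough illustration, the sketch below builds a minimal IAM policy document that would cover this guide's workflow: s3:PutObject for the upload and s3:ListBucket for the aws s3 ls verification step. The function name, the Sid values, and the bucket name are placeholders, and your account may need a tighter or differently scoped policy.

```python
import json


def build_upload_policy(bucket, prefix=""):
    """Return a minimal IAM policy document for uploading to and listing a bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Lets the signing credentials write objects under the prefix.
                "Sid": "AllowUploads",
                "Effect": "Allow",
                "Action": "s3:PutObject",
                "Resource": f"arn:aws:s3:::{bucket}/{prefix}*",
            },
            {
                # Needed only for listing the bucket to verify the upload.
                "Sid": "AllowListing",
                "Effect": "Allow",
                "Action": "s3:ListBucket",
                "Resource": f"arn:aws:s3:::{bucket}",
            },
        ],
    }


policy = build_upload_policy("your-bucket-name")
print(json.dumps(policy, indent=2))
```

The printed JSON can be attached to the IAM user or role whose credentials generate the presigned URLs.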

Conclusion

Uploading files to S3 using cURL and presigned URLs is a secure and efficient method, especially when dealing with large files or automating upload tasks. It’s a valuable skill for data scientists working with AWS and large datasets.

Remember that a presigned URL is temporary and expires after the specified duration. Handle expiry in your applications, for example by generating a fresh URL, to avoid broken links or failed uploads.

