Release 2025.06.01

Saturn Cloud release notes for 2025.06.01

Multi-Cluster Support

  • Added Open Cluster Management (OCM) integration for deploying workloads across multiple Kubernetes clusters
  • Hub cluster coordinates spoke clusters via ManifestWork resources, with workload status aggregated back to the hub
  • Internal services (Pandora, Prometheus, Elasticsearch) can now communicate across clusters over a dedicated Traefik entrypoint with mTLS
  • Docker registry secret management decoupled from the application into a standalone cronjob, so spoke clusters no longer need a full application deployment to refresh image pull credentials
  • Workspaces can be pinned to specific clusters via a new cluster column

New Cloud Providers

  • Added support for Crusoe, Vultr, TensorWave, and k0rdent as cloud providers
  • Added GCP installer with GKE cluster provisioning, GCS object storage, and GCP node management via Terraform
  • Added OCI (Oracle Cloud Infrastructure) installer
  • Added Nebius installer with instance types for CPU and H100 GPU configurations
  • S3 bucket is now optional to support providers without S3-compatible object storage
  • Config loading supports gcs:// URLs in addition to s3://

AMD GPU Support

  • Added AMD as a hardware type alongside CPU and NVIDIA
  • GPU tolerations, labels, and scheduling are now hardware-type-aware
  • Added ROCm-based images for AMD GPU workloads

White Labeling

  • The Saturn Cloud UI can now be rebranded for partner deployments
  • Configurable brand name, logos, favicon, primary color, support email, and documentation URLs
  • Admin CLI script for configuration (setup_whitelabel.py)
  • Backend always returns white label config with Saturn Cloud defaults when not customized

Dashboard Redesign

  • New dashboard page set as the default landing page
  • Quick action items and resource templates displayed on dashboard
  • Resource-specific navigation links from dashboard
  • Updated quickstart icons and sidebar layout

Frontend Rewrite

  • Frontend migrated from Vue 2 (vue-cli/webpack) to Vue 3 with Nuxt
  • Net reduction of ~22,000 lines, indicating a cleaner architecture

Resource Tags

  • Resources (workspaces, deployments) can now be tagged with arbitrary key-value metadata
  • Stored as JSONB, useful for organization and cost allocation (e.g., {"environment": "production", "project": "alpha"})

Idle Detection Overhaul

  • Workspace auto-shutdown now uses Prometheus network activity metrics instead of JupyterHub’s internal API
  • Checks Traefik open connections, Traefik request rate, and SSH connections to determine activity
  • New Prometheus recording rules aggregate HTTP and SSH activity per resource
  • Traefik and SSH proxy (sshpiper) now expose Prometheus metrics
  • Dry-run mode added for testing idle detection configuration

Usage Tracking

  • New hourly usage records table for per-hour usage tracking by subscription, user, and org
  • Admin API endpoints for submitting and querying usage records with dollar amount breakdowns
  • Usage limits card only shown when there is an hourly cap

Organization Lifecycle

  • Organizations can be locked with a reason (admin action or no payment method)
  • Org cleanup automation: configurable deletion schedules for trial expiry (7 days), no payment (30 days), and abuse (2 days)
  • Cleanup runs as a daily cronjob with dry-run support and notification scheduling
  • Free trial tracking via user metadata (IP, email domain, country, LinkedIn, job title)
  • Trusted org flag added to usage limits, propagated to K8s labels for operator enforcement

Instance Type System

  • Instance sizes now include gpu_type, hardware_type, cloud, price_per_hour, and description fields
  • Operators can define custom instance sizes in installation config that get merged at install time
  • GCP-specific ASG config supports GKE reservation affinity

Authentication

  • Auth tokens can now be read from Kubernetes-mounted secret files, with automatic re-read on rotation
  • Multi-cluster auth routing checks all cluster domains for resource access redirects
  • Auth0 signup support on enterprise deployments
  • Fixed password reset redirect to honor the next URL parameter
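
The token-file item above, shown as an added example (not Saturn Cloud's code): a reader that picks up a rotated Kubernetes-mounted secret without a restart. The names RotatingTokenFile and demo_rotation are hypothetical, and the demo imitates rotation with a symlink swap in a temporary directory.

```python
import os
import tempfile
import threading


class RotatingTokenFile:
    """Serve a token from a mounted secret file, re-reading it when it rotates."""

    def __init__(self, path):
        self._path = path
        self._lock = threading.Lock()
        self._stamp = None   # (inode, mtime_ns, size) of the file last read
        self._token = None

    def get(self):
        # kubelet rotates a secret volume by atomically swapping the "..data"
        # symlink, so stat the resolved target on every call instead of
        # caching the contents or an open file descriptor from startup.
        st = os.stat(self._path)  # follows symlinks
        stamp = (st.st_ino, st.st_mtime_ns, st.st_size)
        with self._lock:
            if stamp != self._stamp:
                with open(self._path, "r", encoding="utf-8") as fh:
                    token = fh.read().strip()
                if not token:
                    # Fail closed: never authenticate with an empty token.
                    raise ValueError(f"empty token in {self._path}")
                self._token, self._stamp = token, stamp
            return self._token


def demo_rotation():
    """Constructed demo: imitate a kubelet-style symlink swap in a temp dir."""
    with tempfile.TemporaryDirectory() as d:
        link = os.path.join(d, "token")
        for name, value in (("v1", "token-old\n"), ("v2", "token-new\n")):
            with open(os.path.join(d, name), "w", encoding="utf-8") as fh:
                fh.write(value)
        os.symlink(os.path.join(d, "v1"), link)
        reader = RotatingTokenFile(link)
        before = reader.get()
        tmp = link + ".tmp"
        os.symlink(os.path.join(d, "v2"), tmp)
        os.replace(tmp, link)  # atomic swap, as a rotation would do
        return before, reader.get()
```

Secrets mounted with subPath are not updated by the kubelet, so a reader like this only sees rotation on a whole-volume mount.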

cert-manager Integration

  • cert-manager and Let’s Encrypt integration added as optional apps
  • httpreq-webhook support for DNS-01 challenges
  • Runs on all cloud providers, with EKS-specific compatibility settings

Bug Fixes

  • Fixed git repo URL validation to handle URLs without .git suffix and Azure DevOps SSH URLs
  • Fixed org management endpoint to require admin permissions for PATCH operations
  • Fixed wrong owner displayed when editing secrets
  • Cloud selection is now locked after resource creation, preventing accidental changes
  • Fixed progress bar overflow on resource cards

Infrastructure

  • Removed the deprecated enforce_min_TLS12 and enable_LB_access_logs config options
  • Load balancer extra security groups moved to the cloud-specific config section
  • Installer container image now includes Google Cloud SDK, Nebius CLI, OCI CLI, and clusteradm
  • Configurable storage classes for Elasticsearch and Prometheus (replacing hardcoded values)
  • Network policies can be toggled off via disableNetworkPolicy flag