Frequently Asked Questions


Does Dask use multiple processors?

Yes. Dask can use multiple processes on the same machine to speed up computations, and it can also have multiple machines work together concurrently (often both!). Which of these you get depends on the Dask scheduler you use. Check out our [Dask cheatsheet](https://saturncloud.io/get-content/dask-cheatsheet/) for more details.
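As a rough illustration of how the scheduler choice works, here is a minimal sketch. The helper names `square` and `total_of_squares` are made up for this example; the `scheduler=` keyword on `compute()` is what selects how the work runs.

```python
from dask import delayed


def square(x):
    # Hypothetical toy task standing in for real work.
    return x * x


def total_of_squares(n, scheduler="threads"):
    # Build a lazy task graph; nothing runs until compute() is called.
    tasks = [delayed(square)(i) for i in range(n)]
    total = delayed(sum)(tasks)
    # The same graph can be handed to different schedulers.
    return total.compute(scheduler=scheduler)


# Thread pool inside a single process.
threaded = total_of_squares(8, scheduler="threads")
# Single-threaded execution, handy for debugging.
serial = total_of_squares(8, scheduler="synchronous")
```

Passing `scheduler="processes"` instead runs the same graph in a pool of local processes (on some platforms this needs an `if __name__ == "__main__":` guard), and creating a `dask.distributed.Client` connected to a cluster sends the work to multiple machines.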

Is Dask better than pandas?

Dask and pandas have different uses. pandas is great for doing data science and machine learning with data and models that fit onto a single machine. Dask scales out to work across many machines at the same time, with code that looks nearly identical to pandas. Dask is often the better choice when your data or models are too large for just one machine.

Is Dask free?

Yes. Dask itself is an open-source library, and it is free for anyone to use on Saturn Cloud. You can use Dask on Saturn Cloud directly, or connect to Saturn Cloud from AWS SageMaker, Microsoft Azure, and Google Colab (GCP). The free tier also includes cloud-hosted Jupyter notebooks and GPUs. Join thousands of data scientists using Saturn Cloud to parallelize code with Dask and across GPUs. Sign up to begin using Dask in the cloud.

What is a Dask DataFrame?

A Dask DataFrame is a data type from Dask that mimics a pandas DataFrame. The two have nearly identical syntax, but a Dask DataFrame can have its computations parallelized across many Dask workers in a cluster. To the end user it looks just like a traditional pandas DataFrame, except that on large data it can run much faster. See our [Dask cheatsheet](https://saturncloud.io/get-content/dask-cheatsheet/) for more details.

What is Dask good for?

Dask is great for doing data science, machine learning, and analytics on large data sets that don’t fit into memory on a single computer or that would benefit from running in parallel. See our [Dask page](https://saturncloud.io/dask/about/) for more details.

Who uses Dask?

Dask is used by data scientists, machine learning engineers, high-performance computing experts, and anyone else who might benefit from running Python across multiple machines concurrently.