Dask enables parallel computing through task scheduling and blocked algorithms.
This allows developers to write complex parallel algorithms and execute them
either on a modern multi-core machine or on a distributed cluster.

On a single machine, dask increases the scale of comfortable data from
fits-in-memory to fits-on-disk by intelligently streaming data from disk
and by leveraging all the cores of a modern CPU.

Users interact with dask either by making graphs directly or through the dask
collections which provide larger-than-memory counterparts to existing popular
libraries:

dask.array = numpy + threading

dask.bag = map,filter,toolz + multiprocessing

dask.dataframe = pandas + threading
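The blocked-algorithm idea behind these collections can be sketched in pure
Python: split a dataset into chunks, reduce each chunk in a thread pool, then
combine the partial results. This is only an illustrative sketch of the
concept, not dask's actual implementation:

```python
from concurrent.futures import ThreadPoolExecutor

def blocked_sum(data, chunksize=4):
    """Sum a sequence by reducing fixed-size chunks in parallel threads."""
    # Break the data into blocks that each fit comfortably in memory.
    chunks = [data[i:i + chunksize] for i in range(0, len(data), chunksize)]
    # Reduce each block in a separate thread, then combine the partials.
    with ThreadPoolExecutor() as pool:
        partial_sums = list(pool.map(sum, chunks))
    return sum(partial_sums)

print(blocked_sum(range(100)))  # same result as sum(range(100)) == 4950
```

Collections like dask.array apply this pattern to many operations at once by
building a graph of such chunked tasks and handing it to a scheduler.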

Dask primarily targets parallel computations that run on a single machine. It
integrates nicely with the existing PyData ecosystem and is trivial to set up
and use.

Dask graphs encode algorithms in a simple format involving Python dicts,
tuples, and functions. This graph format can be used in isolation from the
dask collections. If you are a developer, you should start here.
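To make the format concrete, here is a hand-built graph in that dict/tuple
style, together with a toy recursive evaluator. The `get` function below is a
minimal sketch for illustration; dask's real schedulers are far more capable:

```python
from operator import add, mul

# A dask-style graph: a dict mapping keys to either literal values or task
# tuples of the form (function, arg1, arg2, ...), where args may name other
# keys in the graph.
dsk = {
    'x': 1,
    'y': 2,
    'z': (add, 'x', 'y'),   # z = x + y
    'w': (mul, 'z', 10),    # w = z * 10
}

def get(dsk, key):
    """Recursively evaluate one key of a dask-style graph (toy sketch)."""
    task = dsk[key]
    if isinstance(task, tuple) and callable(task[0]):
        func, *args = task
        # Resolve each argument: graph keys recurse, literals pass through.
        return func(*[get(dsk, a) if a in dsk else a for a in args])
    return task

print(get(dsk, 'w'))  # -> 30
```

Real schedulers walk the same structure but add parallel execution, caching,
and memory-aware task ordering on top of it.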