This document assumes that you have a Kubernetes cluster and Helm installed.

If this is not the case, then you might consider setting up a Kubernetes
cluster on one of the common cloud providers such as Google, Amazon, or
Microsoft. We recommend the first part of the documentation in the guide
Zero to JupyterHub, which focuses on Kubernetes and Helm. You do not need to
follow all of those instructions; JupyterHub is not necessary to deploy Dask.
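
With the cluster in place, a minimal installation sketch looks like the
following. This assumes the Dask chart repository at https://helm.dask.org/
and Helm 2-style auto-generated release names; with Helm 3 you would supply a
release name yourself (for example, helm install my-dask dask/dask).

```bash
# Register the Dask chart repository and refresh the local chart index
helm repo add dask https://helm.dask.org/
helm repo update

# Launch the chart; Helm assigns the release a generated name such as bald-eel
helm install dask/dask

# List the running services and their externally visible addresses
kubectl get services
```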

You can use the addresses under EXTERNAL-IP to connect to your now-running
Jupyter and Dask systems.

Notice the name bald-eel. This is the name that Helm has given to your
particular deployment of Dask. You could, for example, have multiple
Dask-and-Jupyter clusters running at once, and each would be given a different
name. You will use this name to refer to your deployment in the future. You
can list all active Helm deployments with:
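
```bash
# List all Helm releases; the NAME column shows the release name (bald-eel here)
helm list
```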

You can navigate to these addresses from any web browser. One is the Dask
diagnostic dashboard; the other is the Jupyter server. You can log into the
Jupyter notebook server with the password dask.

You can create a notebook and create a Dask client from there. The
DASK_SCHEDULER_ADDRESS environment variable has been populated with the
address of the Dask scheduler. This address is also available in Python
through Dask's config dictionary.
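
As a sketch of what this looks like from a notebook (the example address is
illustrative; Dask maps the DASK_SCHEDULER_ADDRESS environment variable into
its configuration, so a bare Client() connects to the cluster's scheduler):

```python
import dask
from dask.distributed import Client

# The address from DASK_SCHEDULER_ADDRESS is visible in the config dictionary
print(dask.config.get('scheduler-address'))  # e.g. 'dask-scheduler:8786'

# With that key set, Client() connects to the cluster's scheduler
# instead of starting a local one
client = Client()
```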

By default the Helm deployment launches three workers using two cores each and
a standard conda environment. We can customize this environment by creating a
small YAML file that overrides a subset of the values in the
Dask Helm chart's values.yaml file.

For example, we can increase the number of workers, and include extra conda and
pip packages to install on both the workers and the Jupyter server (these two
environments should be matched):

```yaml
# config.yaml

worker:
  replicas: 8
  limits:
    cpu: 2
    memory: 7.5 GiB
  pipPackages: >-
    git+https://github.com/dask/gcsfs.git
    git+https://github.com/xarray/xarray.git
  condaPackages: >-
    -c conda-forge
    zarr
    blosc

# We want to keep the same packages on the worker and jupyter environments
jupyter:
  pipPackages: >-
    git+https://github.com/dask/gcsfs.git
    git+https://github.com/xarray/xarray.git
  condaPackages: >-
    -c conda-forge
    zarr
    blosc
```

This config file overrides configuration for number and size of workers and the
conda and pip packages installed on the worker and Jupyter containers. In
general we will want to make sure that these two software environments match.
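
Once connected, one way to verify that they do match is the client's
get_versions method, which compares package versions across the client,
scheduler, and workers; a short sketch:

```python
# Compare package versions on the client, scheduler, and every worker;
# with check=True a mismatch raises an error instead of passing silently
client.get_versions(check=True)
```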

Update your deployment to use this configuration file. Note that you will not
use helm install for this stage; that would create a new, separate deployment
on the same Kubernetes cluster. Instead, you will upgrade your existing
deployment using its current name:

```bash
helm upgrade bald-eel dask/dask -f config.yaml
```

This will update those containers that need to be updated. It may take a minute or so.
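
If you want to follow the rollout, you can watch pods terminate and restart
with a standard kubectl command:

```bash
# Watch pod status change as the upgraded containers roll out
kubectl get pods --watch
```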

As a reminder, you can list the names of deployments you have using helm list.

For standard issues you should be able to see worker status and logs using the
Dask dashboard (in particular, see the worker links from the info/ page).
However, if your workers aren't starting, you can check on the status of the
pods and their logs with the following standard kubectl commands:
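
```bash
# Show the status of every pod in the current namespace
kubectl get pods

# Print the logs of a particular pod (substitute the real pod name)
kubectl logs <pod-name>
```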