Local cluster

import multiprocessing
from dask.distributed import Client, LocalCluster

Set up a local Dask cluster

  • possibly adjust number of threads per worker
  • don’t forget to put the Client(...) in an if __name__ == "__main__" block when running from a script
n_workers = multiprocessing.cpu_count()

mem_buffer = 10 # GB of memory kept free for the rest of the system

gb_total = 128 # total memory of the machine in GB
gb_available = gb_total - mem_buffer # what is left for dask
gb_per_worker = int(gb_available / n_workers) # memory for each dask worker
client = Client(
    address=LocalCluster(
        n_workers=n_workers,
        threads_per_worker=2,
        interface="lo",
        memory_limit=f"{gb_per_worker}GB",
    )
)
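
When the same setup runs from a script instead of a notebook, the worker processes may re-import the main module, so the cluster should only be created under the main guard. The following is a minimal sketch under that assumption. The helper name memory_per_worker_gb and the function main are hypothetical, the 128 GB and 10 GB figures are the values used above, and interface is left at its default so the sketch is not tied to a Linux loopback device.

```python
import multiprocessing


def memory_per_worker_gb(gb_total, mem_buffer, n_workers):
    """Split the memory left after the buffer evenly across workers (whole GB)."""
    gb_available = gb_total - mem_buffer
    # Hypothetical helper: never hand a worker less than 1 GB.
    return max(1, gb_available // n_workers)


def main():
    # Imported here so the helper above stays usable without dask installed.
    from dask.distributed import Client, LocalCluster

    n_workers = multiprocessing.cpu_count()
    gb_per_worker = memory_per_worker_gb(
        gb_total=128,   # assumed total memory of the machine in GB
        mem_buffer=10,  # assumed GB kept free for the rest of the system
        n_workers=n_workers,
    )

    # The context managers shut down the client and cluster on exit.
    with LocalCluster(
        n_workers=n_workers,
        threads_per_worker=2,
        memory_limit=f"{gb_per_worker}GB",
    ) as cluster, Client(cluster) as client:
        print(client.dashboard_link)
        # ... submit work here ...


if __name__ == "__main__":
    main()
```

Keeping the memory arithmetic in a small function makes it easy to check the per-worker limit for a given machine before any workers are started.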

Open the printed link to view the dashboard

print(client.dashboard_link)
http://127.0.0.1:8787/status