dask.dataframe.compute


dask.dataframe.compute(*args, traverse=True, optimize_graph=True, scheduler=None, get=None, **kwargs)

Compute several dask collections at once.

Parameters

args (object): Any number of objects. If an object is a dask object, it is computed and the result is returned. By default, python builtin collections are also traversed to look for dask objects (for more information see the traverse keyword). Non-dask arguments are passed through unchanged.
traverse (bool, optional): By default dask traverses builtin python collections looking for dask objects passed to compute. For large collections this can be expensive. If none of the arguments contain any dask objects, set traverse=False to avoid doing this traversal.
scheduler (string, optional): Which scheduler to use, such as “threads”, “synchronous” or “processes”. If not provided, the default is to check the global settings first, and then fall back to the collection defaults.
optimize_graph (bool, optional): If True (the default), the optimizations for each collection are applied before computation. Otherwise the graph is run as is. This can be useful for debugging.
get (None): Should be left as None. The get= keyword has been removed.
kwargs: Extra keywords to forward to the scheduler function.
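
The keywords above can be combined in a single call. The following is a minimal sketch, assuming dask is installed; `inc` is a hypothetical helper built with dask.delayed for illustration and is not part of this function's API.

```python
import dask
from dask import delayed


@delayed
def inc(x):
    # Hypothetical helper used only for illustration.
    return x + 1


a = inc(1)
b = inc(10)

# Non-dask arguments (the trailing 5) are passed through unchanged.
# scheduler="synchronous" runs every task in the calling thread,
# which tends to make debugging easier.
results = dask.compute(a, b, 5, scheduler="synchronous")

# optimize_graph=False runs the graph as is; the computed values are the same.
unoptimized = dask.compute(a, b, scheduler="synchronous", optimize_graph=False)
```

The synchronous scheduler is chosen here only to keep the sketch deterministic and easy to step through; for real workloads the default scheduler is usually the better starting point.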

Examples

>>> import dask
>>> import dask.array as da
>>> a = da.arange(10, chunks=2).sum()
>>> b = da.arange(10, chunks=2).mean()
>>> dask.compute(a, b)
(45, 4.5)

By default, dask objects inside python collections will also be computed:

>>> dask.compute({'a': a, 'b': b, 'c': 1})
({'a': 45, 'b': 4.5, 'c': 1},)
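
For contrast, the sketch below shows what happens to the same kind of collection when traversal is disabled. It assumes dask is installed and uses dask.delayed with the builtin `sum` purely for illustration.

```python
import dask
from dask import delayed

total = delayed(sum)([1, 2, 3])

# Default (traverse=True): the dict is searched and the Delayed inside is computed.
(traversed,) = dask.compute({"total": total, "n": 1})

# traverse=False: the dict is not searched, so it comes back unchanged,
# still holding the uncomputed Delayed object.
(untouched,) = dask.compute({"total": total, "n": 1}, traverse=False)
```

With traverse=False, only arguments that are themselves dask objects are computed, so this setting is safe only when no dask objects are nested inside the collections passed in.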
