dask.dataframe.compute
dask.dataframe.compute¶
- dask.dataframe.compute(*args, traverse=True, optimize_graph=True, scheduler=None, get=None, **kwargs)[source]¶
Compute several dask collections at once.
- Parameters
- argsobject
Any number of objects. If it is a dask object, it’s computed and the result is returned. By default, python builtin collections are also traversed to look for dask objects (for more information see the
traverse
keyword). Non-dask arguments are passed through unchanged.- traversebool, optional
By default dask traverses builtin python collections looking for dask objects passed to
compute
. For large collections this can be expensive. If none of the arguments contain any dask objects, settraverse=False
to avoid doing this traversal.- schedulerstring, optional
Which scheduler to use like “threads”, “synchronous” or “processes”. If not provided, the default is to check the global settings first, and then fall back to the collection defaults.
- optimize_graphbool, optional
If True [default], the optimizations for each collection are applied before computation. Otherwise the graph is run as is. This can be useful for debugging.
- get
None
Should be left to
None
The get= keyword has been removed.- kwargs
Extra keywords to forward to the scheduler function.
Examples
>>> import dask >>> import dask.array as da >>> a = da.arange(10, chunks=2).sum() >>> b = da.arange(10, chunks=2).mean() >>> dask.compute(a, b) (45, 4.5)
By default, dask objects inside python collections will also be computed:
>>> dask.compute({'a': a, 'b': b, 'c': 1}) ({'a': 45, 'b': 4.5, 'c': 1},)