dask.dataframe.DataFrame.dropna
dask.dataframe.DataFrame.dropna¶
- DataFrame.dropna(how=_NoDefault.no_default, subset=None, thresh=_NoDefault.no_default)[source]¶
Remove missing values.
This docstring was copied from pandas.core.frame.DataFrame.dropna.
Some inconsistencies with the Dask version may exist.
See the User Guide for more on which values are considered missing, and how to work with missing data.
- Parameters
- axis{0 or ‘index’, 1 or ‘columns’}, default 0 (Not supported in Dask)
Determine if rows or columns which contain missing values are removed.
0, or ‘index’ : Drop rows which contain missing values.
1, or ‘columns’ : Drop columns which contain missing value.
Only a single axis is allowed.
- how{‘any’, ‘all’}, default ‘any’
Determine if row or column is removed from DataFrame, when we have at least one NA or all NA.
‘any’ : If any NA values are present, drop that row or column.
‘all’ : If all values are NA, drop that row or column.
- threshint, optional
Require that many non-NA values. Cannot be combined with how.
- subsetcolumn label or sequence of labels, optional
Labels along other axis to consider, e.g. if you are dropping rows these would be a list of columns to include.
- inplacebool, default False (Not supported in Dask)
Whether to modify the DataFrame rather than creating a new one.
- ignore_indexbool, default
False
(Not supported in Dask) If
True
, the resulting axis will be labeled 0, 1, …, n - 1.New in version 2.0.0.
- Returns
- DataFrame or None
DataFrame with NA entries dropped from it or None if
inplace=True
.
See also
DataFrame.isna
Indicate missing values.
DataFrame.notna
Indicate existing (non-missing) values.
DataFrame.fillna
Replace missing values.
Series.dropna
Drop missing values.
Index.dropna
Drop missing indices.
Examples
>>> df = pd.DataFrame({"name": ['Alfred', 'Batman', 'Catwoman'], ... "toy": [np.nan, 'Batmobile', 'Bullwhip'], ... "born": [pd.NaT, pd.Timestamp("1940-04-25"), ... pd.NaT]}) >>> df name toy born 0 Alfred NaN NaT 1 Batman Batmobile 1940-04-25 2 Catwoman Bullwhip NaT
Drop the rows where at least one element is missing.
>>> df.dropna() name toy born 1 Batman Batmobile 1940-04-25
Drop the columns where at least one element is missing.
>>> df.dropna(axis='columns') name 0 Alfred 1 Batman 2 Catwoman
Drop the rows where all elements are missing.
>>> df.dropna(how='all') name toy born 0 Alfred NaN NaT 1 Batman Batmobile 1940-04-25 2 Catwoman Bullwhip NaT
Keep only the rows with at least 2 non-NA values.
>>> df.dropna(thresh=2) name toy born 1 Batman Batmobile 1940-04-25 2 Catwoman Bullwhip NaT
Define in which columns to look for missing values.
>>> df.dropna(subset=['name', 'toy']) name toy born 1 Batman Batmobile 1940-04-25 2 Catwoman Bullwhip NaT