sigma_clip
- astropy.stats.sigma_clipping.sigma_clip(data, sigma=3, sigma_lower=None, sigma_upper=None, maxiters=5, cenfunc='median', stdfunc='std', axis=None, masked=True, return_bounds=False, copy=True, grow=False)
Perform sigma-clipping on the provided data.
The data will be iterated over, each time rejecting values that are less or more than a specified number of standard deviations from a center value.
Clipped (rejected) pixels are those where:
    data < center - (sigma_lower * std)
    data > center + (sigma_upper * std)
where:
    center = cenfunc(data [, axis=])
    std = stdfunc(data [, axis=])
Invalid data values (i.e., NaN or inf) are automatically clipped.
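The rejection rule above can be sketched in plain NumPy for the flattened (axis=None) case. This is only an illustration of the iteration under the stated definitions, not astropy's actual implementation; the helper name sigma_clip_sketch is hypothetical.

```python
import numpy as np


def sigma_clip_sketch(data, sigma=3.0, maxiters=5):
    """Hypothetical helper: iterative sigma clipping of flattened data.

    Uses the median as the center and the standard deviation as the
    scatter estimate, mirroring the default cenfunc and stdfunc.
    """
    values = np.asarray(data, dtype=float).ravel()
    # Invalid values (NaN or inf) are always rejected.
    values = values[np.isfinite(values)]

    iteration = 0
    while maxiters is None or iteration < maxiters:
        center = np.median(values)
        std = np.std(values)
        # Keep values that are not below the lower or above the upper bound.
        keep = (values >= center - sigma * std) & (values <= center + sigma * std)
        if keep.all():
            break  # converged: this iteration clipped nothing
        values = values[keep]
        iteration += 1
    return values


data = np.array([1.0, 1.1, 0.9, 1.05, 0.95, 1.0, 50.0, np.nan])
kept = sigma_clip_sketch(data, sigma=2)
print(kept)  # the NaN and the 50.0 outlier are dropped
```

Because the center and standard deviation are recomputed after each rejection, the bounds tighten from one iteration to the next until nothing more is clipped or maxiters is reached.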
For an object-oriented interface to sigma clipping, see SigmaClip.
Note
scipy.stats.sigmaclip provides a subset of the functionality of this function. Its input data cannot be a masked array, and it does not handle data that contains invalid values (i.e., NaN or inf). Also note that it uses the mean as the centering function. The equivalent settings to scipy.stats.sigmaclip are:
    sigma_clip(sigma=4., cenfunc='mean', maxiters=None, axis=None,
    ...        masked=False, return_bounds=True)
- Parameters:
  - data : array_like or MaskedArray
    The data to be sigma clipped.
  - sigma : float, optional
    The number of standard deviations to use for both the lower and upper clipping limit. These limits are overridden by sigma_lower and sigma_upper, if input. The default is 3.
  - sigma_lower : float or None, optional
    The number of standard deviations to use as the lower bound for the clipping limit. If None then the value of sigma is used. The default is None.
  - sigma_upper : float or None, optional
    The number of standard deviations to use as the upper bound for the clipping limit. If None then the value of sigma is used. The default is None.
  - maxiters : int or None, optional
    The maximum number of sigma-clipping iterations to perform or None to clip until convergence is achieved (i.e., iterate until the last iteration clips nothing). If convergence is achieved prior to maxiters iterations, the clipping iterations will stop. The default is 5.
  - cenfunc : {‘median’, ‘mean’} or callable, optional
    The statistic or callable function/object used to compute the center value for the clipping. If using a callable function/object and the axis keyword is used, then it must be able to ignore NaNs (e.g., numpy.nanmean) and it must have an axis keyword to return an array with axis dimension(s) removed. The default is 'median'.
  - stdfunc : {‘std’, ‘mad_std’} or callable, optional
    The statistic or callable function/object used to compute the standard deviation about the center value. If using a callable function/object and the axis keyword is used, then it must be able to ignore NaNs (e.g., numpy.nanstd) and it must have an axis keyword to return an array with axis dimension(s) removed. The default is 'std'.
  - axis : None or int or tuple of int, optional
    The axis or axes along which to sigma clip the data. If None, then the flattened data will be used. axis is passed to the cenfunc and stdfunc. The default is None.
  - masked : bool, optional
    If True, then a MaskedArray is returned, where the mask is True for clipped values. If False, then an ndarray is returned. The default is True.
  - return_bounds : bool, optional
    If True, then the minimum and maximum clipping bounds are also returned.
  - copy : bool, optional
    If True, then the data array will be copied. If False and masked=True, then the returned masked array data will contain the same array as the input data (if data is an ndarray or MaskedArray). If False and masked=False, the input data is modified in-place. The default is True.
  - grow : float or False, optional
    Radius within which to mask the neighbouring pixels of those that fall outwith the clipping limits (only applied along axis, if specified). As an example, for a 2D image a value of 1 will mask the nearest pixels in a cross pattern around each deviant pixel, while 1.5 will also reject the nearest diagonal neighbours and so on.
- Returns:
  - result : array_like
    If masked=True, then a MaskedArray is returned, where the mask is True for clipped values and where the input mask was True.
    If masked=False, then an ndarray is returned.
    If return_bounds=True, then in addition to the masked array or array above, the minimum and maximum clipping bounds are returned.
    If masked=False and axis=None, then the output array is a flattened 1D ndarray where the clipped values have been removed. If return_bounds=True then the returned minimum and maximum thresholds are scalars.
    If masked=False and axis is specified, then the output ndarray will have the same shape as the input data and contain np.nan where values were clipped. If the input data was a masked array, then the output ndarray will also contain np.nan where the input mask was True. If return_bounds=True then the returned minimum and maximum clipping thresholds will be ndarrays.
See also
Notes
The best performance will typically be obtained by setting cenfunc and stdfunc to one of the built-in functions specified as a string. If one of the options is set to a string while the other has a custom callable, you may in some cases see better performance if you have the bottleneck package installed.
Examples
This example uses a data array of random variates from a Gaussian distribution. We clip all points that are more than 2 sample standard deviations from the median. The result is a masked array, where the mask is True for clipped data:
    >>> from astropy.stats import sigma_clip
    >>> from numpy.random import randn
    >>> randvar = randn(10000)
    >>> filtered_data = sigma_clip(randvar, sigma=2, maxiters=5)
This example clips all points that are more than 3 sigma relative to the sample mean, clips until convergence, returns an unmasked ndarray, and does not copy the data:
    >>> from astropy.stats import sigma_clip
    >>> from numpy.random import randn
    >>> from numpy import mean
    >>> randvar = randn(10000)
    >>> filtered_data = sigma_clip(randvar, sigma=3, maxiters=None,
    ...                            cenfunc=mean, masked=False, copy=False)
This example sigma clips along one axis:
    >>> from astropy.stats import sigma_clip
    >>> from numpy.random import normal
    >>> from numpy import arange, diag, ones
    >>> data = arange(5) + normal(0., 0.05, (5, 5)) + diag(ones(5))
    >>> filtered_data = sigma_clip(data, sigma=2.3, axis=0)
Note that along the other axis, no points would be clipped, as the standard deviation is higher.