sigma_clip

astropy.stats.sigma_clip(data, sigma=3, sigma_lower=None, sigma_upper=None, maxiters=5, cenfunc='median', stdfunc='std', axis=None, masked=True, return_bounds=False, copy=True)
Perform sigma-clipping on the provided data.
The data will be iterated over, each time rejecting values that lie more than a specified number of standard deviations below or above a center value.
Clipped (rejected) pixels are those where:
data < cenfunc(data [,axis=int]) - (sigma_lower * stdfunc(data [,axis=int]))
data > cenfunc(data [,axis=int]) + (sigma_upper * stdfunc(data [,axis=int]))
Invalid data values (i.e. NaN or inf) are automatically clipped.
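The clipping rule above can be sketched in plain NumPy. This is a minimal illustration, not astropy's implementation: the helper name sigma_clip_sketch is hypothetical, it hard-codes the median and the standard deviation as the center and spread, and it only covers the flattened (axis=None) case.

```python
import numpy as np


def sigma_clip_sketch(data, sigma=3.0, maxiters=5):
    """Hypothetical helper: iterative sigma clipping on flattened data."""
    values = np.asarray(data, dtype=float).ravel()
    # Invalid values (NaN or inf) are always rejected.
    values = values[np.isfinite(values)]

    iteration = 0
    while maxiters is None or iteration < maxiters:
        center = np.median(values)
        spread = np.std(values)
        lower = center - sigma * spread
        upper = center + sigma * spread
        keep = (values >= lower) & (values <= upper)
        if keep.all():
            # Nothing was clipped in this pass, so the result has converged.
            break
        values = values[keep]
        iteration += 1
    return values


sample = np.array([1.0, 1.02, 0.98, 1.0, 1.01, 0.99, 1.0, 50.0, np.nan])
cleaned = sigma_clip_sketch(sample, sigma=2.0)
print(cleaned)
```

The loop stops either when a pass rejects nothing or when maxiters passes have run, which mirrors the convergence behavior described for the maxiters parameter.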
For an object-oriented interface to sigma clipping, see SigmaClip.

Note
scipy.stats.sigmaclip provides a subset of the functionality in this function. Its input data cannot be a masked array, it does not handle data that contain invalid values (i.e. NaN or inf), and it uses the mean as the centering function.
If your data is an ndarray with no invalid values and you want to use the mean as the centering function with axis=None and iterate to convergence, then scipy.stats.sigmaclip is ~25-30% faster than the equivalent settings here (sigma_clip(data, cenfunc='mean', maxiters=None, axis=None)).

Parameters:
data : array-like or MaskedArray
    The data to be sigma clipped.
sigma : float, optional
    The number of standard deviations to use for both the lower and upper clipping limit. These limits are overridden by sigma_lower and sigma_upper, if input. The default is 3.
sigma_lower : float or None, optional
    The number of standard deviations to use as the lower bound for the clipping limit. If None then the value of sigma is used. The default is None.
sigma_upper : float or None, optional
    The number of standard deviations to use as the upper bound for the clipping limit. If None then the value of sigma is used. The default is None.
maxiters : int or None, optional
    The maximum number of sigma-clipping iterations to perform or None to clip until convergence is achieved (i.e., iterate until the last iteration clips nothing). If convergence is achieved prior to maxiters iterations, the clipping iterations will stop. The default is 5.
cenfunc : {'median', 'mean'} or callable, optional
    The statistic or callable function/object used to compute the center value for the clipping. If set to 'median' or 'mean' then having the optional bottleneck package installed will result in the best performance. If using a callable function/object and the axis keyword is used, then it must be a callable that can ignore NaNs (e.g. numpy.nanmean) and has an axis keyword to return an array with axis dimension(s) removed. The default is 'median'.
stdfunc : {'std'} or callable, optional
    The statistic or callable function/object used to compute the standard deviation about the center value. If set to 'std' then having the optional bottleneck package installed will result in the best performance. If using a callable function/object and the axis keyword is used, then it must be a callable that can ignore NaNs (e.g. numpy.nanstd) and has an axis keyword to return an array with axis dimension(s) removed. The default is 'std'.
axis : None or int or tuple of int, optional
    The axis or axes along which to sigma clip the data. If None, then the flattened data will be used. axis is passed to the cenfunc and stdfunc. The default is None.
masked : bool, optional
    If True, then a MaskedArray is returned, where the mask is True for clipped values. If False, then an ndarray is returned; the minimum and maximum clipping thresholds are returned as well only if return_bounds=True. The default is True.
return_bounds : bool, optional
    If True, then the minimum and maximum clipping bounds are also returned. The default is False.
copy : bool, optional
    If True, then the data array will be copied. If False and masked=True, then the returned masked array data will contain the same array as the input data (if data is an ndarray or MaskedArray). The default is True.
Returns:
result : flexible
    If masked=True, then a MaskedArray is returned, where the mask is True for clipped values. If masked=False, then an ndarray is returned.

    If return_bounds=True, then in addition to the (masked) array above, the minimum and maximum clipping bounds are returned.

    If masked=False and axis=None, then the output array is a flattened 1D ndarray where the clipped values have been removed. If return_bounds=True then the returned minimum and maximum thresholds are scalars.

    If masked=False and axis is specified, then the output ndarray will have the same shape as the input data and contain np.nan where values were clipped. If return_bounds=True then the returned minimum and maximum clipping thresholds will be ndarrays.
See also
Examples
This example uses a data array of random variates from a Gaussian distribution. We clip all points that are more than 2 sample standard deviations from the median. The result is a masked array, where the mask is True for clipped data:

>>> from astropy.stats import sigma_clip
>>> from numpy.random import randn
>>> randvar = randn(10000)
>>> filtered_data = sigma_clip(randvar, sigma=2, maxiters=5)
This example clips all points that are more than 3 sigma relative to the sample mean, clips until convergence, returns an unmasked ndarray, and does not copy the data:

>>> from astropy.stats import sigma_clip
>>> from numpy.random import randn
>>> from numpy import mean
>>> randvar = randn(10000)
>>> filtered_data = sigma_clip(randvar, sigma=3, maxiters=None,
...                            cenfunc=mean, masked=False, copy=False)
This example sigma clips along one axis:
>>> from astropy.stats import sigma_clip
>>> from numpy.random import normal
>>> from numpy import arange, diag, ones
>>> data = arange(5) + normal(0., 0.05, (5, 5)) + diag(ones(5))
>>> filtered_data = sigma_clip(data, sigma=2.3, axis=0)
Note that along the other axis, no points would be clipped, as the standard deviation is higher.