biweight_midcorrelation

astropy.stats.biweight.biweight_midcorrelation(x, y, c=9.0, M=None, modify_sample_size=False)

Compute the biweight midcorrelation between two variables.

The biweight midcorrelation is a robust measure of similarity between samples that is less sensitive to outliers than the Pearson correlation coefficient. It is given by:

\[r_{bicorr} = \frac{\zeta_{xy}}{\sqrt{\zeta_{xx} \ \zeta_{yy}}}\]

where \(\zeta_{xx}\) is the biweight midvariance of \(x\), \(\zeta_{yy}\) is the biweight midvariance of \(y\), and \(\zeta_{xy}\) is the biweight midcovariance of \(x\) and \(y\).

Parameters:
x, y : 1D array-like

Input arrays for the two variables. x and y must be 1D arrays and have the same number of elements.

c : float, optional

Tuning constant for the biweight estimator (default = 9.0). See biweight_midcovariance for more details.

M : float or array-like, optional

The location estimate. If M is a scalar value, then its value will be used for the entire array (or along each axis, if specified). If M is an array, then it must be an array containing the location estimate along each axis of the input array. If None (default), then the median of the input array will be used (or along each axis, if specified). See biweight_midcovariance for more details.

modify_sample_size : bool, optional

If False (default), then the sample size used is the total number of elements in the array (or along the input axis, if specified), which follows the standard definition of biweight midcovariance. If True, then the sample size is reduced to correct for any rejected values (i.e. the sample size used includes only the non-rejected values), which results in a value closer to the true midcovariance for small sample sizes or for a large number of rejected values. See biweight_midcovariance for more details.

Returns:
biweight_midcorrelation : float

The biweight midcorrelation between x and y.

References

[1] https://en.wikipedia.org/wiki/Biweight_midcorrelation

Examples

Calculate the biweight midcorrelation between two variables:

>>> import numpy as np
>>> from astropy.stats import biweight_midcorrelation
>>> rng = np.random.RandomState(12345)
>>> x = rng.normal(0, 1, 200)
>>> y = rng.normal(0, 3, 200)
>>> # Introduce an obvious outlier
>>> x[0] = 30.0
>>> bicorr = biweight_midcorrelation(x, y)
>>> print(bicorr)
-0.0495780713907