biweight_midcorrelation

astropy.stats.biweight.biweight_midcorrelation(x, y, c=9.0, M=None, modify_sample_size=False)

Compute the biweight midcorrelation between two variables.

The biweight midcorrelation is a measure of similarity between samples. It is given by:

\[r_{bicorr} = \frac{\zeta_{xy}}{\sqrt{\zeta_{xx} \ \zeta_{yy}}}\]

where \(\zeta_{xx}\) is the biweight midvariance of \(x\), \(\zeta_{yy}\) is the biweight midvariance of \(y\), and \(\zeta_{xy}\) is the biweight midcovariance of \(x\) and \(y\).
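The ratio above can be sketched with NumPy alone. This is a minimal illustration, not astropy's implementation. It assumes the usual biweight definitions (median as the location estimate, median absolute deviation as the scale, and zero weight for points with \(|u| \geq 1\)), none of which are spelled out on this page. Under those assumptions the sample-size factor and the midvariance denominators cancel in the ratio, provided the denominators are positive. The names `bicorr_sketch` and `_weighted_deviations` are hypothetical.

```python
import numpy as np


def _weighted_deviations(a, c):
    """Return deviations from the median, scaled by biweight weights."""
    d = a - np.median(a)
    mad = np.median(np.abs(d))  # assumed non-zero in this sketch
    u = d / (c * mad)
    # Points with |u| >= 1 are rejected (given zero weight).
    w = np.where(np.abs(u) < 1, (1 - u**2) ** 2, 0.0)
    return d * w


def bicorr_sketch(x, y, c=9.0):
    """Illustrative biweight midcorrelation of two 1D arrays."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.ndim != 1 or x.shape != y.shape:
        raise ValueError("x and y must be 1D arrays of the same length")
    a = _weighted_deviations(x, c)
    b = _weighted_deviations(y, c)
    # zeta_xy / sqrt(zeta_xx * zeta_yy) after the common factors cancel
    return np.sum(a * b) / np.sqrt(np.sum(a * a) * np.sum(b * b))


x = np.arange(10.0)
print(bicorr_sketch(x, 2 * x + 1))  # perfectly linear relation
print(bicorr_sketch(x, -x))         # perfectly anti-correlated
```

Writing the estimator in this normalized form makes it clear that, like the Pearson coefficient, it is invariant under shifting and positive rescaling of either variable.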

Parameters:

x, y (1D array_like): Input arrays for the two variables. x and y must be 1D arrays and have the same number of elements.
c (float, optional): Tuning constant for the biweight estimator (default = 9.0). See biweight_midcovariance for more details.
M (float or array_like, optional): The location estimate. If M is a scalar value, then its value will be used for the entire array (or along each axis, if specified). If M is an array, then it must be an array containing the location estimate along each axis of the input array. If None (default), then the median of the input array will be used (or along each axis, if specified). See biweight_midcovariance for more details.
modify_sample_size (bool, optional): If False (default), then the sample size used is the total number of elements in the array (or along the input axis, if specified), which follows the standard definition of biweight midcovariance. If True, then the sample size is reduced to correct for any rejected values (i.e. the sample size used includes only the non-rejected values), which results in a value closer to the true midcovariance for small sample sizes or for a large number of rejected values. See biweight_midcovariance for more details.

Returns:

biweight_midcorrelation (float): The biweight midcorrelation between x and y.

See also

biweight_scale, biweight_midvariance, biweight_midcovariance, biweight_location

References

[1] https://en.wikipedia.org/wiki/Biweight_midcorrelation

Examples

Calculate the biweight midcorrelation between two variables:

>>> import numpy as np
>>> from astropy.stats import biweight_midcorrelation
>>> rng = np.random.default_rng(12345)
>>> x = rng.normal(0, 1, 200)
>>> y = rng.normal(0, 3, 200)
>>> # Introduce an obvious outlier
>>> x[0] = 30.0
>>> bicorr = biweight_midcorrelation(x, y)
>>> print(bicorr)
-0.09203238319481295