# Box least squares (BLS) periodogram¶

The “box least squares (BLS) periodogram” [1] is a statistical tool used for
detecting transiting exoplanets and eclipsing binaries in time series
photometric data.
The main interface to this implementation is the `BoxLeastSquares`

class.

## Mathematical Background¶

The BLS method finds transit candidates by modeling a transit as a periodic upside down top hat with four parameters: period, duration, depth, and a reference time. In this implementation, the reference time is chosen to be the mid-transit time of the first transit in the observational baseline. These parameters are shown in the following sketch:

Assuming that the uncertainties on the measured flux are known, independent, and Gaussian, the maximum likelihood in-transit flux can be computed as

where \(y_n\) are the brightness measurements, \(\sigma_n\) are the associated uncertainties, and both sums are computed over the in-transit data points. Similarly, the maximum likelihood out-of-transit flux is

where these sums are over the out-of-transit observations. Using these results, the log likelihood of a transit model (maximized over depth) at a given period \(P\), duration \(\tau\), and reference time \(t_0\) is

This equation might be familiar because it is proportional to the “chi
squared” \(\chi^2\) for this model and this is a direct consequence of our
assumption of Gaussian uncertainties.
This \(\chi^2\) is called the “signal residue” by [1], so maximizing the
log likelihood over duration and reference time is equivalent to computing the
box least squares spectrum from [1].
In practice, this is achieved by finding the maximum likelihood model over a
grid in duration and reference time as specified by the `durations`

and
`oversample`

parameters for the
`BoxLeastSquares.power()`

method.
Behind the scenes, this implementation minimizes the number of required
calculations by pre-binning the observations onto a fine grid following [1]
and [2].

## Basic Usage¶

The transit periodogram takes as input time series observations where the
timestamps `t`

and the observations `y`

(usually brightness) are stored as
NumPy arrays or `Quantity`

.
If known, error bars `dy`

can also optionally be provided.
For example, to evaluate the periodogram for a simulated data set, can be
computed as follows:

```
>>> import numpy as np
>>> import astropy.units as u
>>> from astropy.stats import BoxLeastSquares
>>> np.random.seed(42)
>>> t = np.random.uniform(0, 20, 2000)
>>> y = np.ones_like(t) - 0.1*((t%3)<0.2) + 0.01*np.random.randn(len(t))
>>> model = BoxLeastSquares(t * u.day, y, dy=0.01)
>>> periodogram = model.autopower(0.2)
```

The output of the `BoxLeastSquares.autopower()`

method
is a `BoxLeastSquaresResults`

object with several
useful attributes, the most useful of which are generally the `period`

and
`power`

attributes.
This result can be plotted using matplotlib:

```
>>> import matplotlib.pyplot as plt
>>> plt.plot(periodogram.period, periodogram.power)
```

In this figure, you can see the peak at the correct period of 3 days.

## Objectives¶

By default, the `BoxLeastSquares.power()`

method computes the log
likelihood of the model fit and maximizes over reference time and duration.
It is also possible to use the signal-to-noise ratio with which the transit
depth is measured as an objective function.
To do this, call `BoxLeastSquares.power()`

or
`BoxLeastSquares.autopower()`

with `objective='snr'`

as follows:

```
>>> model = BoxLeastSquares(t * u.day, y, dy=0.01)
>>> periodogram = model.autopower(0.2, objective="snr")
```

This objective will generally produce a periodogram that is qualitatively similar to the log likelihood spectrum, but it has been used to improve the reliability of transit search in the presence of correlated noise.

## Period Grid¶

The transit periodogram is always computed on a grid of periods and the
results can be sensitive to the sampling.
As discussed in [1], the performance of the transit periodogram method is
more sensitive to the period grid than the
`LombScargle`

periodogram.
This implementation of the transit periodogram includes a conservative
heuristic for estimating the required period grid that is used by the
`BoxLeastSquares.autoperiod()`

and
`BoxLeastSquares.autopower()`

methods and the details of
this method are given in the API documentation for
`BoxLeastSquares.autoperiod()`

.
It is also possible to provide a specific period grid as follows:

```
>>> model = BoxLeastSquares(t * u.day, y, dy=0.01)
>>> periods = np.linspace(2.5, 3.5, 1000) * u.day
>>> periodogram = model.power(periods, 0.2)
```

However, if the period grid is too coarse, the correct period can easily be missed.

```
>>> model = BoxLeastSquares(t * u.day, y, dy=0.01)
>>> periods = np.linspace(0.5, 10.5, 15) * u.day
>>> periodogram = model.power(periods, 0.2)
```

## Peak Statistics¶

To help in the transit vetting process and to debug problems with candidate
peaks, the `BoxLeastSquares.compute_stats()`

method can be used to
calculate several statistics of a candidate transit.
Many of these statistics are based on the VARTOOLS package described in [2].
This will often be used as follows to compute stats for the maximum point in
the periodogram:

```
>>> model = BoxLeastSquares(t * u.day, y, dy=0.01)
>>> periodogram = model.autopower(0.2)
>>> max_power = np.argmax(periodogram.power)
>>> stats = model.compute_stats(periodogram.period[max_power],
... periodogram.duration[max_power],
... periodogram.transit_time[max_power])
```

This calculates a dictionary with statistics about this candidate.
Each entry in this dictionary is described in the documentation for
`BoxLeastSquares.compute_stats()`

.