astropy.stats.bayesian_info_criterion(log_likelihood, n_params, n_samples)

Computes the Bayesian Information Criterion (BIC) given the log of the likelihood function evaluated at the estimated (or analytically derived) parameters, the number of parameters, and the number of samples.

The BIC is usually applied to decide whether increasing the number of free parameters (and hence the model complexity) yields a significantly better fit. The decision is in favor of the model with the lowest BIC.

BIC is given as

\[\mathrm{BIC} = k \ln(n) - 2L,\]

in which \(n\) is the sample size, \(k\) is the number of free parameters, and \(L\) is the log-likelihood function of the model evaluated at the maximum likelihood estimate (i.e., the parameters for which \(L\) is maximized).
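As a minimal sketch of the formula above (not astropy's own implementation), the BIC can be evaluated with the standard library alone. The helper name `bic_from_formula` and the sample values are assumptions made for illustration.

```python
import math


def bic_from_formula(log_likelihood, n_params, n_samples):
    """Evaluate BIC = k * ln(n) - 2 * L directly (illustrative helper)."""
    return n_params * math.log(n_samples) - 2.0 * log_likelihood


# Hypothetical values: same log-likelihood, one extra free parameter.
n = 100
bic_small = bic_from_formula(-50.0, 2, n)
bic_large = bic_from_formula(-50.0, 3, n)

# Each extra parameter adds ln(n) to the BIC, so it must raise the
# log-likelihood by more than ln(n) / 2 to lower the BIC.
penalty_per_param = bic_large - bic_small
min_gain = penalty_per_param / 2.0
print(f"penalty per parameter: {penalty_per_param:.3f}")
print(f"log-likelihood gain needed: {min_gain:.3f}")
```

With \(n = 100\), each additional parameter costs \(\ln(100) \approx 4.61\) in BIC, which is why a small gain in likelihood alone does not justify a more complex model.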

When comparing two models, define \(\Delta \mathrm{BIC} = \mathrm{BIC}_h - \mathrm{BIC}_l\), in which \(\mathrm{BIC}_h\) is the higher BIC and \(\mathrm{BIC}_l\) is the lower BIC. The larger \(\Delta \mathrm{BIC}\) is, the stronger the evidence against the model with the higher BIC.

The general rule of thumb is:

\(0 < \Delta\mathrm{BIC} \leq 2\): weak evidence that the model with the lower BIC is better

\(2 < \Delta\mathrm{BIC} \leq 6\): moderate evidence that the model with the lower BIC is better

\(6 < \Delta\mathrm{BIC} \leq 10\): strong evidence that the model with the lower BIC is better

\(\Delta\mathrm{BIC} > 10\): very strong evidence that the model with the lower BIC is better
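The rule of thumb above can be sketched as a small lookup. The helper `bic_evidence` is hypothetical and not part of astropy; the "no preference" label for \(\Delta\mathrm{BIC} = 0\) is an assumption, since the rule of thumb does not cover that case.

```python
def bic_evidence(bic_a, bic_b):
    """Return (delta_bic, label) for two BIC values (illustrative helper)."""
    delta = abs(bic_a - bic_b)  # BIC_h - BIC_l is never negative
    if delta == 0:
        label = "no preference"  # assumption: not covered by the rule of thumb
    elif delta <= 2:
        label = "weak"
    elif delta <= 6:
        label = "moderate"
    elif delta <= 10:
        label = "strong"
    else:
        label = "very strong"
    return delta, label


# Hypothetical BIC values for two competing models.
delta, label = bic_evidence(362.0, 359.8)
print(f"delta BIC = {delta:.1f}: {label} evidence for the lower-BIC model")
```

The thresholds are conventions rather than sharp statistical cutoffs, so values near a boundary are best read as borderline.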

For a detailed explanation, see [1] - [5].


log_likelihood: Logarithm of the likelihood function of the model, evaluated at its maximum over the parameter space.


n_params: Number of free parameters of the model, i.e., the dimension of the parameter space.


n_samples: Number of observations.


Returns: Bayesian Information Criterion.


[1] Richards, D. Maximum Likelihood Estimation and the Bayesian Information Criterion.

[2] Wikipedia. Bayesian Information Criterion.

[3] Origin Lab. Comparing Two Fitting Functions.

[4] Liddle, A. R. Information Criteria for Astrophysical Model Selection. 2008.

[5] Liddle, A. R. How many cosmological parameters? 2008.


The following example was originally presented in [1]. Consider a Gaussian model (mu, sigma) and a Student's t model (mu, sigma, delta), and assume that the t model attains a higher likelihood. The question the BIC is meant to answer is: “Is the increase in likelihood due to the larger number of parameters?”

>>> from astropy.stats.info_theory import bayesian_info_criterion
>>> lnL_g = -176.4
>>> lnL_t = -173.0
>>> n_params_g = 2
>>> n_params_t = 3
>>> n_samples = 100
>>> bic_g = bayesian_info_criterion(lnL_g, n_params_g, n_samples)
>>> bic_t = bayesian_info_criterion(lnL_t, n_params_t, n_samples)
>>> print(f"{bic_g - bic_t:.4f}")
2.1948

The difference is about 2.19, so the Student's t model has the lower BIC. By the rule of thumb above, there is moderate evidence that it is the better model: the gain in likelihood outweighs the penalty for the extra parameter.
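The arithmetic behind this example can be checked by hand from the BIC formula, using \(\ln(100) \approx 4.605\). The values below are rounded to three decimals.

\[\mathrm{BIC}_g = 2 \ln(100) - 2(-176.4) \approx 9.210 + 352.8 = 362.010\]

\[\mathrm{BIC}_t = 3 \ln(100) - 2(-173.0) \approx 13.816 + 346.0 = 359.816\]

\[\Delta \mathrm{BIC} = \mathrm{BIC}_g - \mathrm{BIC}_t = 6.8 - \ln(100) \approx 2.195\]

The extra parameter of the t model costs \(\ln(100) \approx 4.605\), while the improvement in log likelihood contributes \(2 \times 3.4 = 6.8\), which leaves the t model with the lower BIC.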