
Conjugate prior

by Marco Taboga, PhD

In Bayesian inference, the prior distribution of a parameter and the likelihood of the observed data are combined to obtain the posterior distribution of the parameter.

If the prior and the posterior belong to the same parametric family, then the prior is said to be conjugate for the likelihood.


Review of parametric families

First of all, let us review the concept of a parametric family.

Let a set of probability distributions $\Phi$ be put in correspondence with a parameter space $\Theta$.

If the correspondence is a function (i.e., it associates one and only one distribution in $\Phi$ to each parameter $\theta \in \Theta$), then $\Phi$ is called a parametric family.

Examples of parametric families

Examples of parametric families are:

  - the set of all normal distributions, indexed by the two parameters mean and variance;

  - the set of all Bernoulli distributions, indexed by the probability of obtaining a 1;

  - the set of all Poisson distributions, indexed by the rate parameter.
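
To make the correspondence concrete, here is a minimal sketch in Python (not part of the original page; it assumes scipy is installed) of a parametric family seen as a function that maps each parameter value to one and only one distribution.

```python
# A sketch of the normal parametric family as a function from parameters to
# distributions: each (mu, sigma) pair picks out exactly one member of the family.
from scipy import stats

def normal_family(mu, sigma):
    """Return the unique normal distribution indexed by (mu, sigma)."""
    return stats.norm(loc=mu, scale=sigma)

p1 = normal_family(0.0, 1.0)     # the standard normal distribution
p2 = normal_family(2.0, 0.5)     # a different parameter, a different distribution
print(p1.mean(), p1.std())       # 0.0 1.0
print(p2.mean(), p2.std())       # 2.0 0.5
```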

Prior, likelihood and posterior

In a Bayesian inference problem, we specify two distributions:

  - the prior distribution $p(\theta )$ of the parameter $\theta$, which summarizes our beliefs before seeing the data;

  - the likelihood $p(x\mid \theta )$ of the observed data x, conditional on the parameter.

After observing the data, we use Bayes' rule to compute the posterior distribution
$$p(\theta \mid x)=\frac{p(x\mid \theta )\,p(\theta )}{\int_{\Theta }p(x\mid t)\,p(t)\,dt}.$$
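
The following is a hedged numerical illustration of this update (not from the original text): the posterior is computed on a discretized grid by multiplying prior and likelihood and dividing by the evidence. The flat prior, the Bernoulli likelihood and the variable names are illustrative choices, and numpy is assumed available.

```python
# Bayes' rule on a grid: posterior is proportional to prior times likelihood,
# normalized by the evidence (the integral of the numerator).
import numpy as np

q_grid = np.linspace(0.001, 0.999, 999)        # candidate values of a Bernoulli parameter q
prior = np.ones_like(q_grid)                   # flat prior density on (0, 1)
x = 1                                          # one observed Bernoulli draw
likelihood = q_grid**x * (1 - q_grid)**(1 - x)

unnormalized = prior * likelihood
dq = q_grid[1] - q_grid[0]
posterior = unnormalized / (unnormalized.sum() * dq)   # divide by the approximate evidence

print((posterior * dq).sum())                  # ~1.0: the posterior is a proper density
print(q_grid[np.argmax(posterior)])            # posterior mode close to 1 after observing x = 1
```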

Definition of a conjugate prior

We can now define the concept of a conjugate prior.

Definition Let $\Phi$ be a parametric family. A prior $p(\theta )$ belonging to $\Phi$ is said to be conjugate for the likelihood $p(x\mid \theta )$ if and only if the posterior $p(\theta \mid x)$ belongs to $\Phi$.

In other words, when we use a conjugate prior, the posterior resulting from the Bayesian updating process is in the same parametric family as the prior.

Example

In the lecture on Bayesian inference about the mean of a normal distribution, we have already encountered a conjugate prior.

In that lecture, x is a vector of IID draws from a normal distribution having unknown mean $\mu$ and known variance $\sigma ^{2}$. Moreover, both the prior and the posterior distribution of the parameter $\mu$ are normal. Hence, the prior and the posterior belong to the same parametric family of normal distributions, and the prior is conjugate for the likelihood.
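
The closed-form normal-normal update can be sketched as follows. This is a minimal illustration, not the lecture's own code: the prior parameters, the simulated data and the variable names are assumptions, while the precision-weighting formulas are the standard conjugate result for a normal likelihood with known variance.

```python
# Conjugate update for the mean of a normal distribution with known variance:
# a normal prior on mu combined with IID normal data gives a normal posterior.
import numpy as np

sigma = 2.0                          # known standard deviation of the data
mu0, tau0 = 0.0, 1.0                 # prior on mu: normal with mean mu0 and std tau0

rng = np.random.default_rng(0)
x = rng.normal(loc=1.5, scale=sigma, size=20)   # IID draws whose mean we want to infer

# Precisions (inverse variances) add up; the posterior mean is a precision-weighted average.
post_precision = 1.0 / tau0**2 + len(x) / sigma**2
post_var = 1.0 / post_precision
post_mean = post_var * (mu0 / tau0**2 + x.sum() / sigma**2)

print(post_mean, np.sqrt(post_var))  # parameters of the (normal) posterior of mu
```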

Usefulness of conjugate priors

There are basically two reasons why models with conjugate priors are popular (e.g., Robert 2007, Bernardo and Smith 2009):

  1. they usually allow us to derive a closed-form expression for the posterior distribution;

  2. they are easy to interpret, because we can see directly how the parameters of the prior change after the Bayesian update (a small sketch follows this list).
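
As a hedged illustration of point 2 (the numbers and the function name are mine, not the author's), with a Beta prior on a Bernoulli success probability the update amounts to adding the observed counts to the prior's parameters, which can be read as "pseudo-counts".

```python
# With a conjugate Beta prior, Bayesian updating is just a shift of the hyperparameters.
def update_beta(alpha, beta, successes, failures):
    """Beta(alpha, beta) prior + Bernoulli/binomial data -> Beta posterior."""
    return alpha + successes, beta + failures

alpha, beta = 2.0, 2.0                               # prior pseudo-counts
alpha, beta = update_beta(alpha, beta, successes=7, failures=3)
print(alpha, beta)                                   # 9.0 5.0: the data simply add to the counts
```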

Exponential families

Conjugate priors are easily characterized when the distribution of x belongs to an exponential family, in which case the likelihood takes the form
$$p(x\mid \theta )=h(x)\exp \left( \theta ^{\top }T(x)-A(\theta )\right)$$
where:

  - $h(x)$ is a function of the data only;

  - $T(x)$ is the vector of sufficient statistics;

  - $\theta$ is the vector of natural parameters;

  - $A(\theta )$ is the log-partition function, which ensures that the likelihood integrates (or sums) to 1.

A parametric family of conjugate priors for the above likelihood is formed by all the distributions such that
$$p(\theta )=\frac{\exp \left( \theta ^{\top }\chi -\nu A(\theta )\right) }{\int_{\Theta }\exp \left( t^{\top }\chi -\nu A(t)\right) \,dt}$$
where:

  - $\chi$ is a vector with the same dimension as $T(x)$;

  - $\nu$ is a scalar.

The parameters $\chi$ and $\nu$ are called hyperparameters.

Note that
$$\exp \left( \theta ^{\top }\chi -\nu A(\theta )\right) >0\quad \text{for all }\theta \in \Theta$$
implies that
$$\int_{\Theta }\exp \left( t^{\top }\chi -\nu A(t)\right) \,dt>0.$$

As a consequence, the above parametric family of conjugate priors, called a natural family, contains all the distributions associated with pairs of hyperparameters $\left( \chi ,\nu \right)$ such that the integral in the denominator is well-defined and finite.

Given the likelihood and the prior, the posterior is
$$p(\theta \mid x)=\frac{\exp \left( \theta ^{\top }\left( \chi +T(x)\right) -\left( \nu +1\right) A(\theta )\right) }{\int_{\Theta }\exp \left( t^{\top }\left( \chi +T(x)\right) -\left( \nu +1\right) A(t)\right) \,dt}$$
provided the integral in the denominator is well-defined and finite.
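
In other words, observing x maps the hyperparameters $(\chi ,\nu )$ to $(\chi +T(x),\nu +1)$. The following minimal sketch (my own illustration, with a Bernoulli sufficient statistic $T(x)=x$ chosen only as an example) applies this update to a stream of observations.

```python
# Natural-family update: each observation adds T(x) to chi and 1 to nu.
def natural_update(chi, nu, x, T=lambda x: x):
    """One-observation update of the natural conjugate prior's hyperparameters."""
    return chi + T(x), nu + 1

chi, nu = 1.0, 2.0             # prior hyperparameters
for x in [1, 0, 1, 1]:         # a stream of Bernoulli observations (T(x) = x)
    chi, nu = natural_update(chi, nu, x)
print(chi, nu)                 # 4.0 6.0
```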

Proof

The posterior is proportional to the prior times the likelihood:
$$p(\theta \mid x)\propto p(x\mid \theta )\,p(\theta )\propto \exp \left( \theta ^{\top }T(x)-A(\theta )\right) \exp \left( \theta ^{\top }\chi -\nu A(\theta )\right) =\exp \left( \theta ^{\top }\left( \chi +T(x)\right) -\left( \nu +1\right) A(\theta )\right).$$
Therefore,
$$p(\theta \mid x)=c\,\exp \left( \theta ^{\top }\left( \chi +T(x)\right) -\left( \nu +1\right) A(\theta )\right)$$
where we know that the constant of proportionality is
$$c=\frac{1}{\int_{\Theta }\exp \left( t^{\top }\left( \chi +T(x)\right) -\left( \nu +1\right) A(t)\right) \,dt}$$
because the posterior must integrate to 1.

Examples of natural families

Let us now see how powerful the technology of natural families is, by deriving the conjugate priors of some common distributions.

Bernoulli likelihood and beta priors

Remember that a Bernoulli random variable is equal to 1 with probability $q$ and to 0 with probability $1-q$.

Suppose that we observe a realization x of the Bernoulli variable and we want to carry out some Bayesian inference on the unknown parameter $q$.

The likelihood has exponential form:
$$p(x\mid q)=1_{\{x\in \{0,1\}\}}\,q^{x}(1-q)^{1-x}=1_{\{x\in \{0,1\}\}}\exp \left( \theta T(x)-A(\theta )\right)$$
where $1_{\{x\in \{0,1\}\}}$ is an indicator function equal to 1 if $x\in \{0,1\}$ and to 0 otherwise, and
$$\theta =\ln \frac{q}{1-q},\qquad T(x)=x,\qquad A(\theta )=-\ln (1-q)=\ln \left( 1+e^{\theta }\right).$$

The natural family of conjugate priors contains priors of the form
$$p(\theta )\propto \exp \left( \theta \chi -\nu A(\theta )\right) =\exp \left( \theta \chi -\nu \ln \left( 1+e^{\theta }\right) \right) =\frac{e^{\theta \chi }}{\left( 1+e^{\theta }\right) ^{\nu }}.$$

Since $\theta$ is an increasing function of $q$ and
$$\frac{d\theta }{dq}=\frac{1}{q(1-q)}$$
we can apply the formula for the density of an increasing function:
$$p(q)\propto \frac{e^{\theta \chi }}{\left( 1+e^{\theta }\right) ^{\nu }}\cdot \frac{d\theta }{dq}=\left( \frac{q}{1-q}\right) ^{\chi }\left( 1-q\right) ^{\nu }\cdot \frac{1}{q(1-q)}=q^{\chi -1}(1-q)^{\nu -\chi -1}.$$

Thus, the natural family of conjugate priors contains priors that assign to $q$ a Beta distribution with parameters $\chi$ and $\nu -\chi$.

According to the general formula derived above for natural families, the posterior distribution of $\theta$ is
$$p(\theta \mid x)\propto \exp \left( \theta \left( \chi +x\right) -\left( \nu +1\right) \ln \left( 1+e^{\theta }\right) \right)$$
which implies (by the same argument just used for the prior) that the posterior distribution of $q$ is
$$p(q\mid x)\propto q^{\chi +x-1}(1-q)^{\nu -\chi -x}$$
that is, a Beta distribution with parameters $\chi +x$ and $\nu -\chi +1-x$.
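
Here is a hedged numerical check of this result (the hyperparameter values, the grid and the comparison are mine, and scipy is assumed available): the grid posterior obtained by direct application of Bayes' rule matches the Beta posterior with the updated parameters.

```python
# Prior q ~ Beta(chi, nu - chi); after one Bernoulli draw x the posterior is
# Beta(chi + x, nu - chi + 1 - x). We verify this against a grid computation.
import numpy as np
from scipy import stats

chi, nu = 2.0, 5.0
x = 1                                              # observed Bernoulli draw

q = np.linspace(0.001, 0.999, 999)
prior = stats.beta.pdf(q, chi, nu - chi)
likelihood = q**x * (1 - q)**(1 - x)
unnormalized = prior * likelihood
grid_posterior = unnormalized / (unnormalized.sum() * (q[1] - q[0]))

conjugate_posterior = stats.beta.pdf(q, chi + x, nu - chi + 1 - x)
print(np.max(np.abs(grid_posterior - conjugate_posterior)))   # small discretization error
```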

Poisson likelihood and Gamma prior

If x has a Poisson distribution, its likelihood is
$$p(x\mid \lambda )=1_{\{x\in \mathbb{Z}_{+}\}}\,\frac{\lambda ^{x}e^{-\lambda }}{x!}$$
where $\lambda$ is a parameter and $\mathbb{Z}_{+}$ is the set of non-negative integers.

We can write the likelihood in exponential form:
$$p(x\mid \lambda )=\frac{1_{\{x\in \mathbb{Z}_{+}\}}}{x!}\exp \left( \theta T(x)-A(\theta )\right)$$
where
$$\theta =\ln \lambda ,\qquad T(x)=x,\qquad A(\theta )=e^{\theta }=\lambda .$$

The natural family of conjugate priors contains priors of the form
$$p(\theta )\propto \exp \left( \theta \chi -\nu A(\theta )\right) =\exp \left( \theta \chi -\nu e^{\theta }\right).$$

Since $\theta$ is an increasing function of $\lambda$ and
$$\frac{d\theta }{d\lambda }=\frac{1}{\lambda }$$
we can apply the formula for the density of an increasing function:
$$p(\lambda )\propto \exp \left( \chi \ln \lambda -\nu \lambda \right) \cdot \frac{1}{\lambda }=\lambda ^{\chi -1}e^{-\nu \lambda }.$$

Thus, the natural family of conjugate priors contains priors that assign to $\lambda$ a Gamma distribution with parameters $n=2\chi$ and $h=\chi /\nu$.

By the general formula for natural families, the posterior distribution of $\theta$ is
$$p(\theta \mid x)\propto \exp \left( \theta \left( \chi +x\right) -\left( \nu +1\right) e^{\theta }\right)$$
which implies (by the same argument just used for the prior) that the posterior distribution of $\lambda$ is
$$p(\lambda \mid x)\propto \lambda ^{\chi +x-1}e^{-\left( \nu +1\right) \lambda }$$
that is, a Gamma distribution with parameters $n=2\left( \chi +x\right)$ and $h=\left( \chi +x\right) /\left( \nu +1\right)$.
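
The same check can be run in the more common shape/rate parametrization of the Gamma distribution (shape $=\chi$, rate $=\nu$). This sketch, including the prior values and the grid, is my own illustration and assumes scipy is available; note that scipy's gamma uses a scale parameter equal to 1/rate.

```python
# Prior lambda ~ Gamma(shape=chi, rate=nu); after one Poisson count x the posterior
# is Gamma(shape=chi + x, rate=nu + 1). We verify this against a grid computation.
import numpy as np
from scipy import stats

chi, nu = 3.0, 2.0                   # prior hyperparameters (shape and rate)
x = 4                                # one observed Poisson count

post_shape, post_rate = chi + x, nu + 1.0        # conjugate update

lam = np.linspace(0.01, 15.0, 1500)
prior = stats.gamma.pdf(lam, a=chi, scale=1.0 / nu)
likelihood = stats.poisson.pmf(x, mu=lam)
unnormalized = prior * likelihood
grid_posterior = unnormalized / (unnormalized.sum() * (lam[1] - lam[0]))

conjugate_posterior = stats.gamma.pdf(lam, a=post_shape, scale=1.0 / post_rate)
print(np.max(np.abs(grid_posterior - conjugate_posterior)))   # small discretization error
```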

References

Bernardo, J. M., and Smith, A. F. M. (2009) Bayesian Theory, Wiley.

Robert, C. P. (2007) The Bayesian Choice, Springer.

How to cite

Please cite as:

Taboga, Marco (2021). "Conjugate prior", Lectures on probability theory and mathematical statistics. Kindle Direct Publishing. Online appendix. https://www.statlect.com/fundamentals-of-statistics/conjugate-prior.
