 StatLect

# EM algorithm

The Expectation-Maximization (EM) algorithm is a recursive algorithm that can be used to search for the maximum likelihood estimators of model parameters when the model includes some unobservable variables (also called latent variables). ## Latent-variable model

In a latent-variable model, there are two vectors:

• the observed data ;

• the vector of unobserved variables .

We denote their joint probability by where is a vector of parameters to be estimated.

## Model specification

The joint distribution of and is usually obtained by separately specifying the conditional distribution of the observed variables and the marginal distribution of the latent variables As a consequence, the joint distribution is In the context of latent-variable models, the joint distribution is often called complete likelihood.

## Derived quantities

Given the model specification above, if is a discrete random vector, we can derive the marginal distribution of as where the sum is over the set of all the values that can possibly take (its support).

We can also derive the conditional distribution of as If is continuous the sums above are replaced by integrals.

## Maximum likelihood problem

The maximum likelihood estimator (MLE) of the parameter is the vector that solves the maximization problem This problem does not usually have an analytical solution. As a consequence, we need to solve it with an iterative procedure that starts from an initial guess of the solution and produces a sequence that hopefully converges to the solution.

We say hopefully because we are often unable to derive theoretical guarantees about the numerical convergence of likelihood maximization algorithms, especially for latent-variable models.

The EM algorithm is one of the iterative procedures that can be used to search for a solution when we are dealing with a latent-variable model specified as above.

## The algorithm

Starting from an initial guess , the -th iteration of the EM algorithm consists of the following steps:

1. use the parameter value found in the previous iteration to compute the conditional probabilities for each ;

2. use the conditional probabilities derived in step 1 to compute the expected value of the complete log-likelihood: 3. solve the maximization problem 4. if the parameter update is smaller than a pre-specified threshold , that is, if stop the algorithm, else return to step 1.

Steps 1 and 2 are collectively called the Expectation step, while step 3 is called the Maximization step. Hence the name of the algorithm (Expectation-Maximization).

Step 4 is a stopping criterion: we stop the algorithm when there are no significant changes in the parameter.

## Theoretical guarantees

In general, the algorithm is not guaranteed to converge to a global maximum of the likelihood.

However, it has the important property that the likelihood is guaranteed to increase at each iteration: for every .

Proof

The conditional probability formula implies that If we multiply both sides by , we obtain Then, we can sum over the support of : Since the log-likelihood does not depend on and we have Moreover, Thus, Setting , we obtain Subtracting (2) from (1), we get By Jensen's inequality, we have Therefore, which, for , becomes But because Therefore or ## Advantages of the EM algorithm

The EM algorithm is particularly advantageous when the maximization problem in the Maximization step has a closed-form solution.

This happens, for example, when the latent-variable model is a mixture of multivariate normal distributions.

## Caveats

As we said, there is no guarantee that the EM algorithm converges to a global maximum of the likelihood.

If we suspect that the likelihood may have multiple local minima, we should use the multiple starts approach. In other words, we should run the EM algorithm several times with different starting values .

## Example: estimation of Gaussian mixtures

The EM algorithm is often used to estimate the parameters of a mixture of normal distributions (also called Gaussian mixture). See this lecture for details.