Log-likelihood

by Marco Taboga, PhD

The log-likelihood is, as the term suggests, the natural logarithm of the likelihood.

But what is the likelihood?

To define the likelihood we need two things:

  1. some observed data (a sample), which we denote by $\xi $ (the Greek letter xi);

  2. a set of probability distributions that could have generated the data; each distribution is identified by a parameter $\theta $ (the Greek letter theta).

Roughly speaking, the likelihood is a function $L(\theta ;\xi )$ that gives us the probability of observing the sample $\xi $ when the data is extracted from the probability distribution with parameter $\theta $.

Example

We will provide below a rigorous definition of log-likelihood, but it is probably a good idea to start with an example.

The typical example is the log-likelihood of a sample of independent and identically distributed draws from a normal distribution.

In this case, the sample $\xi $ is a vector $\xi =[x_{1}\ \ldots \ x_{n}]$ whose entries $x_{1},\ldots ,x_{n}$ are draws from a normal distribution.

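The probability density function of a draw $x_{i}$ is $f(x_{i})=\frac{1}{\sqrt{2\pi \sigma ^{2}}}\exp \left( -\frac{(x_{i}-\mu )^{2}}{2\sigma ^{2}}\right) $, where $\mu $ and $\sigma ^{2}$ are the parameters (mean and variance) of the normal distribution.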

The parameter vector is $\theta =[\mu \ \ \sigma ^{2}]$.

The set of distributions that could have generated the sample is assumed to be the set of all normal distributions (that can be obtained by varying the parameters $\mu $ and $\sigma ^{2}$).

In order to stress the fact that the probability density depends on the two parameters, we write $f(x_{i};\mu ,\sigma ^{2})$.

The joint probability density of the sample $\xi $ is $f(\xi ;\mu ,\sigma ^{2})=\prod_{i=1}^{n}f(x_{i};\mu ,\sigma ^{2})$ because the joint density of a set of independent variables is equal to the product of their marginal densities (see the lecture on Independent random variables).

The likelihood function is $L(\theta ;\xi )=\prod_{i=1}^{n}f(x_{i};\mu ,\sigma ^{2})$.

In other words, when we deal with continuous distributions such as the normal distribution, the likelihood function is equal to the joint density of the sample. We will explain below how things change in the case of discrete distributions.

The log-likelihood function is

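$l(\theta ;\xi )=\ln L(\theta ;\xi )=-\frac{n}{2}\ln (2\pi )-\frac{n}{2}\ln (\sigma ^{2})-\frac{1}{2\sigma ^{2}}\sum_{i=1}^{n}(x_{i}-\mu )^{2}$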
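
As a rough numerical check, the log-likelihood of a normal sample can be computed in two ways: with the standard closed-form expression for an IID normal sample, and as the sum of the logs of the individual densities. The sketch below assumes a small made-up sample and arbitrary parameter values; the function names are hypothetical and chosen only for illustration.

```python
import math

def normal_pdf(x, mu, sigma2):
    """Density of a normal distribution with mean mu and variance sigma2."""
    return math.exp(-(x - mu) ** 2 / (2 * sigma2)) / math.sqrt(2 * math.pi * sigma2)

def normal_log_likelihood(sample, mu, sigma2):
    """Closed-form log-likelihood of an IID normal sample."""
    n = len(sample)
    sum_sq = sum((x - mu) ** 2 for x in sample)
    return (-0.5 * n * math.log(2 * math.pi)
            - 0.5 * n * math.log(sigma2)
            - sum_sq / (2 * sigma2))

# Made-up data and parameter values (assumptions for illustration only).
sample = [1.2, 0.7, 2.1, 1.5, 0.9]
mu, sigma2 = 1.0, 0.5

closed_form = normal_log_likelihood(sample, mu, sigma2)
# Log of a product of densities = sum of the logs of the densities.
sum_of_logs = sum(math.log(normal_pdf(x, mu, sigma2)) for x in sample)

print(closed_form, sum_of_logs)
```

The two printed numbers should agree up to floating-point rounding.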

How the log-likelihood is used

The log-likelihood function is typically used to derive the maximum likelihood estimator of the parameter $\theta $.

The estimator $\widehat{\theta }$ is obtained by solving $\widehat{\theta }=\arg \max_{\theta }\ l(\theta ;\xi )$, that is, by finding the parameter $\widehat{\theta }$ that maximizes the log-likelihood of the observed sample $\xi $.

This is the same as maximizing the likelihood function $L(\theta ;\xi )$ because the natural logarithm is a strictly increasing function.
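
The equivalence between the two maximization problems can be illustrated with a crude grid search. This is only a sketch under stated assumptions: the sample is made up, the variance is held fixed at 1 so that only the mean is searched, and the grid of candidate means is arbitrary. A grid search is used here purely for transparency; it is not how the estimator is normally derived.

```python
import math

def likelihood(sample, mu, sigma2):
    """Likelihood of an IID normal sample: product of the densities."""
    value = 1.0
    for x in sample:
        value *= math.exp(-(x - mu) ** 2 / (2 * sigma2)) / math.sqrt(2 * math.pi * sigma2)
    return value

def log_likelihood(sample, mu, sigma2):
    """Log-likelihood of an IID normal sample: sum of the log-densities."""
    return sum(-0.5 * math.log(2 * math.pi * sigma2) - (x - mu) ** 2 / (2 * sigma2)
               for x in sample)

# Made-up data; variance fixed at 1.0 (assumptions for illustration only).
sample = [1.2, 0.7, 2.1, 1.5, 0.9]
sigma2 = 1.0

# Candidate means 0.00, 0.01, ..., 3.00.
candidates = [i / 100 for i in range(301)]

best_by_likelihood = max(candidates, key=lambda m: likelihood(sample, m, sigma2))
best_by_log_likelihood = max(candidates, key=lambda m: log_likelihood(sample, m, sigma2))

print(best_by_likelihood, best_by_log_likelihood)
```

Both searches select the same grid point, because taking the logarithm does not change the ordering of the likelihood values.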

Why the log is taken

One may wonder why the log of the likelihood function is taken. There are several good reasons.

To understand them, suppose that the sample is made up of independent observations (as in the example above).

Then, the logarithm transforms a product of densities into a sum. This is very convenient because:
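
One reason that is easy to check numerically is floating-point underflow: a product of many densities smaller than 1 quickly becomes too small to represent, while the corresponding sum of logs stays within a comfortable range. The sketch below assumes 2000 simulated draws from a standard normal distribution; the sample size and the seed are arbitrary choices made for illustration.

```python
import math
import random

def normal_pdf(x, mu, sigma2):
    """Density of a normal distribution with mean mu and variance sigma2."""
    return math.exp(-(x - mu) ** 2 / (2 * sigma2)) / math.sqrt(2 * math.pi * sigma2)

random.seed(0)  # arbitrary seed, only for reproducibility
draws = [random.gauss(0.0, 1.0) for _ in range(2000)]

# Likelihood as a product of densities: every factor is below 0.4,
# so the product underflows to zero in double precision.
product = 1.0
for x in draws:
    product *= normal_pdf(x, 0.0, 1.0)

# Log-likelihood as a sum of log-densities: a finite negative number.
log_sum = sum(math.log(normal_pdf(x, 0.0, 1.0)) for x in draws)

print(product, log_sum)
```

The product is reported as zero even though the true likelihood is strictly positive, whereas the sum of logs remains usable for comparing parameter values.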

A rigorous definition

We finally give a rigorous definition of log-likelihood.

The following elements are needed to define the log-likelihood function:

Given all these elements, the log-likelihood function is the function $l(\theta ;\xi )$ defined by $l(\theta ;\xi )=\ln L(\theta ;\xi )$.

Negative log-likelihood

You will often hear the term "negative log-likelihood". It is just the log-likelihood function with a minus sign in front of it: $-l(\theta ;\xi )$.

It is frequently used because computer optimization algorithms are often written as minimization algorithms.

As a consequence, the maximization problem $\max_{\theta }\ l(\theta ;\xi )$ is equivalently written in terms of the negative log-likelihood as $\min_{\theta }\ -l(\theta ;\xi )$ before being solved numerically on computers.
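
A minimal sketch of this workflow is given below. It assumes a made-up normal sample with the variance held fixed at 1, so that the negative log-likelihood is a function of the mean alone, and it uses a hand-written golden-section search as a stand-in for a general-purpose minimizer. The function names, the search interval and the tolerance are all hypothetical choices.

```python
import math

def negative_log_likelihood(sample, mu, sigma2):
    """Negative log-likelihood of an IID normal sample."""
    n = len(sample)
    sum_sq = sum((x - mu) ** 2 for x in sample)
    return 0.5 * n * math.log(2 * math.pi * sigma2) + sum_sq / (2 * sigma2)

def golden_section_minimize(f, lo, hi, tol=1e-8):
    """Minimize a unimodal function f on the interval [lo, hi]."""
    inv_phi = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)  # left interior point
        d = a + inv_phi * (b - a)  # right interior point
        if f(c) < f(d):
            b = d  # the minimum lies in [a, d]
        else:
            a = c  # the minimum lies in [c, b]
    return (a + b) / 2

# Made-up data; variance fixed at 1.0 (assumptions for illustration only).
sample = [1.2, 0.7, 2.1, 1.5, 0.9]
sigma2 = 1.0

mu_hat = golden_section_minimize(
    lambda mu: negative_log_likelihood(sample, mu, sigma2), -10.0, 10.0
)
sample_mean = sum(sample) / len(sample)

print(mu_hat, sample_mean)
```

With the variance fixed, the minimizer of the negative log-likelihood with respect to the mean is the sample mean, so the two printed values should agree up to the tolerance of the search.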

More examples

More examples of how to derive log-likelihood functions can be found in the lectures on:

More details

The log-likelihood and its properties are discussed in more detail in the lecture on maximum likelihood estimation.

How to cite

Please cite as:

Taboga, Marco (2021). "Log-likelihood", Lectures on probability theory and mathematical statistics. Kindle Direct Publishing. Online appendix. https://www.statlect.com/glossary/log-likelihood.
