# Information matrix

The information matrix (also called Fisher information matrix) is the matrix of second cross-moments of the score vector. The latter is the vector of first partial derivatives of the log-likelihood function with respect to its parameters.

## Definition

The information matrix is defined as follows.

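Definition Let $\theta$ be a parameter vector characterizing the distribution of a sample $\xi$. Let $L(\theta;\xi)$ be the likelihood function of $\xi$, depending on the parameter $\theta$. Let $l(\theta;\xi)$ be the log-likelihood function:

$$l(\theta;\xi)=\ln L(\theta;\xi)$$

Denote by $\nabla_{\theta}l(\theta;\xi)$ the score vector, that is, the vector of first derivatives of $l(\theta;\xi)$ with respect to the entries of $\theta$. The information matrix $I(\theta)$ is the matrix of second cross-moments of the score, defined by

$$I(\theta)=\operatorname{E}_{\theta}\left[\nabla_{\theta}l(\theta;\xi)\,\nabla_{\theta}l(\theta;\xi)^{\top}\right]$$

where the notation $\operatorname{E}_{\theta}$ indicates that the expected value is taken with respect to the probability distribution associated to the parameter $\theta$.

For example, if the sample $\xi$ has a continuous distribution, then the likelihood function is

$$L(\theta;\xi)=f(\xi;\theta)$$

where $f(\xi;\theta)$ is the probability density function of $\xi$, parametrized by $\theta$, and the information matrix is

$$I(\theta)=\int\nabla_{\theta}\ln f(\xi;\theta)\,\nabla_{\theta}\ln f(\xi;\theta)^{\top}\,f(\xi;\theta)\,d\xi$$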
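The integral form of the definition can be checked numerically in a one-parameter case. The sketch below is only an illustration and is not part of the lecture: it assumes a single observation from an exponential distribution with rate `rate` (a distribution chosen here for convenience), for which the score is `1/rate - x` and the information is known to be `1/rate**2`. The helper names and the midpoint-rule settings are hypothetical choices.

```python
import math

def exp_pdf(x, rate):
    # Density of the exponential distribution (illustrative choice of model)
    return rate * math.exp(-rate * x)

def exp_score(x, rate):
    # Derivative of ln f(x; rate) = ln(rate) - rate * x with respect to rate
    return 1.0 / rate - x

def information_by_quadrature(rate, steps=200_000):
    # Midpoint rule for the integral of score^2 * density over [0, upper];
    # the upper limit is an assumption: the tail beyond it is negligible
    upper = 50.0 / rate
    h = upper / steps
    total = 0.0
    for k in range(steps):
        x = (k + 0.5) * h
        total += exp_score(x, rate) ** 2 * exp_pdf(x, rate)
    return total * h

rate = 2.0
approx = information_by_quadrature(rate)
exact = 1.0 / rate ** 2
print(approx, exact)
```

The two printed values should agree to several decimal places. With a vector parameter, the squared score would be replaced by the outer product of the score with itself.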

## The information matrix is the covariance matrix of the score

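Under mild regularity conditions, the expected value of the score is equal to zero:

$$\operatorname{E}_{\theta}\left[\nabla_{\theta}l(\theta;\xi)\right]=0$$

As a consequence,

$$I(\theta)=\operatorname{E}_{\theta}\left[\nabla_{\theta}l(\theta;\xi)\,\nabla_{\theta}l(\theta;\xi)^{\top}\right]=\operatorname{Var}_{\theta}\left[\nabla_{\theta}l(\theta;\xi)\right]$$

that is, the information matrix is the covariance matrix of the score.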
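Both claims can be verified exactly in a model with finite support. The following sketch assumes a single Bernoulli observation with success probability `p` (an example chosen here, not taken from the lecture), so that expectations reduce to a two-term sum. The function names are hypothetical.

```python
def bernoulli_score(x, p):
    # Derivative of x*ln(p) + (1 - x)*ln(1 - p) with respect to p
    return x / p - (1 - x) / (1 - p)

def expectation(g, p):
    # Exact expectation of g(x) when x is Bernoulli(p): sum over {0, 1}
    return (1 - p) * g(0) + p * g(1)

p = 0.3
mean_score = expectation(lambda x: bernoulli_score(x, p), p)
second_moment = expectation(lambda x: bernoulli_score(x, p) ** 2, p)
variance = second_moment - mean_score ** 2

print(mean_score)               # close to zero, up to rounding
print(second_moment, variance)  # the two should coincide, up to rounding
```

Because the mean of the score vanishes, the second moment and the variance differ only by floating-point rounding.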

## Information equality

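Under mild regularity conditions, it can be proved that

$$I(\theta)=-\operatorname{E}_{\theta}\left[\nabla_{\theta\theta}l(\theta;\xi)\right]$$

where $\nabla_{\theta\theta}l(\theta;\xi)$ is the matrix of second-order cross-partial derivatives (so-called Hessian matrix) of the log-likelihood.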

This equality is called information equality.
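The information equality can be illustrated without deriving any formula by hand, by differentiating the log-likelihood numerically. The sketch below again assumes a single Bernoulli observation with success probability `p` (an illustrative choice, not from the lecture) and uses central finite differences; the step sizes `h` are assumptions that trade truncation error against rounding error.

```python
import math

def loglik(x, p):
    # Log-likelihood of one Bernoulli(p) observation x in {0, 1}
    return x * math.log(p) + (1 - x) * math.log(1 - p)

def first_derivative(x, p, h=1e-5):
    # Central difference approximation of the score
    return (loglik(x, p + h) - loglik(x, p - h)) / (2 * h)

def second_derivative(x, p, h=1e-4):
    # Central difference approximation of the (1 x 1) Hessian
    return (loglik(x, p + h) - 2 * loglik(x, p) + loglik(x, p - h)) / h ** 2

def expectation(g, p):
    # Exact expectation over the support {0, 1}
    return (1 - p) * g(0) + p * g(1)

p = 0.3
squared_score = expectation(lambda x: first_derivative(x, p) ** 2, p)
neg_hessian = -expectation(lambda x: second_derivative(x, p), p)
print(squared_score, neg_hessian)
```

The two printed numbers should agree up to the finite-difference error, and both should be close to `1 / (p * (1 - p))`.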

## Information matrix of the normal distribution

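As an example, consider a sample $\xi=(x_{1},\ldots,x_{n})$ made up of the realizations of $n$ IID normal random variables with parameters $\mu$ and $\sigma^{2}$ (mean and variance).

In this case, the information matrix is

$$I(\mu,\sigma^{2})=\begin{bmatrix}\dfrac{n}{\sigma^{2}} & 0\\ 0 & \dfrac{n}{2\sigma^{4}}\end{bmatrix}$$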

Proof

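The log-likelihood function is

$$l(\mu,\sigma^{2};x_{1},\ldots,x_{n})=-\frac{n}{2}\ln(2\pi)-\frac{n}{2}\ln(\sigma^{2})-\frac{1}{2\sigma^{2}}\sum_{i=1}^{n}(x_{i}-\mu)^{2}$$

as proved in the lecture on maximum likelihood estimation of the parameters of the normal distribution. The score is a vector whose entries are the partial derivatives of the log-likelihood with respect to $\mu$ and $\sigma^{2}$:

$$\frac{\partial l}{\partial\mu}=\frac{1}{\sigma^{2}}\sum_{i=1}^{n}(x_{i}-\mu),\qquad\frac{\partial l}{\partial\sigma^{2}}=-\frac{n}{2\sigma^{2}}+\frac{1}{2\sigma^{4}}\sum_{i=1}^{n}(x_{i}-\mu)^{2}$$

The information matrix is

$$I(\mu,\sigma^{2})=\begin{bmatrix}\operatorname{E}\left[\left(\dfrac{\partial l}{\partial\mu}\right)^{2}\right] & \operatorname{E}\left[\dfrac{\partial l}{\partial\mu}\dfrac{\partial l}{\partial\sigma^{2}}\right]\\ \operatorname{E}\left[\dfrac{\partial l}{\partial\mu}\dfrac{\partial l}{\partial\sigma^{2}}\right] & \operatorname{E}\left[\left(\dfrac{\partial l}{\partial\sigma^{2}}\right)^{2}\right]\end{bmatrix}$$

We have

$$\begin{aligned}\operatorname{E}\left[\left(\frac{\partial l}{\partial\mu}\right)^{2}\right] &=\frac{1}{\sigma^{4}}\sum_{i=1}^{n}\sum_{j=1}^{n}\operatorname{E}\left[(x_{i}-\mu)(x_{j}-\mu)\right]\\ &\overset{(A)}{=}\frac{1}{\sigma^{4}}\sum_{i=1}^{n}\operatorname{E}\left[(x_{i}-\mu)^{2}\right]\\ &\overset{(B)}{=}\frac{1}{\sigma^{4}}\,n\sigma^{2}=\frac{n}{\sigma^{2}}\end{aligned}$$

where: in step $(A)$ we have used the fact that $\operatorname{E}\left[(x_{i}-\mu)(x_{j}-\mu)\right]=0$ for $i\neq j$ because the variables in the sample are independent and have mean equal to $\mu$; in step $(B)$ we have used the fact that $\operatorname{E}\left[(x_{i}-\mu)^{2}\right]=\sigma^{2}$.

Moreover,

$$\begin{aligned}\operatorname{E}\left[\left(\frac{\partial l}{\partial\sigma^{2}}\right)^{2}\right] &=\frac{1}{4\sigma^{8}}\operatorname{E}\left[\left(\sum_{i=1}^{n}\left((x_{i}-\mu)^{2}-\sigma^{2}\right)\right)^{2}\right]\\ &=\frac{1}{4\sigma^{8}}\sum_{i=1}^{n}\sum_{j=1}^{n}\operatorname{E}\left[\left((x_{i}-\mu)^{2}-\sigma^{2}\right)\left((x_{j}-\mu)^{2}-\sigma^{2}\right)\right]\\ &\overset{(A)}{=}\frac{1}{4\sigma^{8}}\sum_{i=1}^{n}\operatorname{E}\left[\left((x_{i}-\mu)^{2}-\sigma^{2}\right)^{2}\right]\\ &=\frac{1}{4\sigma^{8}}\sum_{i=1}^{n}\left(\operatorname{E}\left[(x_{i}-\mu)^{4}\right]-2\sigma^{2}\operatorname{E}\left[(x_{i}-\mu)^{2}\right]+\sigma^{4}\right)\\ &\overset{(B)}{=}\frac{1}{4\sigma^{8}}\sum_{i=1}^{n}\left(3\sigma^{4}-2\sigma^{4}+\sigma^{4}\right)=\frac{n}{2\sigma^{4}}\end{aligned}$$

where: in step $(A)$ we have used the independence of the observations in the sample, which makes the terms with $i\neq j$ vanish because $\operatorname{E}\left[(x_{i}-\mu)^{2}-\sigma^{2}\right]=0$; in step $(B)$ we have used the fact that the fourth central moment of the normal distribution is equal to $3\sigma^{4}$.

Finally,

$$\begin{aligned}\operatorname{E}\left[\frac{\partial l}{\partial\mu}\frac{\partial l}{\partial\sigma^{2}}\right] &=-\frac{n}{2\sigma^{4}}\sum_{i=1}^{n}\operatorname{E}\left[x_{i}-\mu\right]+\frac{1}{2\sigma^{6}}\sum_{i=1}^{n}\sum_{j=1}^{n}\operatorname{E}\left[(x_{i}-\mu)(x_{j}-\mu)^{2}\right]\\ &\overset{(A)}{=}\frac{1}{2\sigma^{6}}\sum_{i=1}^{n}\operatorname{E}\left[(x_{i}-\mu)^{3}\right]\\ &\overset{(B)}{=}0\end{aligned}$$

where: in step $(A)$ we have used the facts that $\operatorname{E}\left[x_{i}-\mu\right]=0$ and that $\operatorname{E}\left[(x_{i}-\mu)(x_{j}-\mu)^{2}\right]=0$ for $i\neq j$ because the variables in the sample are independent; in step $(B)$ we have used the fact that the third central moment of the normal distribution is equal to zero.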
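The result of the proof can be cross-checked by simulation. The sketch below is a rough Monte Carlo check, not part of the lecture: it draws many normal samples, evaluates the score (partial derivatives of the normal log-likelihood with respect to the mean and the variance) at the true parameters, and averages the products of its entries. The sample size, parameter values, number of replications, and seed are arbitrary assumptions, and the function names are hypothetical.

```python
import random

def score(sample, mu, sigma2):
    # Partial derivatives of the normal log-likelihood
    # with respect to the mean and the variance
    n = len(sample)
    dev = [x - mu for x in sample]
    d_mu = sum(dev) / sigma2
    d_sigma2 = -n / (2 * sigma2) + sum(d * d for d in dev) / (2 * sigma2 ** 2)
    return d_mu, d_sigma2

def monte_carlo_information(n, mu, sigma2, reps=100_000, seed=0):
    # Average the outer product of the score over simulated samples
    rng = random.Random(seed)
    sd = sigma2 ** 0.5
    m11 = m12 = m22 = 0.0
    for _ in range(reps):
        sample = [rng.gauss(mu, sd) for _ in range(n)]
        s1, s2 = score(sample, mu, sigma2)
        m11 += s1 * s1
        m12 += s1 * s2
        m22 += s2 * s2
    return [[m11 / reps, m12 / reps], [m12 / reps, m22 / reps]]

n, mu, sigma2 = 5, 1.0, 2.0
estimate = monte_carlo_information(n, mu, sigma2)
# Closed form: n / variance, zero off-diagonal, n / (2 * variance^2)
closed_form = [[n / sigma2, 0.0], [0.0, n / (2 * sigma2 ** 2)]]
print(estimate)
print(closed_form)
```

The simulated matrix should be close to the closed-form one, with discrepancies that shrink as `reps` grows.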

## More details

More details about the Fisher information matrix, including proofs of the information equality and of the fact that the expected value of the score is equal to zero, can be found in the lecture entitled Maximum likelihood.