
# Central Limit Theorem

Let $\{X_n\}$ be a sequence of random variables. Let $\bar{X}_n$ be the sample mean of the first $n$ terms of the sequence:
$$\bar{X}_n = \frac{1}{n} \sum_{i=1}^{n} X_i$$

A Central Limit Theorem (CLT) is a proposition stating a set of conditions that are sufficient to guarantee the convergence of the sample mean to a normal distribution, as the sample size increases.

More precisely, a Central Limit Theorem is a proposition giving a set of conditions that are sufficient to guarantee that:
$$\sqrt{n}\,\frac{\bar{X}_n - \mu}{\sigma} \xrightarrow{d} Z$$
where $Z$ is a standard normal random variable (i.e. a normal random variable with zero mean and unit variance), $\mu$ and $\sigma$ are two constants, and $\xrightarrow{d}$ indicates convergence in distribution.

Why is the ratio $\frac{\bar{X}_n - \mu}{\sigma}$ multiplied by the square root of $n$? If we do not multiply it by $\sqrt{n}$, then the ratio converges to the constant $0$, provided that the conditions of a Law of Large Numbers apply. On the contrary, multiplying it by $\sqrt{n}$, we obtain a sequence that converges to a proper random variable (i.e. a random variable that is not constant). When the conditions of a Central Limit Theorem apply, this variable has a normal distribution.
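The effect of the $\sqrt{n}$ scaling can be illustrated with a small Monte Carlo simulation. The sketch below (not part of the original text) uses Exponential(1) draws, so that $\mu = \sigma = 1$:

```python
import numpy as np

rng = np.random.default_rng(0)
n_reps = 20_000  # number of simulated samples per sample size


def standardized_ratio(n):
    """(sample mean - mu) / sigma for n_reps samples of size n.

    Draws are Exponential(1), so mu = 1 and sigma = 1.
    """
    x = rng.exponential(scale=1.0, size=(n_reps, n))
    return x.mean(axis=1) - 1.0  # (X_bar - mu) / sigma with mu = sigma = 1


for n in (10, 100, 1000):
    r = standardized_ratio(n)
    # Without the sqrt(n) factor, the ratio collapses towards the constant 0 ...
    print(f"n={n:4d}  sd of ratio         = {r.std():.3f}")
    # ... while sqrt(n) times the ratio keeps a standard deviation near 1.
    print(f"n={n:4d}  sd of sqrt(n)*ratio = {np.sqrt(n) * r.std():.3f}")
```

The unscaled ratio degenerates to a constant (its standard deviation shrinks like $1/\sqrt{n}$), while the scaled sequence remains a proper random variable.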

In practice, the CLT is used as follows:

1. we observe a sample consisting of $n$ observations $x_1$, $x_2$, ..., $x_n$;

2. if $n$ is large enough, then a standard normal distribution is a good approximation of the distribution of $\sqrt{n}\,(\bar{X}_n - \mu)/\sigma$;

3. therefore, we pretend that:
$$\sqrt{n}\,\frac{\bar{X}_n - \mu}{\sigma} \sim N(0, 1)$$
where $N(0, 1)$ indicates the normal distribution with mean $0$ and variance $1$;

4. as a consequence, the distribution of the sample mean $\bar{X}_n$ is:
$$\bar{X}_n \sim N\!\left(\mu, \frac{\sigma^2}{n}\right)$$
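These steps can be tried out numerically. The following sketch (a hypothetical setting, assuming Uniform(0, 1) observations, so $\mu = 1/2$ and $\sigma^2 = 1/12$) compares the CLT approximation of $P(\bar{X}_n \le c)$ with a brute-force simulation:

```python
import math

import numpy as np

rng = np.random.default_rng(1)

# Hypothetical setting: each observation is Uniform(0, 1),
# so mu = 1/2 and sigma^2 = 1/12.
mu, var = 0.5, 1.0 / 12.0
n = 50  # sample size


def clt_cdf(c):
    """CLT approximation of P(X_bar <= c), treating X_bar as N(mu, var / n)."""
    z = (c - mu) / math.sqrt(var / n)
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))  # standard normal CDF


# Brute-force check: simulate many samples of size n and compute the
# empirical frequency of the event {X_bar <= c}.
means = rng.uniform(0.0, 1.0, size=(100_000, n)).mean(axis=1)
c = 0.55
print(f"CLT approximation : {clt_cdf(c):.4f}")
print(f"Simulated value   : {np.mean(means <= c):.4f}")
```

With $n = 50$ the two values agree to two or three decimal places, which is the sense in which we "pretend" that the sample mean is normal.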

There are several Central Limit Theorems. We report some examples below.

## Examples

### Lindeberg-Lévy Central Limit Theorem

The best known Central Limit Theorem is probably Lindeberg-Lévy CLT:

Proposition (Lindeberg-Lévy CLT) Let $\{X_n\}$ be an IID sequence of random variables such that:
$$E[X_n] = \mu, \qquad \mathrm{Var}[X_n] = \sigma^2$$
where $0 < \sigma^2 < \infty$. Then, a Central Limit Theorem applies to the sample mean $\bar{X}_n$:
$$\sqrt{n}\,\frac{\bar{X}_n - \mu}{\sigma} \xrightarrow{d} Z$$
where $Z$ is a standard normal random variable and $\xrightarrow{d}$ denotes convergence in distribution.

Proof

We will just sketch a proof. For a detailed and rigorous proof see, for example, Resnick (1999) and Williams (1991). First of all, denote by $\{Z_n\}$ the sequence whose generic term is:
$$Z_n = \sqrt{n}\,\frac{\bar{X}_n - \mu}{\sigma} = \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \frac{X_i - \mu}{\sigma}$$
The characteristic function of $Z_n$ is:
$$\varphi_{Z_n}(t) = E\left[\exp\left(i t Z_n\right)\right] = \prod_{i=1}^{n} E\left[\exp\left(i \frac{t}{\sqrt{n}}\,\frac{X_i - \mu}{\sigma}\right)\right] = \left[\varphi\!\left(\frac{t}{\sqrt{n}}\right)\right]^n$$
where $\varphi$ is the characteristic function of the standardized variable $(X_i - \mu)/\sigma$, which has zero mean and unit variance. Now take a second order Taylor series expansion of $\varphi$ around the point $0$:
$$\varphi\!\left(\frac{t}{\sqrt{n}}\right) = 1 - \frac{t^2}{2n} + o\!\left(\frac{t^2}{n}\right)$$
where $o(t^2/n)$ is an infinitesimal of higher order than $t^2/n$, i.e. a quantity that converges to $0$ faster than $t^2/n$ does. Therefore:
$$\varphi_{Z_n}(t) = \left[1 - \frac{t^2}{2n} + o\!\left(\frac{t^2}{n}\right)\right]^n$$
So, we have that:
$$\lim_{n \to \infty} \varphi_{Z_n}(t) = \exp\left(-\frac{t^2}{2}\right)$$
where $\exp(-t^2/2)$ is the characteristic function of a standard normal random variable (see the lecture entitled Normal distribution). A theorem, called Lévy continuity theorem, which we do not cover in these lectures, states that if a sequence of random variables $\{Z_n\}$ is such that their characteristic functions $\varphi_{Z_n}$ converge to the characteristic function $\varphi_Z$ of a random variable $Z$, then the sequence $\{Z_n\}$ converges in distribution to $Z$. Therefore, in our case the sequence $\{Z_n\}$ converges in distribution to a standard normal distribution.
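The convergence of characteristic functions used in this proof sketch can be checked numerically. The code below (an illustration, not part of the original text) compares the empirical characteristic function of $Z_n$ with $\exp(-t^2/2)$ for Exponential(1) draws, for which $\mu = \sigma = 1$:

```python
import numpy as np

rng = np.random.default_rng(2)


def ecf_gap(t, n, reps=100_000):
    """Distance between the empirical cf of Z_n and exp(-t^2 / 2).

    Z_n = sqrt(n) (X_bar - mu) / sigma for Exponential(1) draws (mu = sigma = 1).
    """
    x = rng.exponential(size=(reps, n))
    z = np.sqrt(n) * (x.mean(axis=1) - 1.0)
    return abs(np.mean(np.exp(1j * t * z)) - np.exp(-t**2 / 2))


for n in (2, 10, 100):
    print(n, ecf_gap(1.5, n))  # the gap shrinks as n grows
```

The gap at a fixed $t$ decreases steadily with $n$, exactly the pointwise convergence that the Lévy continuity theorem turns into convergence in distribution.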

So, roughly speaking, under the stated assumptions, the distribution of the sample mean $\bar{X}_n$ can be approximated by a normal distribution with mean $\mu$ and variance $\sigma^2 / n$ (provided $n$ is large enough).

Also note that the conditions for the validity of Lindeberg-Lévy Central Limit Theorem resemble the conditions for the validity of Kolmogorov's Strong Law of Large Numbers. The only difference is the additional requirement that:
$$\mathrm{Var}[X_n] = \sigma^2 < \infty$$

### The Central Limit Theorem for correlated sequences

In the Lindeberg-Lévy CLT (see above), the sequence is required to be an IID sequence. The assumption of independence can be weakened as follows:

Proposition (CLT for correlated sequences) Let $\{X_n\}$ be a stationary and mixing sequence of random variables satisfying a CLT technical condition (defined in the proof below) and such that:
$$E[X_n] = \mu, \qquad \sigma^2 = \mathrm{Var}[X_n] + 2\sum_{j=1}^{\infty} \mathrm{Cov}[X_n, X_{n-j}]$$
where $0 < \sigma^2 < \infty$. Then, a Central Limit Theorem applies to the sample mean $\bar{X}_n$:
$$\sqrt{n}\,\frac{\bar{X}_n - \mu}{\sigma} \xrightarrow{d} Z$$
where $Z$ is a standard normal random variable and $\xrightarrow{d}$ indicates convergence in distribution.

Proof

Several different technical conditions (beyond those explicitly stated in the above proposition) are imposed in the literature in order to derive Central Limit Theorems for correlated sequences. These conditions are usually very mild and differ from author to author. We do not mention these technical conditions here and just refer to them as CLT technical conditions.

For a proof, see for example Durrett (2010) and White (2001).

So, roughly speaking, under the stated assumptions, the distribution of the sample mean $\bar{X}_n$ can be approximated by a normal distribution with mean $\mu$ and variance $\sigma^2 / n$ (provided $n$ is large enough).

Also note that the conditions for the validity of the Central Limit Theorem for correlated sequences resemble the conditions for the validity of the ergodic theorem. The main differences (beyond some technical conditions that are not explicitly stated in the above proposition) are the additional requirement that:
$$0 < \sigma^2 < \infty$$
and the fact that ergodicity is replaced by the stronger condition of mixing.

Finally, let us mention that the variance $\sigma^2$ in the above proposition, which is defined as:
$$\sigma^2 = \mathrm{Var}[X_n] + 2\sum_{j=1}^{\infty} \mathrm{Cov}[X_n, X_{n-j}]$$
is called the long-run variance of $\bar{X}_n$.
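In applications the long-run variance is rarely known and must be estimated. As an illustration (a sketch, not part of the original text; the choice of estimator is ours), the code below applies one common estimator, the Bartlett-kernel (Newey-West) estimator, to a simulated AR(1) process, which is stationary and mixing and has a known long-run variance:

```python
import numpy as np

rng = np.random.default_rng(3)

# Simulate a stationary AR(1) process x_t = rho * x_{t-1} + e_t
# with unit-variance normal innovations.
rho, T = 0.5, 200_000
e = rng.normal(size=T)
x = np.empty(T)
x[0] = e[0]
for t in range(1, T):
    x[t] = rho * x[t - 1] + e[t]

# Closed-form long-run variance of this AR(1):
# Var[x_t] + 2 * sum_j Cov[x_t, x_{t-j}] = 1 / (1 - rho)^2
lrv_true = 1.0 / (1.0 - rho) ** 2


def newey_west(x, L):
    """Bartlett-kernel (Newey-West) long-run variance estimate, truncated at lag L."""
    x = x - x.mean()
    T = len(x)
    s = x @ x / T                        # lag-0 autocovariance
    for j in range(1, L + 1):
        w = 1.0 - j / (L + 1)            # Bartlett weight
        s += 2.0 * w * (x[j:] @ x[:-j]) / T
    return s


print(lrv_true)            # 4.0
print(newey_west(x, 50))   # close to 4
```

The Bartlett weights guarantee a non-negative estimate; the truncation lag $L$ trades off bias against variance.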

## Multivariate generalizations

The results illustrated above for sequences of random variables extend in a straightforward manner to sequences of random vectors. For example, the multivariate version of the Lindeberg-Lévy CLT is:

Proposition (Multivariate Lindeberg-Lévy CLT) Let $\{X_n\}$ be an IID sequence of random vectors such that:
$$E[X_n] = \mu, \qquad \mathrm{Var}[X_n] = \Sigma$$
where $\Sigma$ is a positive definite matrix. Let $\bar{X}_n$ be the vector of sample means. Then:
$$\sqrt{n}\,\Sigma^{-1/2}\left(\bar{X}_n - \mu\right) \xrightarrow{d} Z$$
where $Z$ is a standard multivariate normal random vector and $\xrightarrow{d}$ denotes convergence in distribution.

Proof

For a proof see, for example, Basu (2004), DasGupta (2008) and McCabe and Tremayne (1993).

In a similar manner, the CLT for correlated sequences generalizes to random vectors (the scalar $\sigma^2$ becomes a matrix, called the long-run covariance matrix).
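The multivariate statement can also be checked by simulation. The sketch below (toy values chosen by us for illustration) builds non-normal IID random vectors with a prescribed covariance matrix and verifies that $\sqrt{n}\,(\bar{X}_n - \mu)$ has covariance close to $\Sigma$, and that $\Sigma^{-1/2}$ standardizes it:

```python
import numpy as np

rng = np.random.default_rng(4)

# Toy covariance matrix and the transformation that induces it.
Sigma = np.array([[2.0, 0.6],
                  [0.6, 1.0]])
A = np.linalg.cholesky(Sigma)        # A @ A.T = Sigma
mu = A @ np.ones(2)                  # mean of the transformed draws

n, reps = 200, 5_000

# Non-normal IID random vectors: transformed Exponential(1) pairs
# (each raw component has mean 1 and variance 1).
e = rng.exponential(size=(reps, n, 2))
x = e @ A.T                          # IID vectors with mean mu, covariance Sigma

z = np.sqrt(n) * (x.mean(axis=1) - mu)   # sqrt(n) (X_bar - mu), one row per sample

# The covariance of z should be close to Sigma ...
print(np.cov(z.T))
# ... and applying Sigma^{-1/2} (via the Cholesky factor) should give
# an approximately standard bivariate normal vector.
w = np.linalg.solve(A, z.T).T
print(np.cov(w.T))                   # close to the 2x2 identity
```

Solving against the Cholesky factor plays the role of $\Sigma^{-1/2}$ here: it is one convenient choice of matrix square root, since $A^{-1} \Sigma A^{-\top} = I$.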

## Solved exercises

Below you can find some exercises with explained solutions:

1. Exercise set 1 (use the Central Limit Theorem to find mean and variance of approximating normal distributions).

## References

Basu, A. K. (2004) Measure theory and probability, PHI Learning PVT.

DasGupta, A. (2008) Asymptotic theory of statistics and probability, Springer.

Durrett, R. (2010) Probability: theory and examples, Cambridge University Press.

McCabe, B. and A. Tremayne (1993) Elements of modern asymptotic theory with statistical applications, Manchester University Press.

Resnick, S. I. (1999) A probability path, Birkhauser.

White, H. (2001) Asymptotic theory for econometricians, Academic Press.

Williams, D. (1991) Probability with martingales, Cambridge University Press.
