Law of Large Numbers

A Law of Large Numbers (LLN) is a proposition that provides a set of sufficient conditions for the convergence of the sample mean to a constant.

Typically, the constant is the expected value of the distribution from which the sample has been drawn.

The sample mean

Let $\{X_n\}$ be a sequence of random variables.

Let $\bar{X}_n$ be the sample mean of the first $n$ terms of the sequence:

$$\bar{X}_n = \frac{1}{n} \sum_{i=1}^{n} X_i$$

A Law of Large Numbers (LLN) states some conditions that are sufficient to guarantee the convergence of $\bar{X}_n$ to a constant as the sample size $n$ increases.

Typically, all the random variables in the sequence $\{X_n\}$ have the same expected value $\mu$. In this case, the constant to which the sample mean converges is $\mu$ (which is called the population mean).
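As a rough numerical illustration of the definition above, the following Python sketch computes the running sample means of a simulated sequence. The Uniform(0, 1) distribution (whose expected value is 0.5), the seed and the helper name `running_sample_means` are assumptions made only for this example.

```python
import random


def running_sample_means(draws):
    """Return the list [Xbar_1, ..., Xbar_n] for the sequence `draws`."""
    means = []
    total = 0.0
    for n, x in enumerate(draws, start=1):
        total += x
        means.append(total / n)  # sample mean of the first n terms
    return means


# Illustrative assumption: IID draws from Uniform(0, 1), so mu = 0.5.
rng = random.Random(12345)
draws = [rng.random() for _ in range(100_000)]
means = running_sample_means(draws)

for n in (10, 1_000, 100_000):
    print(n, means[n - 1])
```

In a typical run, the printed sample means get closer to 0.5 as $n$ grows, although any single simulated path is only suggestive.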

But there are also Laws of Large Numbers in which the terms of the sequence are not required to have the same expected value. In these cases, which are not treated in this lecture, the constant to which the sample mean converges is an average of the expected values of the individual terms of the sequence $\{X_n\}$.

There are dozens of LLNs. We present some important examples below (see the road map in the figure).

Weak Laws

An LLN is called a Weak Law of Large Numbers (WLLN) if the sample mean converges in probability.

The adjective weak is used because convergence in probability is a weaker requirement than almost sure convergence (almost sure convergence implies convergence in probability, but the converse does not hold). It is employed to make a distinction from Strong Laws of Large Numbers, in which the sample mean is required to converge almost surely.

Chebyshev's Weak Law of Large Numbers

One of the best known WLLNs is Chebyshev's.

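Proposition (Chebyshev's WLLN) Let $\{X_n\}$ be an uncorrelated and covariance stationary sequence:

$$\mathrm{E}[X_n] = \mu, \quad \mathrm{Var}[X_n] = \sigma^2 < \infty \quad \text{for all } n, \qquad \mathrm{Cov}[X_n, X_{n+k}] = 0 \quad \text{for all } n \text{ and all } k \neq 0$$

Then, a Weak Law of Large Numbers applies to the sample mean:

$$\operatorname*{plim}_{n \to \infty} \bar{X}_n = \mu$$

where $\operatorname{plim}$ denotes a probability limit.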

Proof

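The expected value of the sample mean is

$$\mathrm{E}[\bar{X}_n] = \mathrm{E}\left[\frac{1}{n} \sum_{i=1}^{n} X_i\right] = \frac{1}{n} \sum_{i=1}^{n} \mathrm{E}[X_i] = \frac{1}{n} n \mu = \mu$$

The variance of the sample mean is

$$\mathrm{Var}[\bar{X}_n] = \frac{1}{n^2} \mathrm{Var}\left[\sum_{i=1}^{n} X_i\right] = \frac{1}{n^2} \sum_{i=1}^{n} \mathrm{Var}[X_i] = \frac{1}{n^2} n \sigma^2 = \frac{\sigma^2}{n}$$

where the second equality holds because the terms of the sequence are uncorrelated. Now we can apply Chebyshev's inequality to the sample mean $\bar{X}_n$:

$$\mathrm{P}\left(\left|\bar{X}_n - \mathrm{E}[\bar{X}_n]\right| \geq k\right) \leq \frac{\mathrm{Var}[\bar{X}_n]}{k^2}$$

for any $k > 0$ (i.e., for any strictly positive real number $k$). Plugging in the values for the expected value and the variance derived above, we obtain

$$\mathrm{P}\left(\left|\bar{X}_n - \mu\right| \geq k\right) \leq \frac{\sigma^2}{n k^2}$$

Since

$$\lim_{n \to \infty} \frac{\sigma^2}{n k^2} = 0$$

and

$$\mathrm{P}\left(\left|\bar{X}_n - \mu\right| \geq k\right) \geq 0$$

then it must be that also

$$\lim_{n \to \infty} \mathrm{P}\left(\left|\bar{X}_n - \mu\right| \geq k\right) = 0$$

Note that this holds for any arbitrarily small $k$. By the very definition of convergence in probability, this means that $\bar{X}_n$ converges in probability to $\mu$ (if you are wondering about strict and weak inequalities here and in the definition of convergence in probability, note that $|\bar{X}_n - \mu| > k$ implies $|\bar{X}_n - \mu| \geq k$ for any strictly positive $k$, so that $\mathrm{P}(|\bar{X}_n - \mu| > k) \leq \mathrm{P}(|\bar{X}_n - \mu| \geq k)$).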
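The inequality used in the proof can be checked exactly in a simple case. The sketch below assumes IID fair coin flips taking the values 0 and 1 (so that the mean is 1/2 and the variance is 1/4) and compares the exact probability that the sample mean deviates from 1/2 by at least $k$ with the bound $\sigma^2 / (n k^2)$. The coin example and the function names are assumptions introduced only for illustration.

```python
from fractions import Fraction
from math import comb


def tail_probability(n, k):
    """Exact P(|Xbar_n - 1/2| >= k) for n IID fair coin flips (values 0 or 1)."""
    mu = Fraction(1, 2)
    total = Fraction(0)
    for s in range(n + 1):  # s = number of ones among the n flips
        if abs(Fraction(s, n) - mu) >= k:
            total += Fraction(comb(n, s), 2 ** n)
    return total


def chebyshev_bound(n, k):
    """Upper bound sigma^2 / (n k^2), with sigma^2 = 1/4 for a fair coin."""
    return Fraction(1, 4) / (n * k ** 2)


k = Fraction(1, 10)
for n in (10, 100, 1000):
    p = tail_probability(n, k)
    b = chebyshev_bound(n, k)
    print(n, float(p), float(b))
```

Both the exact probability and the bound shrink as $n$ increases, and the bound is typically far from tight: the proof only needs it to go to zero.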

Note that it is customary to state Chebyshev's Weak Law of Large Numbers as a result on the convergence in probability of the sample mean:

$$\operatorname*{plim}_{n \to \infty} \bar{X}_n = \mu$$

However, the conditions of the above theorem guarantee the mean square convergence of the sample mean to $\mu$:

$$\bar{X}_n \xrightarrow{\text{m.s.}} \mu$$

Proof

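In the above proof of Chebyshev's WLLN, it is proved that

$$\mathrm{Var}[\bar{X}_n] = \frac{\sigma^2}{n}$$

and that

$$\mathrm{E}[\bar{X}_n] = \mu$$

This implies that

$$\mathrm{E}\left[(\bar{X}_n - \mu)^2\right] = \mathrm{E}\left[(\bar{X}_n - \mathrm{E}[\bar{X}_n])^2\right] = \mathrm{Var}[\bar{X}_n] = \frac{\sigma^2}{n}$$

As a consequence,

$$\lim_{n \to \infty} \mathrm{E}\left[(\bar{X}_n - \mu)^2\right] = \lim_{n \to \infty} \frac{\sigma^2}{n} = 0$$

but this is just the definition of mean square convergence of $\bar{X}_n$ to $\mu$.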

Hence, in Chebyshev's WLLN, convergence in probability is just a consequence of the fact that convergence in mean square implies convergence in probability.
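The identity between the mean squared error of the sample mean and the ratio of the variance to the sample size can be verified exactly for a small discrete example. The sketch below assumes IID rolls of a fair six-sided die (mean 7/2, variance 35/12) and builds the exact distribution of the sum by repeated convolution. The die example and the function names are assumptions made only for illustration.

```python
from fractions import Fraction


def sum_distribution(n, faces=6):
    """Exact distribution of the sum of n IID fair die rolls, as {value: probability}."""
    dist = {0: Fraction(1)}
    for _ in range(n):
        new_dist = {}
        for s, p in dist.items():
            for face in range(1, faces + 1):
                new_dist[s + face] = new_dist.get(s + face, Fraction(0)) + p / faces
        dist = new_dist
    return dist


def mean_square_error(n):
    """Exact E[(Xbar_n - mu)^2] for a fair six-sided die, where mu = 7/2."""
    mu = Fraction(7, 2)
    return sum(p * (Fraction(s, n) - mu) ** 2 for s, p in sum_distribution(n).items())


sigma2 = Fraction(35, 12)  # variance of a single fair die roll
for n in (1, 5, 25):
    print(n, mean_square_error(n), sigma2 / n)
```

Because the arithmetic is done with exact fractions, the two printed columns coincide for every $n$, and both tend to zero as $n$ increases.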

Chebyshev's Weak Law of Large Numbers for correlated sequences

Chebyshev's WLLN sets forth the requirement that the terms of the sequence $\{X_n\}$ have zero covariance with each other. By relaxing this requirement and allowing for some correlation between the terms of the sequence $\{X_n\}$, a more general version of Chebyshev's Weak Law of Large Numbers can be obtained.

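Proposition (Chebyshev's WLLN for correlated sequences) Let $\{X_n\}$ be a covariance stationary sequence of random variables:

$$\mathrm{E}[X_n] = \mu \quad \text{for all } n, \qquad \mathrm{Cov}[X_n, X_{n-j}] = \gamma_j \quad \text{for all } n \text{ and all } j \geq 0$$

If covariances tend to be zero on average, that is, if

$$\lim_{n \to \infty} \frac{1}{n} \sum_{j=0}^{n} \gamma_j = 0$$

then a Weak Law of Large Numbers applies to the sample mean:

$$\operatorname*{plim}_{n \to \infty} \bar{X}_n = \mu$$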

Proof

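For a full proof see, e.g., Karlin and Taylor (1975). We give here a proof based on the assumption that covariances are absolutely summable:

$$\sum_{j=0}^{\infty} |\gamma_j| < \infty$$

where $\gamma_j$ denotes the covariance between two terms of the sequence that are $j$ positions apart. This is a stronger assumption than the one made in the proposition, that covariances tend to be zero on average. The expected value of the sample mean is

$$\mathrm{E}[\bar{X}_n] = \frac{1}{n} \sum_{i=1}^{n} \mathrm{E}[X_i] = \mu$$

The variance of the sample mean is

$$\mathrm{Var}[\bar{X}_n] = \frac{1}{n^2} \sum_{i=1}^{n} \sum_{l=1}^{n} \mathrm{Cov}[X_i, X_l] = \frac{1}{n^2} \left( n \gamma_0 + 2 \sum_{j=1}^{n-1} (n - j) \gamma_j \right) = \frac{1}{n} \left( \gamma_0 + 2 \sum_{j=1}^{n-1} \left( 1 - \frac{j}{n} \right) \gamma_j \right)$$

Note that

$$\gamma_0 + 2 \sum_{j=1}^{n-1} \left( 1 - \frac{j}{n} \right) \gamma_j \leq |\gamma_0| + 2 \sum_{j=1}^{n-1} |\gamma_j| \leq 2 \sum_{j=0}^{\infty} |\gamma_j|$$

But the covariances are absolutely summable, so that

$$2 \sum_{j=0}^{\infty} |\gamma_j| = c$$

where $c$ is a finite constant. Therefore,

$$\mathrm{Var}[\bar{X}_n] \leq \frac{c}{n}$$

Now we can apply Chebyshev's inequality to the sample mean $\bar{X}_n$:

$$\mathrm{P}\left(\left|\bar{X}_n - \mathrm{E}[\bar{X}_n]\right| \geq k\right) \leq \frac{\mathrm{Var}[\bar{X}_n]}{k^2}$$

for any $k > 0$ (i.e., for any strictly positive real number $k$). Plugging in the values for the expected value and the variance derived above, we obtain

$$\mathrm{P}\left(\left|\bar{X}_n - \mu\right| \geq k\right) \leq \frac{c}{n k^2}$$

Since

$$\lim_{n \to \infty} \frac{c}{n k^2} = 0$$

and

$$\mathrm{P}\left(\left|\bar{X}_n - \mu\right| \geq k\right) \geq 0$$

then it must be that also

$$\lim_{n \to \infty} \mathrm{P}\left(\left|\bar{X}_n - \mu\right| \geq k\right) = 0$$

Note that this holds for any arbitrarily small $k$. By the definition of convergence in probability, this means that $\bar{X}_n$ converges in probability to $\mu$ (if you are wondering about strict and weak inequalities here and in the definition of convergence in probability, note that $|\bar{X}_n - \mu| > k$ implies $|\bar{X}_n - \mu| \geq k$ for any strictly positive $k$, so that $\mathrm{P}(|\bar{X}_n - \mu| > k) \leq \mathrm{P}(|\bar{X}_n - \mu| \geq k)$).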
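The variance formula for correlated sequences and the resulting bound can be checked numerically. The sketch below assumes, purely for illustration, autocovariances of the form $\gamma_j = \rho^j$ with $\rho = 0.8$, which are absolutely summable. It compares the double sum over all pairs of terms with the weighted single sum, and then with the bound given by twice the sum of the absolute autocovariances divided by $n$. The function names are hypothetical.

```python
import math


def variance_double_sum(gamma, n):
    """Var[Xbar_n] as (1/n^2) * sum over i, l of gamma(|i - l|)."""
    return sum(gamma(abs(i - l)) for i in range(n) for l in range(n)) / n ** 2


def variance_weighted_sum(gamma, n):
    """Var[Xbar_n] as (1/n) * (gamma_0 + 2 * sum_{j=1}^{n-1} (1 - j/n) * gamma_j)."""
    tail = sum((1 - j / n) * gamma(j) for j in range(1, n))
    return (gamma(0) + 2 * tail) / n


# Illustrative assumption: gamma_j = rho^j with rho = 0.8 (absolutely summable).
rho = 0.8


def gamma(j):
    return rho ** j


c = 2 / (1 - rho)  # c = 2 * sum_{j>=0} |gamma_j| for this choice of gamma

for n in (5, 50, 500):
    v = variance_weighted_sum(gamma, n)
    assert math.isclose(v, variance_double_sum(gamma, n))
    print(n, v, c / n)
```

The two expressions for the variance agree up to floating-point error, and the variance stays below the bound, which itself vanishes as $n$ grows.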

Chebyshev's Weak Law of Large Numbers for correlated sequences has been stated as a result on the convergence in probability of the sample mean:

$$\operatorname*{plim}_{n \to \infty} \bar{X}_n = \mu$$

However, the conditions of the above theorem also guarantee the mean square convergence of the sample mean to $\mu$:

$$\bar{X}_n \xrightarrow{\text{m.s.}} \mu$$

Proof

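In the above proof of Chebyshev's Weak Law of Large Numbers for correlated sequences, we proved that

$$\mathrm{Var}[\bar{X}_n] \leq \frac{c}{n}$$

where $c$ is a finite constant, and that

$$\mathrm{E}[\bar{X}_n] = \mu$$

This implies

$$\mathrm{E}\left[(\bar{X}_n - \mu)^2\right] = \mathrm{E}\left[(\bar{X}_n - \mathrm{E}[\bar{X}_n])^2\right] = \mathrm{Var}[\bar{X}_n] \leq \frac{c}{n}$$

Thus, taking limits on both sides, we obtain

$$\lim_{n \to \infty} \mathrm{E}\left[(\bar{X}_n - \mu)^2\right] \leq \lim_{n \to \infty} \frac{c}{n} = 0$$

But

$$\mathrm{E}\left[(\bar{X}_n - \mu)^2\right] \geq 0$$

so it must be that

$$\lim_{n \to \infty} \mathrm{E}\left[(\bar{X}_n - \mu)^2\right] = 0$$

This is just the definition of mean square convergence of $\bar{X}_n$ to $\mu$.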

Hence, also in Chebyshev's Weak Law of Large Numbers for correlated sequences, convergence in probability follows from the fact that convergence in mean square implies convergence in probability.

Strong Laws

An LLN is called a Strong Law of Large Numbers (SLLN) if the sample mean converges almost surely.

The adjective strong is used to make a distinction from Weak Laws of Large Numbers, where the sample mean is required to converge in probability.

Kolmogorov's Strong Law of Large Numbers

Among SLLNs, Kolmogorov's is probably the best known.

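Proposition (Kolmogorov's SLLN) Let $\{X_n\}$ be an iid sequence of random variables having finite mean:

$$\mathrm{E}[X_n] = \mu < \infty \quad \text{for all } n$$

Then, a Strong Law of Large Numbers applies to the sample mean:

$$\bar{X}_n \xrightarrow{\text{a.s.}} \mu$$

where $\xrightarrow{\text{a.s.}}$ denotes almost sure convergence.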

Proof

See, for example, Resnick (1999) and Williams (1991).
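Kolmogorov's SLLN requires only a finite mean, not a finite variance. The following sketch traces the sample mean along one simulated path of IID draws from a Pareto distribution with shape 1.5 and minimum value 1, whose mean is 3 but whose variance is infinite. The choice of distribution, the seed and the function name are assumptions made for illustration, and a single finite path can only hint at almost sure convergence.

```python
import random


def sample_mean_path(draws):
    """Running sample means Xbar_1, ..., Xbar_n along a single realized path."""
    path, total = [], 0.0
    for n, x in enumerate(draws, start=1):
        total += x
        path.append(total / n)
    return path


# Illustrative assumption: IID Pareto draws with shape alpha = 1.5 and minimum 1.
# The mean alpha / (alpha - 1) = 3 is finite, but the variance is infinite,
# so Chebyshev's WLLN does not apply, while Kolmogorov's SLLN does.
alpha = 1.5
mu = alpha / (alpha - 1)

rng = random.Random(2024)
path = sample_mean_path(rng.paretovariate(alpha) for _ in range(200_000))

for n in (100, 10_000, 200_000):
    print(n, path[n - 1], "target:", mu)
```

With such heavy tails the approach to the mean tends to be slow and occasionally interrupted by jumps caused by very large draws, which is consistent with the theorem but makes the convergence harder to see than in the finite-variance case.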

Ergodic theorem

In Kolmogorov's SLLN, the sequence $\{X_n\}$ is required to be an iid sequence. This requirement can be weakened by requiring $\{X_n\}$ to be stationary and ergodic.

Proposition (Ergodic Theorem) Let $\{X_n\}$ be a stationary and ergodic sequence of random variables having finite mean:

$$\mathrm{E}[X_n] = \mu < \infty \quad \text{for all } n$$

Then, a Strong Law of Large Numbers applies to the sample mean:

$$\bar{X}_n \xrightarrow{\text{a.s.}} \mu$$

Proof

See, for example, Karlin and Taylor (1975) and White (2001).
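A standard example of a dependent sequence that is stationary and ergodic is an irreducible, aperiodic two-state Markov chain started from its stationary distribution. The sketch below simulates such a chain and compares its time average with the stationary mean. The transition probabilities, the seed and the function name are assumptions chosen only for this illustration.

```python
import random


def simulate_chain(n, p01, p10, rng):
    """Simulate n steps of a two-state (0/1) Markov chain started from its stationary law."""
    pi1 = p01 / (p01 + p10)  # stationary probability of state 1
    state = 1 if rng.random() < pi1 else 0
    path = []
    for _ in range(n):
        path.append(state)
        if state == 0:
            state = 1 if rng.random() < p01 else 0
        else:
            state = 0 if rng.random() < p10 else 1
    return path


# Illustrative assumption: transition probabilities P(0 -> 1) = 0.3, P(1 -> 0) = 0.1.
p01, p10 = 0.3, 0.1
mu = p01 / (p01 + p10)  # E[X_n] under the stationary law (0.75 here)

rng = random.Random(7)
path = simulate_chain(200_000, p01, p10, rng)
time_average = sum(path) / len(path)
print(time_average, "target:", mu)
```

The terms of this sequence are neither independent nor uncorrelated, yet the time average should settle near the stationary mean, as the Ergodic Theorem suggests.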

Laws of Large Numbers for random vectors

The LLNs we have just presented concern sequences of random variables. However, they can be extended in a straightforward manner to sequences of random vectors.

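Proposition Let $\{X_n\}$ be a sequence of $K \times 1$ random vectors, let $\mathrm{E}[X_n] = \mu$ be their common expected value and

$$\bar{X}_n = \frac{1}{n} \sum_{i=1}^{n} X_i$$

their sample mean. Denote the $j$-th component of $\mu$ by $\mu_j$ and the $j$-th component of $\bar{X}_n$ by $\bar{X}_{n,j}$. Then:

• a Weak Law of Large Numbers applies to the sample mean $\bar{X}_n$ if and only if a Weak Law of Large Numbers applies to each of the components of the vector $\bar{X}_n$, that is, if and only if

$$\operatorname*{plim}_{n \to \infty} \bar{X}_{n,j} = \mu_j, \quad j = 1, \ldots, K$$

• a Strong Law of Large Numbers applies to the sample mean $\bar{X}_n$ if and only if a Strong Law of Large Numbers applies to each of the components of the vector $\bar{X}_n$, that is, if and only if

$$\bar{X}_{n,j} \xrightarrow{\text{a.s.}} \mu_j, \quad j = 1, \ldots, K$$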

Proof

This is a consequence of the fact that a vector converges in probability (almost surely) if and only if all of its components converge in probability (almost surely). See the lectures entitled Convergence in probability and Almost sure convergence.
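The componentwise nature of the result reflects the fact that the sample mean of a sequence of vectors is simply the vector of the sample means of the components. A minimal sketch, with made-up data and hypothetical function names:

```python
def vector_sample_mean(vectors):
    """Sample mean of a list of equal-length vectors, returned as a list."""
    n = len(vectors)
    dim = len(vectors[0])
    return [sum(v[j] for v in vectors) / n for j in range(dim)]


def component_sample_mean(vectors, j):
    """Sample mean of the scalar sequence formed by the j-th components."""
    return sum(v[j] for v in vectors) / len(vectors)


# Illustrative data: four observations of a 2-dimensional vector.
data = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [6.0, 40.0]]

xbar = vector_sample_mean(data)
print(xbar)  # [3.0, 25.0]
assert all(xbar[j] == component_sample_mean(data, j) for j in range(2))
```

Any scalar LLN can therefore be applied to each component separately in order to obtain an LLN for the whole vector.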

Solved exercises

Below you can find some exercises with explained solutions.

Exercise 1

Let $\{\varepsilon_n\}$ be an IID sequence.

A generic term of the sequence has mean $\mu$ and variance $\sigma^2$.

Let $\{X_n\}$ be a covariance stationary sequence such that a generic term of the sequence satisfies

$$X_n = \rho X_{n-1} + \varepsilon_n$$

where $|\rho| < 1$.

Denote by $\bar{X}_n$ the sample mean of the sequence $\{X_n\}$.

Verify whether the sequence $\{X_n\}$ satisfies the conditions that are required by Chebyshev's Weak Law of Large Numbers. In the affirmative case, find its probability limit.

Solution

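By assumption the sequence $\{X_n\}$ is covariance stationary. So all the terms of the sequence have the same expected value. Taking the expected value of both sides of the equation

$$X_n = \rho X_{n-1} + \varepsilon_n$$

we obtain

$$\mathrm{E}[X_n] = \rho \mathrm{E}[X_{n-1}] + \mathrm{E}[\varepsilon_n] = \rho \mathrm{E}[X_n] + \mu$$

Solving for $\mathrm{E}[X_n]$, we obtain

$$\mathrm{E}[X_n] = \frac{\mu}{1 - \rho}$$

By the same token, the variance can be derived from

$$\mathrm{Var}[X_n] = \rho^2 \mathrm{Var}[X_{n-1}] + \mathrm{Var}[\varepsilon_n] = \rho^2 \mathrm{Var}[X_n] + \sigma^2$$

(where we assume, as is standard, that $\varepsilon_n$ is independent of $X_{n-1}$), which, solving for $\mathrm{Var}[X_n]$, yields

$$\mathrm{Var}[X_n] = \frac{\sigma^2}{1 - \rho^2}$$

Now, we need to derive $\mathrm{Cov}[X_{n+j}, X_n]$. Note that

$$X_{n+j} = \rho^j X_n + \sum_{i=0}^{j-1} \rho^i \varepsilon_{n+j-i}$$

The covariance between two terms of the sequence is

$$\mathrm{Cov}[X_{n+j}, X_n] = \rho^j \mathrm{Var}[X_n] + \sum_{i=0}^{j-1} \rho^i \mathrm{Cov}[\varepsilon_{n+j-i}, X_n] = \rho^j \frac{\sigma^2}{1 - \rho^2}$$

because, under the same assumption, $X_n$ is independent of the later terms $\varepsilon_{n+1}, \ldots, \varepsilon_{n+j}$. The sum of the covariances is

$$\sum_{j=0}^{n} \mathrm{Cov}[X_{i+j}, X_i] = \frac{\sigma^2}{1 - \rho^2} \sum_{j=0}^{n} \rho^j = \frac{\sigma^2}{1 - \rho^2} \cdot \frac{1 - \rho^{n+1}}{1 - \rho}$$

Thus, covariances tend to be zero on average:

$$\lim_{n \to \infty} \frac{1}{n} \sum_{j=0}^{n} \mathrm{Cov}[X_{i+j}, X_i] = \lim_{n \to \infty} \frac{1}{n} \cdot \frac{\sigma^2}{1 - \rho^2} \cdot \frac{1 - \rho^{n+1}}{1 - \rho} = 0$$

and the conditions of Chebyshev's Weak Law of Large Numbers for correlated sequences are satisfied. Therefore, the sample mean converges in probability to the population mean:

$$\operatorname*{plim}_{n \to \infty} \bar{X}_n = \frac{\mu}{1 - \rho}$$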
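The solution can be checked numerically. The sketch below evaluates the average of the autocovariances for increasing $n$ and then simulates the sequence to compare the sample mean with the candidate probability limit. The parameter values, the Gaussian distribution of the shocks, the seed and the function names are all assumptions made for illustration; the exercise itself does not specify them.

```python
import random


def ar1_autocovariance(j, rho, sigma2):
    """Cov[X_{n+j}, X_n] = rho^j * sigma^2 / (1 - rho^2) for the sequence of the exercise."""
    return rho ** j * sigma2 / (1 - rho ** 2)


def average_autocovariance(n, rho, sigma2):
    """(1/n) * sum_{j=0}^{n} Cov[X_{i+j}, X_i], the quantity that must tend to zero."""
    return sum(ar1_autocovariance(j, rho, sigma2) for j in range(n + 1)) / n


def simulate_ar1_mean(n, rho, mu, sigma, rng):
    """Sample mean of n terms of X_n = rho * X_{n-1} + eps_n with Gaussian eps_n."""
    # Start from the stationary mean and variance (Gaussian shocks are an assumption).
    x = rng.gauss(mu / (1 - rho), sigma / (1 - rho ** 2) ** 0.5)
    total = 0.0
    for _ in range(n):
        x = rho * x + rng.gauss(mu, sigma)
        total += x
    return total / n


rho, mu, sigma = 0.5, 1.0, 1.0  # illustrative parameter values
for n in (10, 100, 1000):
    print(n, average_autocovariance(n, rho, sigma ** 2))

rng = random.Random(99)
print(simulate_ar1_mean(100_000, rho, mu, sigma, rng), "target:", mu / (1 - rho))
```

With these values the average autocovariance shrinks roughly in proportion to $1/n$, and the simulated sample mean should land close to $\mu / (1 - \rho) = 2$.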

References

Karlin, S. and H. M. Taylor (1975) A first course in stochastic processes, Academic Press.

Resnick, S. I. (1999) A probability path, Birkhäuser.

White, H. (2001) Asymptotic theory for econometricians, Academic Press.

Williams, D. (1991) Probability with martingales, Cambridge University Press.