
The plug-in principle

by Marco Taboga, PhD

The plug-in principle is a technique used in probability theory and statistics to approximately compute or to estimate a feature of a probability distribution (e.g., the expected value, the variance, a quantile) that cannot be computed exactly. It is widely used in the theories of Monte Carlo simulation and bootstrapping.

Roughly speaking, the plug-in principle says that a feature of a given distribution can be approximated by the same feature of the empirical distribution of a sample of observations drawn from the given distribution. The feature of the empirical distribution is called a plug-in estimate of the feature of the given distribution. For example, a quantile of a given distribution can be approximated by the analogous quantile of the empirical distribution of a sample of draws from the given distribution.
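To make this concrete, here is a minimal Python sketch (the use of NumPy, the exponential distribution, and the sample size are our own illustrative choices, not part of the theory) that approximates the variance of a distribution by the variance of the empirical distribution of a sample of draws:

import numpy as np

rng = np.random.default_rng(0)

# Draws from the distribution of interest (an exponential with scale 1,
# chosen purely for illustration; its true variance equals 1).
x = rng.exponential(scale=1.0, size=100_000)

# Plug-in estimate of the variance: the variance of the empirical
# distribution, i.e. the average squared deviation from the sample mean.
plug_in_variance = np.mean((x - x.mean()) ** 2)

print(plug_in_variance)  # close to the true variance 1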


Definition

The following is a formal definition of plug-in estimate.

Definition Let $\Phi$ be a set of distribution functions. Let $T$ be a mapping $T:\Phi \rightarrow \mathbb{R}$. Let $F \in \Phi$. Let $x_1, \ldots, x_n$ be a sample of realizations of $n$ random variables $X_1, \ldots, X_n$, all having distribution function $F$. Let $F_n$ be the empirical distribution function of the sample. If $F_n \in \Phi$, then the quantity $$T(F_n)$$ is called a plug-in estimate of $T(F)$.

The next section will provide an informal discussion of the conditions under which $T(F_n)$ converges to $T(F)$ as the sample size $n$ increases. Before doing that, we provide some examples to clarify the meaning of the mapping $T$. The next example shows how the plug-in principle can be used to approximate expected values.

Example Suppose that we need to compute the expected value $\mathrm{E}\left[g(X)\right]$, where $X$ is a random variable and $g:\mathbb{R}\rightarrow\mathbb{R}$ is a function. If $F(x)$ is the distribution function of $X$, then the expected value can be written as a Riemann-Stieltjes integral (see the lecture entitled Expected value) as follows: $$\mathrm{E}\left[g(X)\right]=\int g(x)\,dF(x).$$ We can define a mapping $T$ such that, for any distribution function $\varphi$, we have $$T(\varphi)=\int g(x)\,d\varphi(x).$$ Thus, the expected value we need to compute is $$\mathrm{E}\left[g(X)\right]=T(F).$$ Now, if we have a sample of $n$ draws $x_1, \ldots, x_n$ from the distribution $F$, their empirical distribution function $F_n$ is the distribution function of a discrete random variable that can take any of the values $x_1, \ldots, x_n$ with probability $1/n$. As a consequence, the plug-in estimate of $\mathrm{E}\left[g(X)\right]$ is $$T(F_n)=\frac{1}{n}\sum_{i=1}^{n}g(x_i).$$
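The following Python sketch illustrates this plug-in estimate; the particular choices (NumPy, $X$ standard normal, $g(x)=x^2$, so that the true expected value is 1) are arbitrary and made only for illustration:

import numpy as np

rng = np.random.default_rng(0)

# Illustrative choices: X is standard normal and g(x) = x**2,
# so the true value of E[g(X)] is 1.
g = lambda x: x ** 2
x = rng.standard_normal(100_000)   # n draws from F

# Plug-in estimate: the same expectation computed under the empirical
# distribution, i.e. (1/n) * sum of g(x_i).
plug_in = np.mean(g(x))

print(plug_in)  # approximately 1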

The next example shows how the plug-in principle can be used to approximate quantiles.

Example Suppose that we need to compute the $p$-quantile $Q_X(p)$ of a random variable $X$ having distribution function $F(x)$, and suppose that we are not able to compute it by using the definition of $p$-quantile: $$Q_X(p)=\inf\left\{x\in\mathbb{R}:F(x)\geq p\right\}.$$ We can define a mapping $T$ such that, for any distribution function $\varphi$, we have $$T(\varphi)=\inf\left\{x\in\mathbb{R}:\varphi(x)\geq p\right\}.$$ Thus, the quantile we need to compute is $$Q_X(p)=T(F).$$ If we have a sample of $n$ draws $x_1, \ldots, x_n$ from the distribution $F$, and we denote by $F_n$ their empirical distribution function, then the plug-in estimate of $Q_X(p)$ is $$T(F_n)=x_{(\lceil np\rceil)},$$ where $\lceil np\rceil$ is the ceiling of $np$, that is, the smallest integer not less than $np$, and $x_{(i)}$ is the $i$-th order statistic of the sample, that is, the $i$-th smallest observation in the sample.
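As a sketch of this computation in Python (the standard normal distribution, the level $p=0.9$, and the sample size are our own illustrative assumptions), the plug-in quantile is simply the appropriate order statistic of the sorted sample:

import math
import numpy as np

rng = np.random.default_rng(0)

p = 0.9                               # quantile level (arbitrary choice)
x = rng.standard_normal(10_000)       # n draws from F (standard normal here)
n = x.size

# Plug-in estimate of the p-quantile: the ceil(n*p)-th order statistic.
x_sorted = np.sort(x)
plug_in_quantile = x_sorted[math.ceil(n * p) - 1]   # -1 for zero-based indexing

print(plug_in_quantile)  # close to the true 0.9-quantile of N(0,1), about 1.2816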

Asymptotic properties

We will not go into the details of the asymptotic properties of plug-in estimators because this would require a level of mathematical sophistication far beyond the level required on average in these lecture notes. However, we will discuss the main issues related to their convergence and provide some intuition.

First of all, one may wonder whether the plug-in estimate $$T(F_n)$$ converges in some sense to $$T(F)$$ as the sample size $n$ increases. As we have seen in the lecture entitled Empirical distribution, the Glivenko-Cantelli theorem and its generalizations provide sets of conditions under which the empirical distribution $F_n$ converges to $F$. If $F_n$ and $F$ were finite-dimensional vectors, then one could apply the Continuous Mapping theorem and say that if $T$ is continuous, then $T(F_n)$ converges to $T(F)$. Unfortunately, $F_n$ and $F$ are not finite-dimensional, because they are functions defined on $\mathbb{R}$ (which is uncountable), and, as a consequence, it is not possible to apply the Continuous Mapping theorem. However, there are several theorems, analogous to the Continuous Mapping theorem, that can be applied in the case of plug-in estimators: if the mapping $T$ is continuous in some sense (or differentiable), then $T(F_n)$ converges to $T(F)$. The continuity conditions required in these theorems are often complicated and difficult to check in practical cases. We refer the interested reader to van der Vaart (2000) for more details. Rest assured, however, that the most commonly used mappings $T$ (e.g., mean, variance, moments and cross-moments, quantiles) satisfy the required continuity conditions.
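The convergence can be checked numerically in simple cases. The following Python sketch (the exponential distribution with scale 2, whose mean is 2, and the sample sizes are illustrative assumptions) shows the plug-in estimate of the mean approaching the true value as $n$ grows:

import numpy as np

rng = np.random.default_rng(0)

# Feature of interest: the mean of an exponential distribution with
# scale 2 (true value 2). These choices are made only for illustration.
true_value = 2.0

for n in (100, 10_000, 1_000_000):
    x = rng.exponential(scale=2.0, size=n)
    plug_in = x.mean()                    # T(F_n) when T is the mean
    print(n, abs(plug_in - true_value))   # the error shrinks as n grows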

Furthermore, it is also possible to prove that, under certain conditions, a version of the Central Limit Theorem applies to the plug-in estimate $T(F_n)$, that is, the quantity $$\sqrt{n}\left(T(F_n)-T(F)\right)$$ converges in distribution to a normal random variable. If $F_n$ and $F$ were finite-dimensional vectors, then one could require that $T$ be differentiable and apply the Delta Method to prove the asymptotic normality of the above quantity. But since $F_n$ and $F$ are infinite-dimensional, a more general technique, called the Functional Delta Method, needs to be employed (which utilizes a notion of differentiability for $T$ called Hadamard differentiability). Again, we refer the interested reader to van der Vaart (2000) for more details.
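The asymptotic normality can also be explored by simulation. In the following Python sketch (the uniform distribution on $(0,1)$, whose mean is $1/2$ and variance is $1/12$, the sample size, and the number of replications are illustrative assumptions), the quantity $\sqrt{n}\left(T(F_n)-T(F)\right)$ is computed for many independent samples when $T$ is the mean:

import numpy as np

rng = np.random.default_rng(0)

# Monte Carlo check of asymptotic normality for the plug-in estimate of
# the mean of a uniform(0, 1) distribution (true mean 0.5, variance 1/12).
n, replications = 1_000, 20_000
t_f = 0.5

samples = rng.uniform(size=(replications, n))
z = np.sqrt(n) * (samples.mean(axis=1) - t_f)   # sqrt(n) * (T(F_n) - T(F))

# The mean of z should be near 0 and its variance near 1/12;
# a histogram of z would look approximately normal.
print(z.mean(), z.var(), 1 / 12)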

References

van der Vaart, A. W. (2000) Asymptotic statistics, Cambridge University Press.

How to cite

Please cite as:

Taboga, Marco (2021). "The plug-in principle", Lectures on probability theory and mathematical statistics. Kindle Direct Publishing. Online appendix. https://www.statlect.com/asymptotic-theory/plug-in-principle.
