The plug-in principle

The plug-in principle is a technique used in probability theory and statistics to approximately compute or to estimate a feature of a probability distribution (e.g., the expected value, the variance, a quantile) that cannot be computed exactly. It is widely used in the theories of Monte Carlo simulation and bootstrapping.

Roughly speaking, the plug-in principle says that a feature of a given distribution can be approximated by the same feature of the empirical distribution of a sample of observations drawn from the given distribution. The feature of the empirical distribution is called a plug-in estimate of the feature of the given distribution. For example, a quantile of a given distribution can be approximated by the analogous quantile of the empirical distribution of a sample of draws from the given distribution.

Table of contents

Definition
Asymptotic properties
References

Definition

The following is a formal definition of a plug-in estimate.

Definition Let $\Phi$ be a set of distribution functions. Let $T$ be a mapping $T:\Phi\rightarrow\mathbb{R}$. Let $F\in\Phi$. Let $x_{1}$, ..., $x_{n}$ be a sample of realizations of $n$ random variables $X_{1}$, ..., $X_{n}$ all having distribution function $F$. Let $F_{n}$ be the empirical distribution function of the sample. If $F_{n}\in\Phi$, then the quantity $T(F_{n})$ is called a plug-in estimate of $T(F)$.
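The empirical distribution function that appears in the definition assigns to each point $x$ the fraction of observations that are less than or equal to $x$. The following is a minimal Python sketch of that object; the function name `empirical_cdf` and the small sample are illustrative assumptions, not part of the lecture.

```python
from bisect import bisect_right


def empirical_cdf(sample):
    """Return the empirical distribution function F_n of a sample.

    Hypothetical helper: F_n(x) is the fraction of observations <= x.
    """
    ordered = sorted(sample)
    n = len(ordered)
    if n == 0:
        raise ValueError("sample must contain at least one observation")

    def F_n(x):
        # bisect_right counts how many sorted observations are <= x.
        return bisect_right(ordered, x) / n

    return F_n


# Illustrative sample with a repeated value.
F_n = empirical_cdf([2.0, 0.5, 3.0, 0.5])
values = [F_n(0.0), F_n(0.5), F_n(2.5), F_n(3.0)]
```

Any plug-in estimate is obtained by applying to this step function the same mapping that would be applied to the unknown distribution function.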

The next section will provide an informal discussion of the conditions under which $T(F_{n})$ converges to $T(F)$ as the sample size $n$ increases. Before doing that, we will provide some examples to clarify the meaning of the mapping $T$. The next example shows how the plug-in principle can be used to approximate expected values.

Example Suppose we need to compute the expected value $\mathrm{E}[g(X)]$, where $X$ is a random variable and $g$ is a function. If $F$ is the distribution function of $X$, then the expected value can be written as a Riemann-Stieltjes integral (see the lecture entitled Expected value) as follows:
$$\mathrm{E}[g(X)]=\int_{-\infty}^{\infty}g(x)\,dF(x)$$
We can define a mapping $T$ such that, for any distribution function $G\in\Phi$, we have
$$T(G)=\int_{-\infty}^{\infty}g(x)\,dG(x)$$
Thus, the expected value we need to compute is
$$\mathrm{E}[g(X)]=T(F)$$
Now, if we have a sample of $n$ draws $x_{1}$, ..., $x_{n}$ from the distribution $F$, their empirical distribution function $F_{n}$ is the distribution function of a discrete random variable that can take any of the values $x_{1}$, ..., $x_{n}$, each with probability $1/n$. As a consequence, the plug-in estimate of $T(F)$ is
$$T(F_{n})=\int_{-\infty}^{\infty}g(x)\,dF_{n}(x)=\frac{1}{n}\sum_{i=1}^{n}g(x_{i})$$
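In computational terms, the plug-in estimate of an expected value is simply the average of the function evaluated at the sample points. The sketch below illustrates this under assumed choices that are not in the lecture: the helper name `plug_in_expectation`, a standard normal distribution, the function $g(x)=x^{2}$ (whose true expected value is 1), and a fixed seed.

```python
import math
import random


def plug_in_expectation(sample, g):
    """Plug-in estimate of E[g(X)]: the average of g over the sample."""
    if len(sample) == 0:
        raise ValueError("sample must contain at least one observation")
    # Integrating g against the empirical distribution puts weight 1/n
    # on each observation, which gives the sample average of g(x_i).
    return math.fsum(g(x) for x in sample) / len(sample)


# Assumed setup: 10,000 draws from a standard normal distribution.
rng = random.Random(0)
draws = [rng.gauss(0.0, 1.0) for _ in range(10_000)]

# Estimate E[X^2], which equals 1 for a standard normal variable.
estimate = plug_in_expectation(draws, lambda x: x * x)
```

With a sample of this size, `estimate` should typically be close to 1, although the exact value depends on the draws.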

The next example shows how the plug-in principle can be used to approximate quantiles.

Example Suppose that we need to compute the $p$-quantile (with $0<p<1$) of a random variable $X$ having distribution function $F$, and suppose that we are not able to compute it by using the definition of $p$-quantile:
$$Q(p)=\inf\{x\in\mathbb{R}:F(x)\geq p\}$$
We can define a mapping $T$ such that, for any distribution function $G\in\Phi$, we have
$$T(G)=\inf\{x\in\mathbb{R}:G(x)\geq p\}$$
Thus, the quantile we need to compute is
$$Q(p)=T(F)$$
If we have a sample of $n$ draws $x_{1}$, ..., $x_{n}$ from the distribution $F$, and we denote by $F_{n}$ their empirical distribution function, then the plug-in estimate of $T(F)$ is
$$T(F_{n})=\inf\{x\in\mathbb{R}:F_{n}(x)\geq p\}=x_{(\lceil np\rceil)}$$
where $\lceil np\rceil$ is the ceiling of $np$, that is, the smallest integer not less than $np$, and $x_{(k)}$ is the $k$-th order statistic of the sample, that is, the $k$-th smallest observation in the sample.
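The order-statistic formula for the plug-in quantile can be sketched in a few lines of Python. The helper name `plug_in_quantile` and the five-point sample are illustrative assumptions; the code takes the quantile to be the smallest value at which the empirical distribution function reaches the chosen probability, which is the observation in position ceiling of (sample size times probability) after sorting.

```python
import math


def plug_in_quantile(sample, p):
    """Plug-in estimate of the p-quantile, for 0 < p < 1.

    Returns the k-th smallest observation with k = ceil(n * p).
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    ordered = sorted(sample)
    n = len(ordered)
    if n == 0:
        raise ValueError("sample must contain at least one observation")
    k = math.ceil(n * p)   # 1-based index of the order statistic
    return ordered[k - 1]  # shift to Python's 0-based indexing


# Illustrative sample; sorted it reads 1, 2, 3, 4, 5.
sample = [5.0, 1.0, 4.0, 2.0, 3.0]
median = plug_in_quantile(sample, 0.5)
lower_quartile = plug_in_quantile(sample, 0.25)
```

Note that the product of the sample size and the probability is computed in floating point, so when it is mathematically an integer, rounding could in principle shift the selected index by one; a production implementation would need to guard against this.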

Asymptotic properties

We will not go into the details of the asymptotic properties of plug-in estimators because this would require a level of mathematical sophistication far beyond the level required on average in these lecture notes. However, we will discuss the main issues related to their convergence and provide some intuition.

First of all, one may wonder whether the plug-in estimate $T(F_{n})$ converges in some sense to $T(F)$ as the sample size $n$ increases. As we have seen in the lecture entitled Empirical distribution, the Glivenko-Cantelli theorem and its generalizations provide sets of conditions under which the empirical distribution $F_{n}$ converges to $F$. If $F_{n}$ and $F$ were finite-dimensional vectors, then one could apply the Continuous Mapping theorem and say that if $T$ is continuous, then $T(F_{n})$ converges to $T(F)$. Unfortunately, $F_{n}$ and $F$ are not finite-dimensional because they are functions defined on $\mathbb{R}$ (which is uncountable), and, as a consequence, it is not possible to apply the Continuous Mapping theorem. However, there are several theorems, analogous to the Continuous Mapping theorem, that can be applied in the case of plug-in estimators: if the mapping $T$ is continuous in some sense (or differentiable), then $T(F_{n})$ converges to $T(F)$. The continuity conditions required in these theorems are often complicated and difficult to check in practical cases. We refer the interested reader to van der Vaart (2000) for more details. Rest assured, however, that the most commonly used mappings (e.g., mean, variance, moments and cross-moments, quantiles) satisfy the required continuity conditions.
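The convergence discussed above can be explored numerically. The following sketch, which is an illustration and not a proof, computes the plug-in estimate of the median for samples of increasing size and records its distance from the true value. The exponential distribution with rate 1 (true median $\ln 2$), the sample sizes, the seed, and the helper name `plug_in_median` are all assumptions made for this example.

```python
import math
import random


def plug_in_median(sample):
    """Plug-in estimate of the median: the ceil(n / 2)-th smallest value."""
    ordered = sorted(sample)
    k = math.ceil(len(ordered) / 2)
    return ordered[k - 1]


# Assumed setup: exponential distribution with rate 1, median ln(2).
rng = random.Random(42)
true_median = math.log(2.0)

# Absolute estimation error for each (assumed) sample size.
errors = {}
for n in (100, 1_000, 10_000, 100_000):
    draws = [rng.expovariate(1.0) for _ in range(n)]
    errors[n] = abs(plug_in_median(draws) - true_median)

for n, err in errors.items():
    print(f"n = {n:>7}: absolute error = {err:.5f}")
```

In a single run the errors need not decrease monotonically, but they would typically become small for the largest sample sizes.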

Furthermore, it is also possible to prove that, under certain conditions, a version of the Central Limit Theorem applies to the plug-in estimate $T(F_{n})$, that is, the quantity $\sqrt{n}\,(T(F_{n})-T(F))$ converges in distribution to a normal random variable. If $F_{n}$ and $F$ were finite-dimensional vectors, then one could require that $T$ be differentiable and apply the Delta Method to prove the asymptotic normality of the above quantity. But since $F_{n}$ and $F$ are infinite-dimensional, a more general technique, called the Functional Delta Method, needs to be employed (which utilizes a notion of differentiability for $T$ that is called Hadamard differentiability). Again, we refer the interested reader to van der Vaart (2000) for more details.
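For the expected-value mapping, asymptotic normality can be checked directly, without the Functional Delta Method. The worked case below is a sketch under two assumptions that the text above does not state: the draws are independent, and $g(X)$ has finite variance $\sigma^{2}$. Here $T$ denotes the mapping that sends a distribution function $G$ to $\int g\,dG$, $F$ is the distribution function of $X$, and $F_{n}$ is the empirical distribution function of the sample.

```latex
% Plug-in estimate of E[g(X)] and its centering
T(F_n) = \frac{1}{n}\sum_{i=1}^{n} g(x_i), \qquad T(F) = \mathrm{E}[g(X)]

% Classical (Lindeberg-Levy) Central Limit Theorem applied to g(X_1), ..., g(X_n)
\sqrt{n}\,\bigl(T(F_n) - T(F)\bigr)
  = \sqrt{n}\,\Bigl(\frac{1}{n}\sum_{i=1}^{n} g(X_i) - \mathrm{E}[g(X)]\Bigr)
  \;\xrightarrow{d}\; N(0, \sigma^{2}),
\qquad \sigma^{2} = \mathrm{Var}[g(X)]
```

In this special case the plug-in estimate is a sample average, so the ordinary Central Limit Theorem suffices; mappings that are not averages, such as quantiles, are the ones that call for the more general machinery.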

References

van der Vaart, A. W. (2000) Asymptotic statistics, Cambridge University Press.

How to cite

Please cite as:

Taboga, Marco (2021). "The plug-in principle", Lectures on probability theory and mathematical statistics. Kindle Direct Publishing. Online appendix. https://www.statlect.com/asymptotic-theory/plug-in-principle.