Point estimation

Point estimation is a type of statistical inference which consists in producing a guess or approximation of an unknown parameter.

In this lecture we introduce the theoretical framework that underlies all point estimation problems.

At the end of the lecture, we provide links to detailed examples of point estimation, in which we show how to apply the theory.

Table of contents

Sample and data-generating distribution
Parametric model
Estimate and estimator
Estimation error
Loss
Risk
Estimates of risk
Risk minimization
Common risk measures
Other criteria to evaluate estimators
1. Unbiasedness
2. Consistency
Examples
How to find a point estimator
Point vs interval estimation

Sample and data-generating distribution

The main elements of a point estimation problem are those found in any statistical inference problem:

we have a sample $\xi$ that has been drawn from a probability distribution whose characteristics are at least partly unknown;
the sample $\xi$ is regarded as the realization of a random vector $\Xi$;
the joint distribution function of $\Xi$, denoted by $F_{\Xi}(\xi)$, is assumed to belong to a set of distribution functions $\Phi$, called statistical model.

Parametric model

When the model $\Phi$ is put into correspondence with a set $\Theta \subseteq \mathbb{R}^{p}$ of real vectors, then we have a parametric model.

The set $\Theta$ is called the parameter space and its elements are called parameters.

Denote by $\theta _{0}$ the parameter that is associated with the data-generating distribution and assume that $\theta _{0}$ is unique. The vector $\theta _{0}$ is called the true parameter.

Estimate and estimator

Point estimation is the act of choosing a vector $\widehat{\theta} \in \Theta$ that approximates $\theta _{0}$. The approximation $\widehat{\theta}$ is called an estimate (or point estimate) of $\theta _{0}$.

When the estimate $\widehat{\theta}$ is produced using a predefined rule (a function) that associates a parameter estimate to each sample $\xi$ in the support of the random vector $\Xi$, we can write

$$\widehat{\theta} = \widehat{\theta}(\xi).$$

The function $\widehat{\theta}(\xi)$ is called an estimator.

Often, the symbol $\widehat{\theta}$ is used to denote both the estimate and the estimator. The meaning is usually clear from the context.
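The distinction between the rule and its value can be sketched in code. The snippet below is only an illustration: the sample mean is chosen as an example estimator, and the function name and the numbers in the sample are hypothetical, not taken from the lecture.

```python
import numpy as np

def sample_mean_estimator(xi):
    """Estimator: a rule that maps any sample xi to a parameter estimate."""
    return float(np.mean(xi))

# Hypothetical observed sample (illustrative numbers only).
xi = np.array([2.1, 1.9, 2.4, 2.0, 1.6])

# Estimate: the value taken by the estimator at the observed sample.
theta_hat = sample_mean_estimator(xi)
print(theta_hat)
```

The function is the estimator; the number it returns for the observed sample is the estimate.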

Estimation error

According to the decision-theoretic terminology introduced previously, making an estimate is an act, which produces consequences.

Among these consequences, the most relevant one is the estimation error

$$e = \widehat{\theta} - \theta _{0}.$$

The statistician's goal is to commit the smallest possible estimation error.

Loss

The preference for small errors can be formalized with a loss function $L(\widehat{\theta}, \theta _{0})$ that quantifies the loss incurred by estimating $\theta _{0}$ with $\widehat{\theta}$.

Examples of loss functions are:

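the absolute error: $L(\widehat{\theta}, \theta _{0}) = \lVert \widehat{\theta} - \theta _{0} \rVert$, where $\lVert \cdot \rVert$ is the Euclidean norm (it coincides with the absolute value when $\Theta \subseteq \mathbb{R}$);
the squared error: $L(\widehat{\theta}, \theta _{0}) = \lVert \widehat{\theta} - \theta _{0} \rVert ^{2}$.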
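As a minimal sketch, the two loss functions can be written as follows. The function names and the numerical values are illustrative assumptions; the only ingredient taken from the text is the Euclidean norm of the estimation error.

```python
import numpy as np

def absolute_error(theta_hat, theta_0):
    """Euclidean norm of the estimation error (absolute value for scalars)."""
    diff = np.atleast_1d(theta_hat) - np.atleast_1d(theta_0)
    return float(np.linalg.norm(diff))

def squared_error(theta_hat, theta_0):
    """Squared Euclidean norm of the estimation error."""
    return absolute_error(theta_hat, theta_0) ** 2

# Hypothetical two-dimensional parameter and estimate.
theta_0 = np.array([1.0, 2.0])
theta_hat = np.array([4.0, 6.0])

abs_loss = absolute_error(theta_hat, theta_0)  # error is (3, 4)
sq_loss = squared_error(theta_hat, theta_0)

# Scalar case: the norm reduces to the absolute value.
scalar_loss = absolute_error(1.5, 2.0)
```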

Risk

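When the estimate $\widehat{\theta}$ is obtained from an estimator, it is a function of the random vector $\Xi$ and the loss $L(\widehat{\theta}(\Xi), \theta _{0})$ is a random variable.

The expected value of the loss

$$R(\widehat{\theta}) = \mathrm{E}\left[ L(\widehat{\theta}(\Xi), \theta _{0}) \right]$$

is called the statistical risk (or, simply, the risk) of the estimator $\widehat{\theta}$.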
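When the true distribution is known, the risk can be approximated by simulation: draw many samples, apply the estimator to each, and average the losses. The sketch below assumes, purely for illustration, a normal data-generating distribution with known mean and standard deviation, the sample mean as estimator, and the squared error as loss. In this special case the exact risk is the variance of the sample mean, `sigma**2 / n`, which serves as a check.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed setting (not from the lecture): normal data, known true parameter.
theta_0, sigma, n, n_rep = 2.0, 1.0, 10, 100_000

# Each row is one simulated sample of size n.
samples = rng.normal(loc=theta_0, scale=sigma, size=(n_rep, n))

# Apply the estimator (sample mean) to every simulated sample.
estimates = samples.mean(axis=1)

# Squared-error loss for each replication; its average approximates the risk.
losses = (estimates - theta_0) ** 2
risk_mc = losses.mean()

# Exact risk of the sample mean under squared error in this setting.
risk_exact = sigma**2 / n
print(risk_mc, risk_exact)
```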

Estimates of risk

The expected value in the definition of risk is computed with respect to the true distribution function $F_{\Xi}(\xi)$.

Therefore, we can compute the risk only if we know the true parameter $\theta _{0}$ and $F_{\Xi}(\xi)$.

When $\theta _{0}$ and $F_{\Xi}(\xi)$ are unknown, the risk needs to be estimated.

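For example, we can approximate the risk with the quantity

$$\widehat{R} = \widehat{\mathrm{E}}\left[ L(\widehat{\widehat{\theta}}, \widehat{\theta}) \right]$$

where:

we pretend that the estimate $\widehat{\theta}$ is the true parameter;
we denote the estimator of $\widehat{\theta}$ by $\widehat{\widehat{\theta}}$;
we compute the expected value $\widehat{\mathrm{E}}$ with respect to the estimated distribution function $\widehat{F}_{\Xi}(\xi)$.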
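One way to carry out these steps in practice is by simulation from the estimated distribution. The sketch below is an assumption-laden illustration, not the lecture's own procedure: it takes a normal model whose standard deviation is assumed known, uses the sample mean as estimator and the squared error as loss, and the observed sample is made up.

```python
import numpy as np

rng = np.random.default_rng(4)

# Assumptions for illustration: normal model, sigma known, hypothetical data.
sigma, n_rep = 1.0, 50_000
xi = np.array([2.3, 1.1, 2.8, 1.7, 2.6, 0.9, 2.2, 1.8, 3.1, 1.5])

# Step 1: compute the estimate and pretend it is the true parameter.
theta_hat = xi.mean()

# Step 2: simulate new samples from the estimated distribution and
# re-apply the estimator to each of them.
sim = rng.normal(loc=theta_hat, scale=sigma, size=(n_rep, xi.size))
theta_hat_hat = sim.mean(axis=1)

# Step 3: average the loss, measured relative to the estimate.
risk_hat = np.mean((theta_hat_hat - theta_hat) ** 2)
print(risk_hat)
```

In this setting the result should be close to `sigma**2 / xi.size`, because the risk of the sample mean under squared error does not depend on the location parameter.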

Even if the risk is unknown, the notion of risk is often used to derive theoretical properties of estimators.

Risk minimization

Point estimation is always guided, at least ideally, by the principle of risk minimization, that is, by the search for estimators that minimize the risk.
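As an illustration of the principle, the sketch below compares two candidate estimators of the mean of a normal distribution, the sample mean and the sample median, and keeps the one with the smaller simulated risk under squared error. The setting (standard normal data, sample size 11) is an assumption chosen only for the example.

```python
import numpy as np

rng = np.random.default_rng(1)

# Assumed setting: normal data with known true mean, used only to simulate.
theta_0, n, n_rep = 0.0, 11, 100_000
samples = rng.normal(loc=theta_0, scale=1.0, size=(n_rep, n))

# Simulated risk (mean squared error) of each candidate estimator.
risk_mean = np.mean((samples.mean(axis=1) - theta_0) ** 2)
risk_median = np.mean((np.median(samples, axis=1) - theta_0) ** 2)

# Risk minimization: prefer the estimator with the smaller risk.
best = "mean" if risk_mean < risk_median else "median"
print(risk_mean, risk_median, best)
```

The ranking depends on the assumed distribution and loss; with heavier-tailed data the comparison could go the other way.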

Common risk measures

Depending on the specific loss function we use, the statistical risk of an estimator can take different names:

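when the absolute error is used as a loss function, then the risk $R(\widehat{\theta}) = \mathrm{E}\left[ \lVert \widehat{\theta} - \theta _{0} \rVert \right]$ is called the Mean Absolute Error (MAE) of the estimator;
when the squared error is used as a loss function, then the risk $R(\widehat{\theta}) = \mathrm{E}\left[ \lVert \widehat{\theta} - \theta _{0} \rVert ^{2} \right]$ is called the Mean Squared Error (MSE). The square root of the mean squared error is called the Root Mean Squared Error (RMSE).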
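Given a collection of estimation errors, for instance from repeated simulations, the three measures can be approximated by replacing expected values with averages. The helper below is a hypothetical sketch, and the error values are made up for illustration.

```python
import numpy as np

def risk_measures(errors):
    """Approximate MAE, MSE and RMSE from an array of estimation errors.

    errors has one row per replication; a 1-D array is treated as
    errors of a scalar parameter.
    """
    errors = np.asarray(errors, dtype=float)
    if errors.ndim == 1:
        errors = errors[:, None]
    norms = np.linalg.norm(errors, axis=1)  # Euclidean norm of each error
    mae = norms.mean()
    mse = (norms ** 2).mean()
    return {"MAE": mae, "MSE": mse, "RMSE": np.sqrt(mse)}

# Hypothetical estimation errors from four replications.
errors = np.array([0.5, -1.0, 0.0, 1.5])
measures = risk_measures(errors)
print(measures)
```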

Other criteria to evaluate estimators

In this section we discuss other criteria that are commonly used to evaluate estimators.

Unbiasedness

If an estimator produces parameter estimates that are on average correct, then it is said to be unbiased.

The following is a formal definition.

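Definition Let $\theta _{0}$ be the true parameter. An estimator $\widehat{\theta}$ is an unbiased estimator of $\theta _{0}$ if and only if

$$\mathrm{E}\left[ \widehat{\theta} \right] = \theta _{0}.$$

If an estimator is not unbiased, then it is called a biased estimator.

If an estimator is unbiased, then the estimation error is on average zero:

$$\mathrm{E}\left[ \widehat{\theta} - \theta _{0} \right] = \mathrm{E}\left[ \widehat{\theta} \right] - \theta _{0} = 0.$$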
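A simulation can make the definition concrete. The sketch below uses a standard textbook example that is not discussed in this lecture: two estimators of the variance of a normal distribution, one dividing by n (biased) and one dividing by n - 1 (unbiased). All numerical settings are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)

# Assumed setting: normal data with true variance 4 and small samples.
sigma2_0, n, n_rep = 4.0, 5, 200_000
samples = rng.normal(loc=0.0, scale=np.sqrt(sigma2_0), size=(n_rep, n))

# Two estimators of the variance, applied to every simulated sample.
var_unadjusted = samples.var(axis=1, ddof=0)  # divides by n
var_adjusted = samples.var(axis=1, ddof=1)    # divides by n - 1

# Averages over replications approximate the expected values.
mean_unadjusted = var_unadjusted.mean()  # expected value is (n - 1) / n * sigma2_0
mean_adjusted = var_adjusted.mean()      # expected value is sigma2_0
print(mean_unadjusted, mean_adjusted)
```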

Consistency

If an estimator produces parameter estimates that converge to the true value when the sample size increases, then it is said to be consistent.

The following is a formal definition.

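Definition Let $\{\xi _{n}\}$ be a sequence of samples such that all the distribution functions $F_{\Xi _{n}}(\xi _{n})$ are put into correspondence with the same parameter $\theta _{0}$. A sequence of estimators $\{\widehat{\theta}_{n}(\xi _{n})\}$ is said to be consistent (or weakly consistent) if and only if

$$\operatorname*{plim}_{n \rightarrow \infty} \widehat{\theta}_{n} = \theta _{0},$$

where $\operatorname{plim}$ indicates convergence in probability. The sequence of estimators is said to be strongly consistent if and only if

$$\widehat{\theta}_{n} \overset{a.s.}{\longrightarrow} \theta _{0},$$

where $\overset{a.s.}{\longrightarrow}$ indicates almost sure convergence. A sequence of estimators which is not consistent is called inconsistent.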

When the sequence of estimators is obtained using the same predefined rule for every sample $\xi _{n}$, we often say, with a slight abuse of language, "consistent estimator" instead of saying "consistent sequence of estimators". In such cases, what we mean is that the predefined rule produces a consistent sequence of estimators.
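Weak consistency says that the probability of an estimation error larger than any fixed tolerance goes to zero as the sample size grows. The sketch below approximates that probability by simulation for the sample mean of normal data; the distribution, the tolerance `eps`, and the sample sizes are assumptions made only for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)

# Assumed setting: normal data with true mean 1 and unit variance.
theta_0, eps, n_rep = 1.0, 0.1, 2_000

exceed_prob = {}
for n in (10, 100, 1000):
    # Same rule (sample mean) applied to samples of increasing size.
    samples = rng.normal(loc=theta_0, scale=1.0, size=(n_rep, n))
    estimates = samples.mean(axis=1)
    # Fraction of replications whose error exceeds the tolerance.
    exceed_prob[n] = float(np.mean(np.abs(estimates - theta_0) > eps))

print(exceed_prob)
```

A simulation of this kind only suggests convergence in probability for the chosen setting; it does not prove consistency.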

Examples

You can find detailed examples of point estimation in the lectures on:

How to find a point estimator

The methods to find point estimators are called estimation methods.

You can read about these methods here:

Point vs interval estimation

There is another kind of estimation, called set estimation or interval estimation.

While in point estimation we produce a single estimate meant to approximate the true parameter, in set estimation we produce a whole set of estimates meant to include the true parameter with high probability.

How to cite

Please cite as:

Taboga, Marco (2021). "Point estimation", Lectures on probability theory and mathematical statistics. Kindle Direct Publishing. Online appendix. https://www.statlect.com/fundamentals-of-statistics/point-estimation.